[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2] x86/mm: Suppresses vm_events caused by page-walks



On 02/24/2018 12:31 AM, Tamas K Lengyel wrote:
> On Fri, Feb 23, 2018 at 3:25 PM, Razvan Cojocaru
> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>> On 02/24/2018 12:06 AM, Tamas K Lengyel wrote:
>>> On Mon, Jan 8, 2018 at 5:49 AM, Alexandru Isaila
>>> <aisaila@xxxxxxxxxxxxxxx> wrote:
>>>> This patch is adding a way to enable/disable nested pagefault
>>>> events. It introduces the xc_monitor_nested_pagefault function
>>>> and adds the nested_pagefault_disabled in the monitor structure.
>>>> This is needed by the introspection so it will only get gla
>>>> faults and not get spammed with other faults.
>>>> In p2m_set_ad_bits the v->arch.sse_pg_dirty.eip and
>>>> v->arch.sse_pg_dirty.gla are used to mark that this is the
>>>> second time a fault occurs and the dirty bit is set.
>>>>
>>>> Signed-off-by: Alexandru Isaila <aisaila@xxxxxxxxxxxxxxx>
>>>>
>>>> ---
>>>> Changes since V1:
>>>>         - Rb V1
>>>>         - Add comment in domctl.h
>>>> ---
>>>>  tools/libxc/include/xenctrl.h |  2 ++
>>>>  tools/libxc/xc_monitor.c      | 14 ++++++++++++++
>>>>  xen/arch/x86/mm/mem_access.c  | 27 +++++++++++++++++++++++++++
>>>>  xen/arch/x86/monitor.c        | 13 +++++++++++++
>>>>  xen/include/asm-x86/domain.h  |  6 ++++++
>>>>  xen/include/asm-x86/monitor.h |  3 ++-
>>>>  xen/include/public/domctl.h   |  2 ++
>>>>  7 files changed, 66 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
>>>> index 09e1363..112c974 100644
>>>> --- a/tools/libxc/include/xenctrl.h
>>>> +++ b/tools/libxc/include/xenctrl.h
>>>> @@ -2056,6 +2056,8 @@ int xc_monitor_descriptor_access(xc_interface *xch, 
>>>> uint32_t domain_id,
>>>>                                   bool enable);
>>>>  int xc_monitor_guest_request(xc_interface *xch, uint32_t domain_id,
>>>>                               bool enable, bool sync, bool 
>>>> allow_userspace);
>>>> +int xc_monitor_nested_pagefault(xc_interface *xch, uint32_t domain_id,
>>>> +                                bool disable);
>>>>  int xc_monitor_debug_exceptions(xc_interface *xch, uint32_t domain_id,
>>>>                                  bool enable, bool sync);
>>>>  int xc_monitor_cpuid(xc_interface *xch, uint32_t domain_id, bool enable);
>>>> diff --git a/tools/libxc/xc_monitor.c b/tools/libxc/xc_monitor.c
>>>> index 0233b87..e96c56d 100644
>>>> --- a/tools/libxc/xc_monitor.c
>>>> +++ b/tools/libxc/xc_monitor.c
>>>> @@ -163,6 +163,20 @@ int xc_monitor_guest_request(xc_interface *xch, 
>>>> uint32_t domain_id, bool enable,
>>>>      return do_domctl(xch, &domctl);
>>>>  }
>>>>
>>>> +int xc_monitor_nested_pagefault(xc_interface *xch, uint32_t domain_id,
>>>> +                                bool disable)
>>>> +{
>>>> +    DECLARE_DOMCTL;
>>>> +
>>>> +    domctl.cmd = XEN_DOMCTL_monitor_op;
>>>> +    domctl.domain = domain_id;
>>>> +    domctl.u.monitor_op.op = disable ? XEN_DOMCTL_MONITOR_OP_ENABLE
>>>> +                                    : XEN_DOMCTL_MONITOR_OP_DISABLE;
>>>> +    domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_NESTED_PAGEFAULT;
>>>> +
>>>> +    return do_domctl(xch, &domctl);
>>>> +}
>>>> +
>>>>  int xc_monitor_emulate_each_rep(xc_interface *xch, uint32_t domain_id,
>>>>                                  bool enable)
>>>>  {
>>>> diff --git a/xen/arch/x86/mm/mem_access.c b/xen/arch/x86/mm/mem_access.c
>>>> index c0cd017..07a334b 100644
>>>> --- a/xen/arch/x86/mm/mem_access.c
>>>> +++ b/xen/arch/x86/mm/mem_access.c
>>>> @@ -137,6 +137,23 @@ bool p2m_mem_access_emulate_check(struct vcpu *v,
>>>>      return violation;
>>>>  }
>>>>
>>>> +static void p2m_set_ad_bits(struct vcpu *v, paddr_t ga)
>>>> +{
>>>> +    struct hvm_hw_cpu ctxt;
>>>> +    uint32_t pfec = 0;
>>>> +
>>>> +    hvm_funcs.save_cpu_ctxt(v, &ctxt);
>>>> +
>>>> +    if ( guest_cpu_user_regs()->eip == v->arch.pg_dirty.eip
>>>> +         && ga == v->arch.pg_dirty.gla )
>>>> +        pfec = PFEC_write_access;
>>>> +
>>>> +    paging_ga_to_gfn_cr3(v, ctxt.cr3, ga, &pfec, NULL);
>>>> +
>>>> +    v->arch.pg_dirty.eip = guest_cpu_user_regs()->eip;
>>>> +    v->arch.pg_dirty.gla = ga;
>>>> +}
>>>> +
>>>>  bool p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>>>>                            struct npfec npfec,
>>>>                            vm_event_request_t **req_ptr)
>>>> @@ -208,6 +225,16 @@ bool p2m_mem_access_check(paddr_t gpa, unsigned long 
>>>> gla,
>>>>          }
>>>>      }
>>>>
>>>> +    if ( vm_event_check_ring(d->vm_event_monitor) &&
>>>> +         d->arch.monitor.nested_pagefault_disabled &&
>>>> +         npfec.kind != npfec_kind_with_gla ) /* don't send a mem_event */
>>>> +    {
>>>> +        v->arch.vm_event->emulate_flags = 0;
>>>> +        p2m_set_ad_bits(v, gla);
>>>> +
>>>> +        return true;
>>>> +    }
>>>> +
>>>>      *req_ptr = NULL;
>>>>      req = xzalloc(vm_event_request_t);
>>>>      if ( req )
>>>> diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c
>>>> index f229e69..e35b619 100644
>>>> --- a/xen/arch/x86/monitor.c
>>>> +++ b/xen/arch/x86/monitor.c
>>>> @@ -241,6 +241,19 @@ int arch_monitor_domctl_event(struct domain *d,
>>>>          break;
>>>>      }
>>>>
>>>> +    case XEN_DOMCTL_MONITOR_EVENT_NESTED_PAGEFAULT:
>>>> +    {
>>>> +        bool old_status = ad->monitor.nested_pagefault_disabled;
>>>> +
>>>> +        if ( unlikely(old_status == requested_status) )
>>>> +            return -EEXIST;
>>>> +
>>>> +        domain_pause(d);
>>>> +        ad->monitor.nested_pagefault_disabled = requested_status;
>>>> +        domain_unpause(d);
>>>> +        break;
>>>> +    }
>>>> +
>>>>      case XEN_DOMCTL_MONITOR_EVENT_DESC_ACCESS:
>>>>      {
>>>>          bool old_status = ad->monitor.descriptor_access_enabled;
>>>> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
>>>> index 4679d54..099af7c 100644
>>>> --- a/xen/include/asm-x86/domain.h
>>>> +++ b/xen/include/asm-x86/domain.h
>>>> @@ -412,6 +412,7 @@ struct arch_domain
>>>>          unsigned int descriptor_access_enabled                            
>>>>  : 1;
>>>>          unsigned int guest_request_userspace_enabled                      
>>>>  : 1;
>>>>          unsigned int emul_unimplemented_enabled                           
>>>>  : 1;
>>>> +        unsigned int nested_pagefault_disabled                            
>>>>  : 1;
>>>
>>> All other options are "_enabled" here, so adding one that's flipped
>>> just looks out of place. Any objections to making this match the rest?
>>> Also, naming it "nested" just makes me think this is somehow would be
>>> related to nested virtualization, but that's not the case. These would
>>> be just regular pagefaults in the guest, so naming the monitor option
>>> simply "pagefault" would look better to me in general.
>> Hello Tamas,
>>
>> Here's the thinking behind preferring "disabled" to "enabled": we want
>> to keep the default behaviour as it is currently, and the current
>> behaviour is to send out _all_ EPT fault vm_events (caused by page walks
>> or not).
>>
>> Now, struct arch_domain is being zeroed out on init, so if we name this
>> "enabled", then that's the behaviour we're starting out with. We have no
>> problem with that, but it changes the current default behaviour.
> 
> We can keep the "disabled" naming but then please add a comment to the field
> saying that by default all events are sent, this is used to filter
> pagefaults out.

Will do, no problem.

>> So either we name this new field "disabled", or we rename it to
>> "enabled" (if we rename it, we either need to set it as a special case
>> on init, or modify the default behaviour to be _not_ sending out
>> page-walk-caused EPT events).
>>
>> If you feel strongly about options 2.A or 2.C we don't have a problem
>> changing the code.
>>
>> About "pagefault", it reads more confusing to me, since all EPT-related
>> vm_events are basically page faults. But maybe that's just me.
> 
> True. It's just confusing with "nested" also having multiple meanings.
> Perhaps "inguest_pagefaults"?

Sure, if nobody else objects, we'll change "nested" to "inguest".


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.