Re: [Xen-devel] [PATCH v13] x86/emulate: Send vm_event from emulate
> -----Original Message-----
> From: Alexandru Stefan ISAILA <aisaila@xxxxxxxxxxxxxxx>
> Sent: 23 September 2019 13:06
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Paul Durrant <Paul.Durrant@xxxxxxxxxx>; jbeulich@xxxxxxxx; Andrew Cooper
> <Andrew.Cooper3@xxxxxxxxxx>; wl@xxxxxxx; Roger Pau Monne
> <roger.pau@xxxxxxxxxx>; Razvan COJOCARU
> <rcojocaru@xxxxxxxxxxxxxxx>; tamas@xxxxxxxxxxxxx; Alexandru Stefan ISAILA
> <aisaila@xxxxxxxxxxxxxxx>;
> Petre Ovidiu PIRCALABU <ppircalabu@xxxxxxxxxxxxxxx>; George Dunlap
> <George.Dunlap@xxxxxxxxxx>
> Subject: [PATCH v13] x86/emulate: Send vm_event from emulate
>
> A/D bit writes (on page walks) can be considered benign by an introspection
> agent, so receiving vm_events for them is a pessimization. We try here to
> optimize by filtering these events out.
> Currently, we are fully emulating the instruction at RIP when the hardware
> sees
> an EPT fault with npfec.kind != npfec_kind_with_gla. This is, however,
> incorrect, because the instruction at RIP might legitimately cause an
> EPT fault of its own while accessing a _different_ page from the original one,
> where A/D were set.
> The solution is to perform the whole emulation while ignoring EPT
> restrictions for the walk part, but taking them into account for the
> "actual" emulation of the instruction at RIP. When we send out a vm_event,
> we don't want the emulation to complete, since in that case we won't be
> able to veto whatever it is doing; we would only be able to report on
> malicious activity, not prevent it.
> When we see a "send-vm_event" case while emulating, we need to first send the
> event out and then suspend the emulation (return X86EMUL_RETRY).
> After the emulation stops we'll call hvm_vm_event_do_resume() again after the
> introspection agent treats the event and resumes the guest. There, the
> instruction at RIP will be fully emulated (with the EPT ignored) if the
> introspection application allows it, and the guest will continue to run past
> the instruction.
>
> A common example: the hardware exits because of an EPT fault caused by a
> page walk, and p2m_mem_access_check() decides whether to send a vm_event.
> If a vm_event was sent, then once it has been handled the instruction at
> RIP is run, and that instruction might also hit a protected page and
> provoke a second vm_event.
>
> Now, if npfec.kind == npfec_kind_in_gpt and
> d->arch.monitor.inguest_pagefault_disabled is true, then we are in the
> page-walk case and can apply this optimization: emulate the page walk
> while ignoring the EPT, but do not ignore the EPT for the emulation of
> the actual instruction.
>
> In the first case we would get 2 EPT events; in the second case only 1,
> and only if the instruction at RIP itself triggers an EPT event.
>
> We use hvmemul_map_linear_addr() to intercept write access and
> __hvm_copy() to intercept exec, read and write access.
>
> A new return type was added, HVMTRANS_need_retry, in order to have all
> the places that consume HVMTRANS* return X86EMUL_RETRY.
>
> hvm_monitor_check_p2m() returns false if there was no violation, or if
> there was an error from monitor_traps() or p2m_get_mem_access().
> -ESRCH from p2m_get_mem_access() is treated as restricted access.
>
> NOTE: hvm_monitor_check_p2m() assumes the caller will enable/disable
> arch.vm_event->send_event.
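For readers following along, the pieces in the diff below fit together
roughly as sketched here (condensed from the mem_access.c, emulate.c and
hvm.c hunks; declarations, some flag checks and error handling omitted):

    /* p2m_mem_access_check(): page-walk fault with in-guest #PF events
     * disabled, so emulate instead of sending an event for the walk. */
    if ( vm_event_check_ring(d->vm_event_monitor) &&
         d->arch.monitor.inguest_pagefault_disabled &&
         npfec.kind == npfec_kind_in_gpt )
    {
        v->arch.vm_event->send_event = true;
        hvm_emulate_one_vm_event(EMUL_KIND_NORMAL, TRAP_invalid_op,
                                 X86_EVENT_NO_EC);
        v->arch.vm_event->send_event = false;
    }

    /* __hvm_copy() / hvmemul_map_linear_addr(): accesses made by the
     * emulated instruction itself are still checked; a violation sends
     * the vm_event and suspends the emulation (HVMTRANS_need_retry,
     * surfacing as X86EMUL_RETRY). */
    if ( unlikely(v->arch.vm_event) && v->arch.vm_event->send_event &&
         hvm_monitor_check_p2m(addr, gfn, pfec, npfec_kind_with_gla) )
        return HVMTRANS_need_retry;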
>
> Signed-off-by: Alexandru Isaila <aisaila@xxxxxxxxxxxxxxx>
>
emulate parts...
Acked-by: Paul Durrant <paul@xxxxxxx>
> ---
> Changes since V12:
> - Update commit message
> - Rework switch() in map_linear_addr()/rep_movs()/rep_stos().
> ---
> xen/arch/x86/hvm/emulate.c | 57 +++++++++++++++++-----
> xen/arch/x86/hvm/hvm.c | 9 ++++
> xen/arch/x86/hvm/intercept.c | 2 +
> xen/arch/x86/hvm/monitor.c | 78 +++++++++++++++++++++++++++++++
> xen/arch/x86/mm/mem_access.c | 9 +++-
> xen/arch/x86/mm/shadow/hvm.c | 1 +
> xen/include/asm-x86/hvm/monitor.h | 3 ++
> xen/include/asm-x86/hvm/support.h | 1 +
> xen/include/asm-x86/vm_event.h | 2 +
> 9 files changed, 149 insertions(+), 13 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 36bcb526d3..85ab009717 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -548,6 +548,7 @@ static void *hvmemul_map_linear_addr(
> unsigned int nr_frames = ((linear + bytes - !!bytes) >> PAGE_SHIFT) -
> (linear >> PAGE_SHIFT) + 1;
> unsigned int i;
> + gfn_t gfn;
>
> /*
> * mfn points to the next free slot. All used slots have a page reference
> @@ -582,7 +583,7 @@ static void *hvmemul_map_linear_addr(
> ASSERT(mfn_x(*mfn) == 0);
>
> res = hvm_translate_get_page(curr, addr, true, pfec,
> - &pfinfo, &page, NULL, &p2mt);
> + &pfinfo, &page, &gfn, &p2mt);
>
> switch ( res )
> {
> @@ -599,8 +600,15 @@ static void *hvmemul_map_linear_addr(
> err = NULL;
> goto out;
>
> - case HVMTRANS_gfn_paged_out:
> + case HVMTRANS_need_retry:
> + /*
> + * hvm_translate_get_page() does not return HVMTRANS_need_retry.
> + * It can be dropped if future work requires this.
> + */
> + ASSERT_UNREACHABLE();
> + /* fall through */
> case HVMTRANS_gfn_shared:
> + case HVMTRANS_gfn_paged_out:
> err = ERR_PTR(~X86EMUL_RETRY);
> goto out;
>
> @@ -626,6 +634,14 @@ static void *hvmemul_map_linear_addr(
>
> ASSERT(p2mt == p2m_ram_logdirty || !p2m_is_readonly(p2mt));
> }
> +
> + if ( unlikely(curr->arch.vm_event) &&
> + curr->arch.vm_event->send_event &&
> + hvm_monitor_check_p2m(addr, gfn, pfec, npfec_kind_with_gla) )
> + {
> + err = ERR_PTR(~X86EMUL_RETRY);
> + goto out;
> + }
> }
>
> /* Entire access within a single frame? */
> @@ -1141,6 +1157,7 @@ static int linear_read(unsigned long addr, unsigned int bytes, void *p_data,
>
> case HVMTRANS_gfn_paged_out:
> case HVMTRANS_gfn_shared:
> + case HVMTRANS_need_retry:
> return X86EMUL_RETRY;
> }
>
> @@ -1192,6 +1209,7 @@ static int linear_write(unsigned long addr, unsigned int bytes, void *p_data,
>
> case HVMTRANS_gfn_paged_out:
> case HVMTRANS_gfn_shared:
> + case HVMTRANS_need_retry:
> return X86EMUL_RETRY;
> }
>
> @@ -1852,19 +1870,27 @@ static int hvmemul_rep_movs(
>
> xfree(buf);
>
> - if ( rc == HVMTRANS_gfn_paged_out )
> - return X86EMUL_RETRY;
> - if ( rc == HVMTRANS_gfn_shared )
> - return X86EMUL_RETRY;
> - if ( rc != HVMTRANS_okay )
> + switch ( rc )
> {
> - gdprintk(XENLOG_WARNING, "Failed memory-to-memory REP MOVS: sgpa=%"
> - PRIpaddr" dgpa=%"PRIpaddr" reps=%lu bytes_per_rep=%u\n",
> - sgpa, dgpa, *reps, bytes_per_rep);
> - return X86EMUL_UNHANDLEABLE;
> + case HVMTRANS_need_retry:
> + /*
> + * hvm_copy_to_guest_phys() does not return HVMTRANS_need_retry.
> + * It can be dropped if future work requires this.
> + */
> + ASSERT_UNREACHABLE();
> + /* fall through */
> + case HVMTRANS_gfn_paged_out:
> + case HVMTRANS_gfn_shared:
> + return X86EMUL_RETRY;
> + case HVMTRANS_okay:
> + return X86EMUL_OKAY;
> }
>
> - return X86EMUL_OKAY;
> + gdprintk(XENLOG_WARNING, "Failed memory-to-memory REP MOVS: sgpa=%"
> + PRIpaddr" dgpa=%"PRIpaddr" reps=%lu bytes_per_rep=%u\n",
> + sgpa, dgpa, *reps, bytes_per_rep);
> +
> + return X86EMUL_UNHANDLEABLE;
> }
>
> static int hvmemul_rep_stos(
> @@ -1966,6 +1992,13 @@ static int hvmemul_rep_stos(
>
> switch ( rc )
> {
> + case HVMTRANS_need_retry:
> + /*
> + * hvm_copy_to_guest_phys() does not return HVMTRANS_need_retry.
> + * It can be dropped if future work requires this.
> + */
> + ASSERT_UNREACHABLE();
> + /* fall through */
> case HVMTRANS_gfn_paged_out:
> case HVMTRANS_gfn_shared:
> return X86EMUL_RETRY;
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index fdb1e17f59..c82e7b2cd3 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -3236,6 +3236,15 @@ static enum hvm_translation_result __hvm_copy(
> return HVMTRANS_bad_gfn_to_mfn;
> }
>
> + if ( unlikely(v->arch.vm_event) &&
> + (flags & HVMCOPY_linear) &&
> + v->arch.vm_event->send_event &&
> + hvm_monitor_check_p2m(addr, gfn, pfec, npfec_kind_with_gla) )
> + {
> + put_page(page);
> + return HVMTRANS_need_retry;
> + }
> +
> p = (char *)__map_domain_page(page) + (addr & ~PAGE_MASK);
>
> if ( flags & HVMCOPY_to_guest )
> diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
> index aac22c595d..90202bdcec 100644
> --- a/xen/arch/x86/hvm/intercept.c
> +++ b/xen/arch/x86/hvm/intercept.c
> @@ -145,6 +145,7 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
> case HVMTRANS_bad_linear_to_gfn:
> case HVMTRANS_gfn_paged_out:
> case HVMTRANS_gfn_shared:
> + case HVMTRANS_need_retry:
> ASSERT_UNREACHABLE();
> /* fall through */
> default:
> @@ -174,6 +175,7 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
> case HVMTRANS_bad_linear_to_gfn:
> case HVMTRANS_gfn_paged_out:
> case HVMTRANS_gfn_shared:
> + case HVMTRANS_need_retry:
> ASSERT_UNREACHABLE();
> /* fall through */
> default:
> diff --git a/xen/arch/x86/hvm/monitor.c b/xen/arch/x86/hvm/monitor.c
> index 2a41ccc930..7fb1e2c04e 100644
> --- a/xen/arch/x86/hvm/monitor.c
> +++ b/xen/arch/x86/hvm/monitor.c
> @@ -23,8 +23,10 @@
> */
>
> #include <xen/vm_event.h>
> +#include <xen/mem_access.h>
> #include <xen/monitor.h>
> #include <asm/hvm/monitor.h>
> +#include <asm/altp2m.h>
> #include <asm/monitor.h>
> #include <asm/paging.h>
> #include <asm/vm_event.h>
> @@ -215,6 +217,82 @@ void hvm_monitor_interrupt(unsigned int vector, unsigned int type,
> monitor_traps(current, 1, &req);
> }
>
> +/*
> + * Send memory access vm_events based on pfec. Returns true if the event was
> + * sent, and false on p2m_get_mem_access() error, when there is no violation,
> + * or when sending the event fails. Assumes the caller will enable/disable
> + * arch.vm_event->send_event.
> + */
> +bool hvm_monitor_check_p2m(unsigned long gla, gfn_t gfn, uint32_t pfec,
> + uint16_t kind)
> +{
> + xenmem_access_t access;
> + struct vcpu *curr = current;
> + vm_event_request_t req = {};
> + paddr_t gpa = (gfn_to_gaddr(gfn) | (gla & ~PAGE_MASK));
> + int rc;
> +
> + ASSERT(curr->arch.vm_event->send_event);
> +
> + /*
> + * p2m_get_mem_access() can fail for an invalid MFN and return -ESRCH,
> + * in which case access must be restricted.
> + */
> + rc = p2m_get_mem_access(curr->domain, gfn, &access, altp2m_vcpu_idx(curr));
> +
> + if ( rc == -ESRCH )
> + access = XENMEM_access_n;
> + else if ( rc )
> + return false;
> +
> + switch ( access )
> + {
> + case XENMEM_access_x:
> + case XENMEM_access_rx:
> + if ( pfec & PFEC_write_access )
> + req.u.mem_access.flags = MEM_ACCESS_R | MEM_ACCESS_W;
> + break;
> +
> + case XENMEM_access_w:
> + case XENMEM_access_rw:
> + if ( pfec & PFEC_insn_fetch )
> + req.u.mem_access.flags = MEM_ACCESS_X;
> + break;
> +
> + case XENMEM_access_r:
> + case XENMEM_access_n:
> + if ( pfec & PFEC_write_access )
> + req.u.mem_access.flags |= MEM_ACCESS_R | MEM_ACCESS_W;
> + if ( pfec & PFEC_insn_fetch )
> + req.u.mem_access.flags |= MEM_ACCESS_X;
> + break;
> +
> + case XENMEM_access_wx:
> + case XENMEM_access_rwx:
> + case XENMEM_access_rx2rw:
> + case XENMEM_access_n2rwx:
> + case XENMEM_access_default:
> + break;
> + }
> +
> + if ( !req.u.mem_access.flags )
> + return false; /* no violation */
> +
> + if ( kind == npfec_kind_with_gla )
> + req.u.mem_access.flags |= MEM_ACCESS_FAULT_WITH_GLA |
> + MEM_ACCESS_GLA_VALID;
> + else if ( kind == npfec_kind_in_gpt )
> + req.u.mem_access.flags |= MEM_ACCESS_FAULT_IN_GPT |
> + MEM_ACCESS_GLA_VALID;
> +
> +
> + req.reason = VM_EVENT_REASON_MEM_ACCESS;
> + req.u.mem_access.gfn = gfn_x(gfn);
> + req.u.mem_access.gla = gla;
> + req.u.mem_access.offset = gpa & ~PAGE_MASK;
> +
> + return monitor_traps(curr, true, &req) >= 0;
> +}
> +
> /*
> * Local variables:
> * mode: C
> diff --git a/xen/arch/x86/mm/mem_access.c b/xen/arch/x86/mm/mem_access.c
> index 0144f92b98..320b9fe621 100644
> --- a/xen/arch/x86/mm/mem_access.c
> +++ b/xen/arch/x86/mm/mem_access.c
> @@ -210,11 +210,18 @@ bool p2m_mem_access_check(paddr_t gpa, unsigned long gla,
> return true;
> }
> }
> +
> + /*
> + * Try to avoid sending a mem event. Suppress events caused by page walks
> + * by emulating them, while still checking for mem_access violations.
> + */
> if ( vm_event_check_ring(d->vm_event_monitor) &&
> d->arch.monitor.inguest_pagefault_disabled &&
> - npfec.kind != npfec_kind_with_gla ) /* don't send a mem_event */
> + npfec.kind == npfec_kind_in_gpt )
> {
> + v->arch.vm_event->send_event = true;
> hvm_emulate_one_vm_event(EMUL_KIND_NORMAL, TRAP_invalid_op,
> X86_EVENT_NO_EC);
> + v->arch.vm_event->send_event = false;
>
> return true;
> }
> diff --git a/xen/arch/x86/mm/shadow/hvm.c b/xen/arch/x86/mm/shadow/hvm.c
> index 0aa560b7f5..48dfad4557 100644
> --- a/xen/arch/x86/mm/shadow/hvm.c
> +++ b/xen/arch/x86/mm/shadow/hvm.c
> @@ -139,6 +139,7 @@ hvm_read(enum x86_segment seg,
> return X86EMUL_UNHANDLEABLE;
> case HVMTRANS_gfn_paged_out:
> case HVMTRANS_gfn_shared:
> + case HVMTRANS_need_retry:
> return X86EMUL_RETRY;
> }
>
> diff --git a/xen/include/asm-x86/hvm/monitor.h b/xen/include/asm-x86/hvm/monitor.h
> index f1af4f812a..325b44674d 100644
> --- a/xen/include/asm-x86/hvm/monitor.h
> +++ b/xen/include/asm-x86/hvm/monitor.h
> @@ -49,6 +49,9 @@ void hvm_monitor_interrupt(unsigned int vector, unsigned int type,
> unsigned int err, uint64_t cr2);
> bool hvm_monitor_emul_unimplemented(void);
>
> +bool hvm_monitor_check_p2m(unsigned long gla, gfn_t gfn, uint32_t pfec,
> + uint16_t kind);
> +
> #endif /* __ASM_X86_HVM_MONITOR_H__ */
>
> /*
> diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
> index e989aa7349..1500e6c94b 100644
> --- a/xen/include/asm-x86/hvm/support.h
> +++ b/xen/include/asm-x86/hvm/support.h
> @@ -61,6 +61,7 @@ enum hvm_translation_result {
> HVMTRANS_unhandleable,
> HVMTRANS_gfn_paged_out,
> HVMTRANS_gfn_shared,
> + HVMTRANS_need_retry,
> };
>
> /*
> diff --git a/xen/include/asm-x86/vm_event.h b/xen/include/asm-x86/vm_event.h
> index 23e655710b..66db9e1e25 100644
> --- a/xen/include/asm-x86/vm_event.h
> +++ b/xen/include/asm-x86/vm_event.h
> @@ -36,6 +36,8 @@ struct arch_vm_event {
> bool set_gprs;
> /* A sync vm_event has been sent and we're not done handling it. */
> bool sync_event;
> + /* Send mem access events from emulator */
> + bool send_event;
> };
>
> int vm_event_init_domain(struct domain *d);
> --
> 2.17.1
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel