RE: [PATCH v3] x86/HVM: more consistently set I/O completion
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: Thursday, August 27, 2020 3:09 PM
>
> Doing this just in hvm_emulate_one_insn() is not enough.
> hvm_ud_intercept() and hvm_emulate_one_vm_event() can get invoked for
> insns requiring one or more continuations, and at least in principle
> hvm_emulate_one_mmio() could, too. Without proper setting of the field,
> handle_hvm_io_completion() will do nothing completion-wise, and in
> particular the missing re-invocation of the insn emulation paths will
> lead to emulation caching not getting disabled in due course, causing
> the ASSERT() in {svm,vmx}_vmenter_helper() to trigger.
>
> Reported-by: Don Slutz <don.slutz@xxxxxxxxx>
>
> Similar considerations go for the clearing of vio->mmio_access, which
> gets moved as well.
>
> Additionally all updating of vio->mmio_* now gets done dependent upon
> the new completion value, rather than hvm_ioreq_needs_completion()'s
> return value. This is because it is the completion chosen which controls
> what path will be taken when handling the completion, not the simple
> boolean return value. In particular, PIO completion doesn't involve
> going through the insn emulator, and hence emulator state ought to get
> cleared early (or it won't get cleared at all).
>
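(To make the above concrete: condensed, the per-completion state handling
this patch introduces in _hvm_emulate_one() - a paraphrase of the hunk
further down, not a verbatim copy - looks like:

    switch ( vio->io_completion = completion )
    {
    case HVMIO_no_completion:
    case HVMIO_pio_completion:
        /* No re-entry into the insn emulator: drop its state now. */
        vio->mmio_cache_count = 0;
        vio->mmio_insn_bytes = 0;
        vio->mmio_access = (struct npfec){};
        hvmemul_cache_disable(curr);
        break;

    case HVMIO_mmio_completion:
    case HVMIO_realmode_completion:
        /* The emulator will be re-entered: latch the insn bytes. */
        vio->mmio_insn_bytes = hvmemul_ctxt->insn_buf_bytes;
        memcpy(vio->mmio_insn, hvmemul_ctxt->insn_buf, vio->mmio_insn_bytes);
        break;
    }
)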
> The new logic, besides allowing for a caller override for the
> continuation type to be set (for VMX real mode emulation), will also
> avoid setting an MMIO completion when a simpler PIO one will do. This
> is a minor optimization only as a side effect - the behavior is strictly
> needed at least for hvm_ud_intercept(), as only memory accesses can
> successfully complete through handle_mmio(). Care of course needs to be
> taken to correctly deal with "mixed" insns (doing both MMIO and PIO at
> the same time, i.e. INS/OUTS). For this, hvmemul_validate() now latches
> whether the insn being emulated is a memory access, as this information
> is no longer easily available at the point where we want to consume it.
>
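(And the auto-selection itself, when the caller doesn't force a completion
type - again paraphrasing the hunk below, with is_mem_access being the
field hvmemul_validate() now latches:

    if ( !hvm_ioreq_needs_completion(&vio->io_req) )
        completion = HVMIO_no_completion;
    else if ( completion == HVMIO_no_completion )
        completion = (vio->io_req.type != IOREQ_TYPE_PIO ||
                      hvmemul_ctxt->is_mem_access) /* INS/OUTS do both */
                     ? HVMIO_mmio_completion : HVMIO_pio_completion;
)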
> Note that the presence of non-NULL .validate fields in the two ops
> structures in hvm_emulate_one_mmio() was really necessary even before
> the changes here: Without this, passing non-NULL as middle argument to
> hvm_emulate_init_once() is meaningless.
>
> The restrictions on when the #UD intercept gets actually enabled are why
> it was decided that this is not a security issue:
> - the "hvm_fep" option to enable its use is a debugging option only,
> - the cross-vendor case is considered experimental, even if
> unfortunately SUPPORT.md doesn't have an explicit statement about
> this.
> The other two affected functions are
> - hvm_emulate_one_vm_event(), used for introspection,
> - hvm_emulate_one_mmio(), used for Dom0 only,
> which don't qualify this as needing an XSA either.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> Tested-by: Don Slutz <don.slutz@xxxxxxxxx>
Reviewed-by: Kevin Tian <kevin.tian@xxxxxxxxx>
> ---
> v3: Add comment ahead of _hvm_emulate_one(). Add parentheses in a
> conditional expr. Justify why this does not need an XSA.
> v2: Make updating of vio->mmio_* fields fully driven by the new
> completion value.
> ---
> I further think that the entire tail of _hvm_emulate_one() (everything
> past the code changed/added there by this patch) wants skipping in case
> a completion is needed, at the very least for the mmio and realmode
> cases, where we know we'll come back here.
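(Not part of this patch, and the exact placement and return value would
need working out, but purely to illustrate that idea - something along
the lines of:

    if ( vio->io_completion == HVMIO_mmio_completion ||
         vio->io_completion == HVMIO_realmode_completion )
        return rc; /* hypothetical: defer retire/writeback to re-entry */
)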
>
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1683,9 +1683,11 @@ static int hvmemul_validate(
> const struct x86_emulate_state *state,
> struct x86_emulate_ctxt *ctxt)
> {
> - const struct hvm_emulate_ctxt *hvmemul_ctxt =
> + struct hvm_emulate_ctxt *hvmemul_ctxt =
> container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
>
> + hvmemul_ctxt->is_mem_access = x86_insn_is_mem_access(state, ctxt);
> +
> return !hvmemul_ctxt->validate || hvmemul_ctxt->validate(state, ctxt)
> ? X86EMUL_OKAY : X86EMUL_UNHANDLEABLE;
> }
> @@ -2610,8 +2612,13 @@ static const struct x86_emulate_ops hvm_
> .vmfunc = hvmemul_vmfunc,
> };
>
> +/*
> + * Note that passing HVMIO_no_completion into this function serves as kind
> + * of (but not fully) an "auto select completion" indicator.
> + */
> static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
> - const struct x86_emulate_ops *ops)
> + const struct x86_emulate_ops *ops,
> + enum hvm_io_completion completion)
> {
> const struct cpu_user_regs *regs = hvmemul_ctxt->ctxt.regs;
> struct vcpu *curr = current;
> @@ -2642,16 +2649,31 @@ static int _hvm_emulate_one(struct hvm_e
> rc = X86EMUL_RETRY;
>
> if ( !hvm_ioreq_needs_completion(&vio->io_req) )
> + completion = HVMIO_no_completion;
> + else if ( completion == HVMIO_no_completion )
> + completion = (vio->io_req.type != IOREQ_TYPE_PIO ||
> + hvmemul_ctxt->is_mem_access) ? HVMIO_mmio_completion
> + : HVMIO_pio_completion;
> +
> + switch ( vio->io_completion = completion )
> {
> + case HVMIO_no_completion:
> + case HVMIO_pio_completion:
> vio->mmio_cache_count = 0;
> vio->mmio_insn_bytes = 0;
> + vio->mmio_access = (struct npfec){};
> hvmemul_cache_disable(curr);
> - }
> - else
> - {
> + break;
> +
> + case HVMIO_mmio_completion:
> + case HVMIO_realmode_completion:
> BUILD_BUG_ON(sizeof(vio->mmio_insn) < sizeof(hvmemul_ctxt->insn_buf));
> vio->mmio_insn_bytes = hvmemul_ctxt->insn_buf_bytes;
> memcpy(vio->mmio_insn, hvmemul_ctxt->insn_buf, vio->mmio_insn_bytes);
> + break;
> +
> + default:
> + ASSERT_UNREACHABLE();
> }
>
> if ( hvmemul_ctxt->ctxt.retire.singlestep )
> @@ -2692,9 +2714,10 @@ static int _hvm_emulate_one(struct hvm_e
> }
>
> int hvm_emulate_one(
> - struct hvm_emulate_ctxt *hvmemul_ctxt)
> + struct hvm_emulate_ctxt *hvmemul_ctxt,
> + enum hvm_io_completion completion)
> {
> - return _hvm_emulate_one(hvmemul_ctxt, &hvm_emulate_ops);
> + return _hvm_emulate_one(hvmemul_ctxt, &hvm_emulate_ops, completion);
> }
>
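(For reference, the call sites in the remaining hunks then either let
_hvm_emulate_one() pick the completion or, for VMX real mode, force it -
both lines taken from later in this very patch:

    rc = hvm_emulate_one(&ctxt, HVMIO_no_completion);              /* auto */
    rc = hvm_emulate_one(hvmemul_ctxt, HVMIO_realmode_completion);
)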
> int hvm_emulate_one_mmio(unsigned long mfn, unsigned long gla)
> @@ -2703,11 +2726,13 @@ int hvm_emulate_one_mmio(unsigned long m
> .read = x86emul_unhandleable_rw,
> .insn_fetch = hvmemul_insn_fetch,
> .write = mmcfg_intercept_write,
> + .validate = hvmemul_validate,
> };
> static const struct x86_emulate_ops hvm_ro_emulate_ops_mmio = {
> .read = x86emul_unhandleable_rw,
> .insn_fetch = hvmemul_insn_fetch,
> .write = mmio_ro_emulated_write,
> + .validate = hvmemul_validate,
> };
> struct mmio_ro_emulate_ctxt mmio_ro_ctxt = { .cr2 = gla };
> struct hvm_emulate_ctxt ctxt;
> @@ -2727,8 +2752,8 @@ int hvm_emulate_one_mmio(unsigned long m
> hvm_emulate_init_once(&ctxt, x86_insn_is_mem_write,
> guest_cpu_user_regs());
> ctxt.ctxt.data = &mmio_ro_ctxt;
> - rc = _hvm_emulate_one(&ctxt, ops);
> - switch ( rc )
> +
> + switch ( rc = _hvm_emulate_one(&ctxt, ops, HVMIO_no_completion) )
> {
> case X86EMUL_UNHANDLEABLE:
> case X86EMUL_UNIMPLEMENTED:
> @@ -2755,7 +2780,8 @@ void hvm_emulate_one_vm_event(enum emul_
> switch ( kind )
> {
> case EMUL_KIND_NOWRITE:
> - rc = _hvm_emulate_one(&ctx, &hvm_emulate_ops_no_write);
> + rc = _hvm_emulate_one(&ctx, &hvm_emulate_ops_no_write,
> + HVMIO_no_completion);
> break;
> case EMUL_KIND_SET_CONTEXT_INSN: {
> struct vcpu *curr = current;
> @@ -2776,7 +2802,7 @@ void hvm_emulate_one_vm_event(enum emul_
> /* Fall-through */
> default:
> ctx.set_context = (kind == EMUL_KIND_SET_CONTEXT_DATA);
> - rc = hvm_emulate_one(&ctx);
> + rc = hvm_emulate_one(&ctx, HVMIO_no_completion);
> }
>
> switch ( rc )
> @@ -2874,6 +2900,8 @@ void hvm_emulate_init_per_insn(
> pfec, NULL) == HVMTRANS_okay) ?
> sizeof(hvmemul_ctxt->insn_buf) : 0;
> }
> +
> + hvmemul_ctxt->is_mem_access = false;
> }
>
> void hvm_emulate_writeback(
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -3798,7 +3798,7 @@ void hvm_ud_intercept(struct cpu_user_re
> return;
> }
>
> - switch ( hvm_emulate_one(&ctxt) )
> + switch ( hvm_emulate_one(&ctxt, HVMIO_no_completion) )
> {
> case X86EMUL_UNHANDLEABLE:
> case X86EMUL_UNIMPLEMENTED:
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -81,20 +81,11 @@ void send_invalidate_req(void)
> bool hvm_emulate_one_insn(hvm_emulate_validate_t *validate, const char *descr)
> {
> struct hvm_emulate_ctxt ctxt;
> - struct vcpu *curr = current;
> - struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
> int rc;
>
> hvm_emulate_init_once(&ctxt, validate, guest_cpu_user_regs());
>
> - rc = hvm_emulate_one(&ctxt);
> -
> - if ( hvm_ioreq_needs_completion(&vio->io_req) )
> - vio->io_completion = HVMIO_mmio_completion;
> - else
> - vio->mmio_access = (struct npfec){};
> -
> - switch ( rc )
> + switch ( rc = hvm_emulate_one(&ctxt, HVMIO_no_completion) )
> {
> case X86EMUL_UNHANDLEABLE:
> hvm_dump_emulation_state(XENLOG_G_WARNING, descr, &ctxt, rc);
> --- a/xen/arch/x86/hvm/vmx/realmode.c
> +++ b/xen/arch/x86/hvm/vmx/realmode.c
> @@ -97,15 +97,11 @@ static void realmode_deliver_exception(
> void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
> {
> struct vcpu *curr = current;
> - struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
> int rc;
>
> perfc_incr(realmode_emulations);
>
> - rc = hvm_emulate_one(hvmemul_ctxt);
> -
> - if ( hvm_ioreq_needs_completion(&vio->io_req) )
> - vio->io_completion = HVMIO_realmode_completion;
> + rc = hvm_emulate_one(hvmemul_ctxt, HVMIO_realmode_completion);
>
> if ( rc == X86EMUL_UNHANDLEABLE )
> {
> --- a/xen/include/asm-x86/hvm/emulate.h
> +++ b/xen/include/asm-x86/hvm/emulate.h
> @@ -48,6 +48,8 @@ struct hvm_emulate_ctxt {
>
> uint32_t intr_shadow;
>
> + bool is_mem_access;
> +
> bool_t set_context;
> };
>
> @@ -62,7 +64,8 @@ bool __nonnull(1, 2) hvm_emulate_one_ins
> hvm_emulate_validate_t *validate,
> const char *descr);
> int hvm_emulate_one(
> - struct hvm_emulate_ctxt *hvmemul_ctxt);
> + struct hvm_emulate_ctxt *hvmemul_ctxt,
> + enum hvm_io_completion completion);
> void hvm_emulate_one_vm_event(enum emul_kind kind,
> unsigned int trapnr,
> unsigned int errcode);