Xen project Mailing List

Re: [Xen-devel] [PATCH 6/6] x86/hvm: Implement hvmemul_write() using real mappings

To: "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx>

From: "Jan Beulich" <JBeulich@xxxxxxxx>

Date: Thu, 22 Jun 2017 03:06:11 -0600

Cc: Mihai Donțu <mdontu@xxxxxxxxxxxxxxx>, Paul Durrant <paul.durrant@xxxxxxxxxx>, Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxx>

Delivery-date: Thu, 22 Jun 2017 09:06:35 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

>>> On 21.06.17 at 17:12, <andrew.cooper3@xxxxxxxxxx> wrote:
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -498,6 +498,159 @@ static int hvmemul_do_mmio_addr(paddr_t mmio_gpa,
> }
>
> /*
> + * Map the frame(s) covering an individual linear access, for writeable
> + * access. May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors
> + * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings.
> + *
> + * In debug builds, map() checks that each slot in hvmemul_ctxt->mfn[] is
> + * clean before use, and poisions unused slots with INVALID_MFN.
> + */
> +static void *hvmemul_map_linear_addr(
> +    unsigned long linear, unsigned int bytes, uint32_t pfec,
> +    struct hvm_emulate_ctxt *hvmemul_ctxt)
> +{
> +    struct vcpu *curr = current;
> +    void *err, *mapping;
> +
> +    /* First and final gfns which need mapping. */
> +    unsigned long frame = linear >> PAGE_SHIFT, first = frame;
> +    unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT;

I second Paul's desire for the zero bytes case to be rejected up
the call stack.

> +    /*
> +     * mfn points to the next free slot. All used slots have a page reference
> +     * held on them.
> +     */
> +    mfn_t *mfn = &hvmemul_ctxt->mfn[0];
> +
> +    /*
> +     * The caller has no legitimate reason for trying a zero-byte write, but
> +     * final is calculate to fail safe in release builds.
> +     *
> +     * The maximum write size depends on the number of adjacent mfns[] which
> +     * can be vmap()'d, accouting for possible misalignment within the region.
> +     * The higher level emulation callers are responsible for ensuring that
> +     * mfns[] is large enough for the requested write size.
> +     */
> +    if ( bytes == 0 ||
> +         final - first > ARRAY_SIZE(hvmemul_ctxt->mfn) - 1 )

Any reason not to use ">= ARRAY_SIZE(hvmemul_ctxt->mfn)"?

> +    {
> +        ASSERT_UNREACHABLE();
> +        goto unhandleable;
> +    }
> +
> +    do {
> +        enum hvm_translation_result res;
> +        struct page_info *page;
> +        pagefault_info_t pfinfo;
> +        p2m_type_t p2mt;
> +
> +        res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec,
> +                                     &pfinfo, &page, NULL, &p2mt);
> +
> +        switch ( res )
> +        {
> +        case HVMTRANS_okay:
> +            break;
> +
> +        case HVMTRANS_bad_linear_to_gfn:
> +            x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
> +            err = ERR_PTR(~(long)X86EMUL_EXCEPTION);
> +            goto out;
> +
> +        case HVMTRANS_bad_gfn_to_mfn:
> +            err = NULL;
> +            goto out;
> +
> +        case HVMTRANS_gfn_paged_out:
> +        case HVMTRANS_gfn_shared:
> +            err = ERR_PTR(~(long)X86EMUL_RETRY);
> +            goto out;
> +
> +        default:
> +            goto unhandleable;
> +        }
> +
> +        /* Error checking. Confirm that the current slot is clean. */
> +        ASSERT(mfn_x(*mfn) == 0);

Wouldn't this better be done first thing in the loop? And wouldn't
the value better be INVALID_MFN?

> +        *mfn++ = _mfn(page_to_mfn(page));
> +        frame++;
> +
> +        if ( p2m_is_discard_write(p2mt) )
> +        {
> +            err = ERR_PTR(~(long)X86EMUL_OKAY);
> +            goto out;

If one page is discard-write and the other isn't, this will end up
being wrong.

> +        }
> +
> +    } while ( frame < final );
> +
> +    /* Entire access within a single frame? */
> +    if ( first == final )
> +        mapping = map_domain_page(hvmemul_ctxt->mfn[0]) + (linear & ~PAGE_MASK);
> +    /* Multiple frames? Need to vmap(). */
> +    else if ( (mapping = vmap(hvmemul_ctxt->mfn,
> +                              mfn - hvmemul_ctxt->mfn)) == NULL )

final - first + 1 would likely yield better code.

> +        goto unhandleable;
> +
> +#ifndef NDEBUG /* Poision unused mfn[]s with INVALID_MFN. */
> +    while ( mfn < hvmemul_ctxt->mfn + ARRAY_SIZE(hvmemul_ctxt->mfn) )
> +    {
> +        ASSERT(mfn_x(*mfn) == 0);
> +        *mfn++ = INVALID_MFN;
> +    }
> +#endif
> +
> +    return mapping;
> +
> + unhandleable:
> +    err = ERR_PTR(~(long)X86EMUL_UNHANDLEABLE);
> +
> + out:
> +    /* Drop all held references. */
> +    while ( mfn > hvmemul_ctxt->mfn )
> +        put_page(mfn_to_page(mfn_x(*mfn--)));

ITYM

    while ( mfn-- > hvmemul_ctxt->mfn )
        put_page(mfn_to_page(mfn_x(*mfn)));

or

    while ( mfn > hvmemul_ctxt->mfn )
        put_page(mfn_to_page(mfn_x(*--mfn)));

> +static void hvmemul_unmap_linear_addr(
> +    void *mapping, unsigned long linear, unsigned int bytes,

Both vunmap() and unmap_domain_page() take pointers to const, so
please use const on the pointer here too.

> +    struct hvm_emulate_ctxt *hvmemul_ctxt)

There are upsides and downsides to requiring the caller to pass in the
same values as to map(): You can do more correctness checking
here, but you also risk the caller using the wrong values (perhaps
because of a meanwhile updated local variable). While I don't
outright object to this approach, personally I'd prefer minimal
inputs here, and the code deriving everything from hvmemul_ctxt.

> @@ -1007,23 +1160,15 @@ static int hvmemul_write(
>           (vio->mmio_gla == (addr & PAGE_MASK)) )
>          return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 1);
>
> -    rc = hvm_copy_to_guest_linear(addr, p_data, bytes, pfec, &pfinfo);
> -
> -    switch ( rc )
> -    {
> -    case HVMTRANS_okay:
> -        break;
> -    case HVMTRANS_bad_linear_to_gfn:
> -        x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
> -        return X86EMUL_EXCEPTION;
> -    case HVMTRANS_bad_gfn_to_mfn:
> +    mapping = hvmemul_map_linear_addr(addr, bytes, pfec, hvmemul_ctxt);
> +    if ( IS_ERR(mapping) )
> +        return ~PTR_ERR(mapping);
> +    else if ( !mapping )

Pointless "else".

>          return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 0);

Considering the 2nd linear -> guest-phys translation done here,
did you consider having hvmemul_map_linear_addr() obtain and
provide the GFNs?

> --- a/xen/include/asm-x86/hvm/emulate.h
> +++ b/xen/include/asm-x86/hvm/emulate.h
> @@ -37,6 +37,13 @@ struct hvm_emulate_ctxt {
>      unsigned long seg_reg_accessed;
>      unsigned long seg_reg_dirty;
>
> +    /*
> +     * MFNs behind temporary mappings in the write callback. The length is
> +     * arbitrary, and can be increased if writes longer than PAGE_SIZE are
> +     * needed.
> +     */
> +    mfn_t mfn[2];

Mind being precise in the comment, saying "PAGE_SIZE+1"?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
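
Editorial illustration of the "ITYM" remark on the reference-dropping loop: in the quoted patch, mfn points at the next free slot, so `*mfn--` reads that unused slot first (possibly one past the end of the array) and stops before releasing slot 0. The sketch below is plain standalone C, not Xen code, and only models the pointer arithmetic. The names fake_page, put_ref(), drop_refs() and acquire_then_unwind() are hypothetical stand-ins invented for this example; they are assumptions, not anything from the Xen tree.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 2  /* mirrors the two-entry mfn[] array in the patch */

/* Hypothetical stand-in for a page carrying a reference count. */
struct fake_page {
    int refs;
};

/* Hypothetical stand-in for put_page(): drop one reference. */
static void put_ref(struct fake_page *pg)
{
    pg->refs--;
}

/*
 * Drop the references held in slots [base, next), where 'next' points at
 * the next FREE slot, i.e. one past the last used one.  Pre-decrementing
 * steps back onto a used slot before it is dereferenced, so the free slot
 * is never read and slot 0 is released too.
 */
static void drop_refs(struct fake_page **base, struct fake_page **next)
{
    while ( next > base )
        put_ref(*--next);
}

/*
 * Take a reference on the first 'used' pages, then unwind as an error
 * path would.  Returns the number of pages still holding a reference
 * (or with an unbalanced count) afterwards; 0 means a clean unwind.
 */
static int acquire_then_unwind(unsigned int used)
{
    struct fake_page pages[NR_SLOTS] = { { 0 }, { 0 } };
    struct fake_page *slots[NR_SLOTS] = { NULL, NULL };
    struct fake_page **next = slots;
    unsigned int i;
    int leaked = 0;

    for ( i = 0; i < used && i < NR_SLOTS; i++ )
    {
        pages[i].refs++;        /* models the reference taken per frame */
        *next++ = &pages[i];    /* 'next' now points at the next free slot */
    }

    drop_refs(slots, next);

    for ( i = 0; i < NR_SLOTS; i++ )
        if ( pages[i].refs != 0 )
            leaked++;

    return leaked;
}
```

Calling acquire_then_unwind() with 0, 1 or 2 used slots should leave no page with an outstanding reference. The alternative spelling from the review, `while ( next-- > base ) put_ref(*next);`, releases the same slots; the only difference is that it leaves the pointer one element before the array once the loop exits.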
