
Re: [PATCH 3/5] x86/hvm: fix handling of accesses to partial r/o MMIO pages



On Tue, Apr 15, 2025 at 11:41:27AM +0200, Jan Beulich wrote:
> On 15.04.2025 10:34, Roger Pau Monné wrote:
> > On Tue, Apr 15, 2025 at 09:32:37AM +0200, Jan Beulich wrote:
> >> On 14.04.2025 18:13, Roger Pau Monné wrote:
> >>> On Mon, Apr 14, 2025 at 05:24:32PM +0200, Jan Beulich wrote:
> >>>> On 14.04.2025 15:53, Roger Pau Monné wrote:
> >>>>> On Mon, Apr 14, 2025 at 08:33:44AM +0200, Jan Beulich wrote:
> >>>>>> I'm also concerned of e.g. VT-x'es APIC access MFN, which is
> >>>>>> p2m_mmio_direct.
> >>>>>
> >>>>> But that won't go into hvm_hap_nested_page_fault() when using
> >>>>> cpu_has_vmx_virtualize_apic_accesses (and thus having an APIC page
> >>>>> mapped as p2m_mmio_direct)?
> >>>>>
> >>>>> It would instead be an EXIT_REASON_APIC_ACCESS vmexit which is handled
> >>>>> differently?
> >>>>
> >>>> All true as long as things work as expected (potentially including
> >>>> the guest also behaving as expected). Also this was explicitly only
> >>>> an example I could readily think of. I'm simply wary of
> >>>> handle_mmio_with_translation() now getting to handle things it's not
> >>>> meant to ever see.
> >>>
> >>> How was access to MMIO r/o regions supposed to be handled before
> >>> 33c19df9a5a0 (~2015)?  I see that setting r/o MMIO p2m entries was
> >>> added to p2m_type_to_flags() and ept_p2m_type_to_flags() much earlier
> >>> (~2010), yet I can't figure out how writes could have been handled
> >>> back then without resulting in a p2m fault and crashing of the domain.
> >>
> >> Was that handled at all before said change?
> > 
> > Not really AFAICT, hence me wondering how write accesses to r/o MMIO
> > regions by (non-priv) domains were supposed to be handled.  Was the
> > expectation that those writes trigger a p2m violation, thus crashing
> > the domain?
> 
> I think so, yes. Devices with such special areas weren't (aren't?) supposed
> to be handed to DomU-s.

Oh, I see.  That makes things a bit clearer.  I think we would then
also want to add some checks to {,ept_}p2m_type_to_flags()?
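
Something along these lines is what I have in mind for
p2m_type_to_flags() (an untested sketch from memory just to
illustrate, not a proposal; ept_p2m_type_to_flags() would need the
equivalent):

    case p2m_mmio_direct:
        if ( !rangeset_contains_singleton(mmio_ro_ranges, mfn_x(mfn)) )
            flags |= _PAGE_RW;
        else if ( !is_hardware_domain(p2m->domain) )
            /*
             * Sketch only: so far r/o MMIO ranges are only expected for
             * the hardware domain, so at least warn if a DomU ends up
             * with such a mapping.
             */
            printk(XENLOG_G_WARNING
                   "%pd: r/o MMIO mfn %" PRI_mfn " mapped\n",
                   p2m->domain, mfn_x(mfn));
        break;

Whether to just warn there or to refuse the mapping for DomUs
altogether is a separate question.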

I wonder why handling of mmio_ro_ranges was added to the HVM p2m code
in ~2010 then.  If mmio_ro_ranges is only supposed to be relevant for
the hardware domain, an HVM dom0 wasn't even in sight back in ~2010?

Sorry to ask so many questions, I'm a bit confused about how this
was/is supposed to work.

> >> mmio_ro_do_page_fault() was
> >> (and still is) invoked for the hardware domain only, and quite likely
> >> the need for handling (discarding) writes for PVHv1 had been overlooked
> >> until someone was hit by the lack thereof.
> > 
> > I see, I didn't realize r/o MMIO was only handled for the PV hardware
> > domain.  I could arguably do the same for HVM in
> > hvm_hap_nested_page_fault().
> > 
> > Not sure whether the subpage stuff is supposed to be functional for
> > domains other than the hardware domain?  It seems to be available to
> > the hardware domain only for PV guests, while for HVM it is available
> > for both PV and HVM domains:
> 
> DYM Dom0 and DomU here?

Indeed, sorry.

> > is_hardware_domain(currd) || subpage_mmio_write_accept(mfn, gla)
> > 
> > In hvm_hap_nested_page_fault().
> 
> See the three XHCI_SHARE_* modes. When it's XHCI_SHARE_ANY, even DomU-s
> would require this handling. It looks like a mistake that we permit the
> path to be taken for DomU-s even when the mode is XHCI_SHARE_HWDOM.

Arguably a domU will never get the device assigned in the first place
unless the share mode is set to XHCI_SHARE_ANY.  For the other modes
the device is hidden, and hence couldn't be assigned to a domU anyway.

> It
> also looks like a mistake that the PV path has remained Dom0-only, even
> in the XHCI_SHARE_ANY case. Cc-ing Marek once again ...
> 
> >>> I'm happy to look at other ways to handling this, but given there's
> >>> current logic for handling accesses to read-only regions in
> >>> hvm_hap_nested_page_fault() I think re-using that was the best way to
> >>> also handle accesses to MMIO read-only regions.
> >>>
> >>> Arguably it would already be the case that for other reasons Xen would
> >>> need to emulate an instruction that accesses a read-only MMIO region?
> >>
> >> Aiui hvm_translate_get_page() will yield HVMTRANS_bad_gfn_to_mfn for
> >> p2m_mmio_direct (after all, "direct" means we expect no emulation is
> >> needed; while arguably wrong for the introspection case, I'm not sure
> >> that and pass-through actually go together). Hence it's down to
> >> hvmemul_linear_mmio_access() -> hvmemul_phys_mmio_access() ->
> >> hvmemul_do_mmio_buffer() -> hvmemul_do_io_buffer() -> hvmemul_do_io(),
> >> which means that if hvm_io_intercept() can't handle it, the access
> >> will be forwarded to the responsible DM, or be "processed" by the
> >> internal null handler.
> >>
> >> Given this, perhaps what you do is actually fine. At the same time
> >> note how several functions in hvm/emulate.c simply fail upon
> >> encountering p2m_mmio_direct. These are all REP handlers though, so
> >> the main emulator would then try emulating the insn the non-REP way.
> > 
> > I'm open to alternative ways of handling such accesses, just used what
> > seemed more natural in the context of hvm_hap_nested_page_fault().
> > 
> > Emulation of r/o MMIO accesses failing wouldn't be an issue from Xen's
> > perspective; that would "just" result in the guest getting a #GP
> > injected.
> 
> That's not the part I'm worried about. What worries me is that we open up
> another (or better: we're widening a) way to hit the emulator in the first
> place. (Plus, as said, the issue with the not really tidy P2M type system.)

But the hit would be limited to domains having r/o p2m_mmio_direct
entries in the p2m, as otherwise the path would be unreachable?
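
To be more concrete, the shape of the check added to
hvm_hap_nested_page_fault() is roughly the following (simplified and
from memory, not the literal hunk):

    /*
     * Simplified illustration: the emulation path is only entered for
     * write faults hitting a p2m_mmio_direct entry covered by
     * mmio_ro_ranges, and even then only for the hardware domain or
     * when the subpage r/o infrastructure accepts the access.
     */
    if ( p2mt == p2m_mmio_direct && npfec.write_access &&
         rangeset_contains_singleton(mmio_ro_ranges, mfn_x(mfn)) &&
         (is_hardware_domain(currd) || subpage_mmio_write_accept(mfn, gla)) )
    {
        if ( !handle_mmio_with_translation(gla, gfn, npfec) )
            hvm_inject_hw_exception(X86_EXC_GP, 0);
        rc = 1;
        goto out_put_gfn;
    }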

> > Would you like me to add some of your reasoning above to the commit
> > message?
> 
> While I'd still be a little hesitant as to ack-ing of the result, I think
> that's all I'm really asking for, yes.

As said before, I'm happy to consider suggestions here; I don't want
to fix this with yet another bodge that will cause us further issues
down the road.  What I proposed seemed like the most natural way to
handle those accesses IMO, but I'm not an expert on the emulator.

Thanks, Roger.
