
Re: [PATCH 3/5] x86/hvm: fix handling of accesses to partial r/o MMIO pages



On Tue, Apr 15, 2025 at 11:41:27AM +0200, Jan Beulich wrote:
> On 15.04.2025 10:34, Roger Pau Monné wrote:
> > On Tue, Apr 15, 2025 at 09:32:37AM +0200, Jan Beulich wrote:
> >> On 14.04.2025 18:13, Roger Pau Monné wrote:
> >>> On Mon, Apr 14, 2025 at 05:24:32PM +0200, Jan Beulich wrote:
> >>>> On 14.04.2025 15:53, Roger Pau Monné wrote:
> >>>>> On Mon, Apr 14, 2025 at 08:33:44AM +0200, Jan Beulich wrote:
> >>>>>> I'm also concerned of e.g. VT-x'es APIC access MFN, which is
> >>>>>> p2m_mmio_direct.
> >>>>>
> >>>>> But that won't go into hvm_hap_nested_page_fault() when using
> >>>>> cpu_has_vmx_virtualize_apic_accesses (and thus having an APIC page
> >>>>> mapped as p2m_mmio_direct)?
> >>>>>
> >>>>> It would instead be an EXIT_REASON_APIC_ACCESS vmexit which is handled
> >>>>> differently?
> >>>>
> >>>> All true as long as things work as expected (potentially including
> >>>> the guest also behaving as expected). Also this was explicitly only
> >>>> an example I could readily think of. I'm simply wary of
> >>>> handle_mmio_with_translation() now getting to handle things it's not
> >>>> meant to ever see.
> >>>
> >>> How was access to MMIO r/o regions supposed to be handled before
> >>> 33c19df9a5a0 (~2015)?  I see that setting r/o MMIO p2m entries was
> >>> added to p2m_type_to_flags() and ept_p2m_type_to_flags() much earlier
> >>> (~2010), yet I can't figure out how writes could have been handled
> >>> back then without resulting in a p2m fault and crashing of the domain.
> >>
> >> Was that handled at all before said change?
> > 
> > Not really AFAICT, hence me wondering how write accesses to r/o MMIO
> > regions by (non-priv) domains were supposed to be handled.  Was the
> > expectation that those writes trigger a p2m violation, thus crashing
> > the domain?
> 
> I think so, yes. Devices with such special areas weren't (aren't?) supposed
> to be handed to DomU-s.

Oh, I see.  That makes things a bit clearer.  I think we would then
also want to add some checks to {,ept_}p2m_type_to_flags()?
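
Something along these lines is what I have in mind for
p2m_type_to_flags() (an untested sketch from memory just to
illustrate, not a proposal; ept_p2m_type_to_flags() would need the
equivalent):

    case p2m_mmio_direct:
        if ( !rangeset_contains_singleton(mmio_ro_ranges, mfn_x(mfn)) )
            flags |= _PAGE_RW;
        else if ( !is_hardware_domain(p2m->domain) )
            /*
             * Sketch only: so far r/o MMIO ranges are only expected for
             * the hardware domain, so at least warn if a DomU ends up
             * with such a mapping.
             */
            printk(XENLOG_G_WARNING
                   "%pd: r/o MMIO mfn %" PRI_mfn " mapped\n",
                   p2m->domain, mfn_x(mfn));
        break;

Whether to just warn there or to refuse the mapping for DomUs
altogether is a separate question.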

I wonder why handling of mmio_ro_ranges was added to the HVM p2m code
in ~2010 then.  If mmio_ro_ranges is only supposed to be relevant for
the hardware domain, an HVM dom0 wasn't even in sight back in ~2010?

Sorry to ask so many questions, I'm a bit confused about how this
was/is supposed to work.

> >> mmio_ro_do_page_fault() was
> >> (and still is) invoked for the hardware domain only, and quite likely
> >> the need for handling (discarding) writes for PVHv1 had been overlooked
> >> until someone was hit by the lack thereof.
> > 
> > I see, I didn't realize r/o MMIO was only handled for the PV hardware
> > domain.  I could arguably do the same for HVM in
> > hvm_hap_nested_page_fault().
> > 
> > Not sure whether the subpage stuff is supposed to be functional for
> > domains other than the hardware domain?  It seems to be available to
> > the hardware domain only for PV guests, while for HVM it is available
> > for both PV and HVM domains:
> 
> DYM Dom0 and DomU here?

Indeed, sorry.

> > is_hardware_domain(currd) || subpage_mmio_write_accept(mfn, gla)
> > 
> > In hvm_hap_nested_page_fault().
> 
> See the three XHCI_SHARE_* modes. When it's XHCI_SHARE_ANY, even DomU-s
> would require this handling. It looks like a mistake that we permit the
> path to be taken for DomU-s even when the mode is XHCI_SHARE_HWDOM.

Arguably a domU will never get the device assigned in the first place
unless the share mode is set to XHCI_SHARE_ANY.  For the other modes
the device is hidden, and hence couldn't be assigned to a domU anyway.

> It
> also looks like a mistake that the PV path has remained Dom0-only, even
> in the XHCI_SHARE_ANY case. Cc-ing Marek once again ...
> 
> >>> I'm happy to look at other ways to handling this, but given there's
> >>> current logic for handling accesses to read-only regions in
> >>> hvm_hap_nested_page_fault() I think re-using that was the best way to
> >>> also handle accesses to MMIO read-only regions.
> >>>
> >>> Arguably it would already be the case that for other reasons Xen would
> >>> need to emulate an instruction that accesses a read-only MMIO region?
> >>
> >> Aiui hvm_translate_get_page() will yield HVMTRANS_bad_gfn_to_mfn for
> >> p2m_mmio_direct (after all, "direct" means we expect no emulation is
> >> needed; while arguably wrong for the introspection case, I'm not sure
> >> that and pass-through actually go together). Hence it's down to
> >> hvmemul_linear_mmio_access() -> hvmemul_phys_mmio_access() ->
> >> hvmemul_do_mmio_buffer() -> hvmemul_do_io_buffer() -> hvmemul_do_io(),
> >> which means that if hvm_io_intercept() can't handle it, the access
> >> will be forwarded to the responsible DM, or be "processed" by the
> >> internal null handler.
> >>
> >> Given this, perhaps what you do is actually fine. At the same time
> >> note how several functions in hvm/emulate.c simply fail upon
> >> encountering p2m_mmio_direct. These are all REP handlers though, so
> >> the main emulator would then try emulating the insn the non-REP way.
> > 
> > I'm open to alternative ways of handling such accesses, just used what
> > seemed more natural in the context of hvm_hap_nested_page_fault().
> > 
> > Emulation of r/o MMIO accesses failing wouldn't be an issue from Xen's
> > perspective; that would "just" result in the guest getting a #GP
> > injected.
> 
> That's not the part I'm worried about. What worries me is that we open up
> another (or better: we're widening a) way to hit the emulator in the first
> place. (Plus, as said, the issue with the not really tidy P2M type system.)

But the hit would be limited to domains having r/o p2m_mmio_direct
entries in the p2m, as otherwise the path would be unreachable?
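
To be more concrete, the shape of the check added to
hvm_hap_nested_page_fault() is roughly the following (simplified and
from memory, not the literal hunk):

    /*
     * Simplified illustration: the emulation path is only entered for
     * write faults hitting a p2m_mmio_direct entry covered by
     * mmio_ro_ranges, and even then only for the hardware domain or
     * when the subpage r/o infrastructure accepts the access.
     */
    if ( p2mt == p2m_mmio_direct && npfec.write_access &&
         rangeset_contains_singleton(mmio_ro_ranges, mfn_x(mfn)) &&
         (is_hardware_domain(currd) || subpage_mmio_write_accept(mfn, gla)) )
    {
        if ( !handle_mmio_with_translation(gla, gfn, npfec) )
            hvm_inject_hw_exception(X86_EXC_GP, 0);
        rc = 1;
        goto out_put_gfn;
    }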

> > Would you like me to add some of your reasoning above to the commit
> > message?
> 
> While I'd still be a little hesitant as to ack-ing of the result, I think
> that's all I'm really asking for, yes.

As said before, I'm happy to consider suggestions here; I don't want
to fix this with yet another bodge that will cause us further issues
down the road.  What I proposed seemed like the most natural way to
handle those accesses IMO, but I'm not an expert on the emulator.

Thanks, Roger.
