[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 2/2] x86/dom0: attempt to fixup p2m page-faults for PVH dom0
On Fri, Feb 14, 2025 at 12:53:01PM +0100, Jan Beulich wrote: > On 14.02.2025 10:29, Roger Pau Monne wrote: > > --- a/docs/misc/xen-command-line.pandoc > > +++ b/docs/misc/xen-command-line.pandoc > > @@ -822,7 +822,8 @@ Specify the bit width of the DMA heap. > > > > ### dom0 > > = List of [ pv | pvh, shadow=<bool>, verbose=<bool>, > > - cpuid-faulting=<bool>, msr-relaxed=<bool> ] (x86) > > + cpuid-faulting=<bool>, msr-relaxed=<bool>, > > + pf-fixup=<bool> ] (x86) > > > > = List of [ sve=<integer> ] (Arm64) > > > > @@ -883,6 +884,19 @@ Controls for how dom0 is constructed on x86 systems. > > > > If using this option is necessary to fix an issue, please report a bug. > > > > +* The `pf-fixup` boolean is only applicable when using a PVH dom0 and > > + defaults to false. > > + > > + When running dom0 in PVH mode the dom0 kernel has no way to map MMIO > > + regions into the p2m, such mode relies on Xen dom0 builder populating > > + the p2m with all MMIO regions that dom0 should access. However Xen > > + doesn't have a complete picture of the host memory map, due to not > > + being able to process ACPI dynamic tables. > > + > > + The `pf-fixup` option allows Xen to attempt to add missing MMIO regions > > + to the p2m in response to page-faults generated by dom0 trying to > > access > > + unpopulated entries in the p2m. > > I wonder if this is to implementation focused for a command line option doc. > In particular the multiple uses of "p2m" are standing out in this regard. Hm, let me try to change p2m with 'dom0 physical memory map' or similar. > > --- a/xen/arch/x86/dom0_build.c > > +++ b/xen/arch/x86/dom0_build.c > > @@ -286,6 +286,10 @@ int __init parse_arch_dom0_param(const char *s, const > > char *e) > > opt_dom0_cpuid_faulting = val; > > else if ( (val = parse_boolean("msr-relaxed", s, e)) >= 0 ) > > opt_dom0_msr_relaxed = val; > > +#ifdef CONFIG_HVM > > + else if ( (val = parse_boolean("pf-fixup", s, e)) >= 0 ) > > + opt_dom0_pf_fixup = val; > > +#endif > > else > > return -EINVAL; > > I fear the scope of these sub-options is getting increasingly confusing. > opt_dom0_msr_relaxed is what its name says - specific to Dom0. > opt_dom0_cpuid_faulting, otoh, is a control domain option (i.e. also > applicable to a [hypothetical?] late ctrldom). Now you add an option > that's applicable to the hardware domain, i.e. also coverting late-hwdom. It's kind of a mixed bag, but ATM I would leave it as-is because it's likely easier for users to find the options if they are grouped together. WE can always add more fine grained options if there's a desired for them. > > --- a/xen/arch/x86/hvm/emulate.c > > +++ b/xen/arch/x86/hvm/emulate.c > > @@ -338,8 +338,38 @@ static int hvmemul_do_io( > > if ( !s ) > > { > > if ( is_mmio && is_hardware_domain(currd) ) > > - gdprintk(XENLOG_DEBUG, "unhandled memory %s to %#lx size > > %u\n", > > - dir ? "read" : "write", addr, size); > > + { > > + /* > > + * PVH dom0 is likely missing MMIO mappings on the p2m, > > due to > > + * the incomplete information Xen has about the memory > > layout. > > + * > > + * Either print a message to note dom0 attempted to access > > an > > + * unpopulated GPA, or try to fixup the p2m by creating an > > + * identity mapping for the faulting GPA. > > + */ > > + if ( opt_dom0_pf_fixup ) > > + { > > + int inner_rc = hvm_hwdom_fixup_p2m(addr); > > Why not use rc, as we do elsewhere in the function? hvm_hwdom_fixup_p2m() returns an errno, while rc in this context contains X86EMUL_ values. I could indeed re-use rc, it just felt wrong to mix different error address spaces on the same variable. > > --- a/xen/arch/x86/hvm/hvm.c > > +++ b/xen/arch/x86/hvm/hvm.c > > @@ -13,6 +13,7 @@ > > #include <xen/lib.h> > > #include <xen/trace.h> > > #include <xen/sched.h> > > +#include <xen/iocap.h> > > #include <xen/irq.h> > > #include <xen/softirq.h> > > #include <xen/domain.h> > > @@ -5458,6 +5459,36 @@ int hvm_copy_context_and_params(struct domain *dst, > > struct domain *src) > > return rc; > > } > > > > +bool __ro_after_init opt_dom0_pf_fixup; > > +int hvm_hwdom_fixup_p2m(paddr_t addr) > > The placement here looks odd to me. Why not as static function in emulate.c? > Or alternatively why not as p2m_hwdom_fixup() in mm/p2m.c? I don't have a strong opinion, if you are fine with it a static function in emulate.c might be the best then. > > +{ > > + unsigned long gfn = paddr_to_pfn(addr); > > + struct domain *currd = current->domain; > > + p2m_type_t type; > > + mfn_t mfn; > > + int rc; > > + > > + ASSERT(is_hardware_domain(currd)); > > + ASSERT(!altp2m_active(currd)); > > + > > + /* > > + * Fixups are only applied for MMIO holes, and rely on the hardware > > domain > > + * having identity mappings for non RAM regions (gfn == mfn). > > + */ > > + if ( !iomem_access_permitted(currd, gfn, gfn) || > > + !is_memory_hole(_mfn(gfn), _mfn(gfn)) ) > > + return -EPERM; > > + > > + mfn = get_gfn(currd, gfn, &type); > > + if ( !mfn_eq(mfn, INVALID_MFN) || !p2m_is_hole(type) ) > > + rc = mfn_eq(mfn, _mfn(gfn)) ? 0 : -EEXIST; > > I understand this is to cover the case where two vCPU-s access the same GFN > at about the same time. However, the "success" log message at the call site > being debug-only means we may be silently hiding bugs in release builds, if > e.g. we get here despite the GFN having had an identity mapping already for > ages. Possibly, but what would be your suggestion to fix this? I will think about it, but I can't immediately see a solution that's not simply to make the message printed by the caller to be gprintk() instead of gdprintk() so catch such bugs. Would you agree to that? Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |