|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 2/2] x86/dom0: attempt to fixup p2m page-faults for PVH dom0
On 14.02.2025 13:38, Roger Pau Monné wrote:
> On Fri, Feb 14, 2025 at 12:53:01PM +0100, Jan Beulich wrote:
>> On 14.02.2025 10:29, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/hvm/emulate.c
>>> +++ b/xen/arch/x86/hvm/emulate.c
>>> @@ -338,8 +338,38 @@ static int hvmemul_do_io(
>>> if ( !s )
>>> {
>>> if ( is_mmio && is_hardware_domain(currd) )
>>> - gdprintk(XENLOG_DEBUG, "unhandled memory %s to %#lx size
>>> %u\n",
>>> - dir ? "read" : "write", addr, size);
>>> + {
>>> + /*
>>> + * PVH dom0 is likely missing MMIO mappings on the p2m,
>>> due to
>>> + * the incomplete information Xen has about the memory
>>> layout.
>>> + *
>>> + * Either print a message to note dom0 attempted to access
>>> an
>>> + * unpopulated GPA, or try to fixup the p2m by creating an
>>> + * identity mapping for the faulting GPA.
>>> + */
>>> + if ( opt_dom0_pf_fixup )
>>> + {
>>> + int inner_rc = hvm_hwdom_fixup_p2m(addr);
>>
>> Why not use rc, as we do elsewhere in the function?
>
> hvm_hwdom_fixup_p2m() returns an errno, while rc in this context
> contains X86EMUL_ values. I could indeed re-use rc, it just felt
> wrong to mix different error address spaces on the same variable.
Hmm, yes, I see.
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -13,6 +13,7 @@
>>> #include <xen/lib.h>
>>> #include <xen/trace.h>
>>> #include <xen/sched.h>
>>> +#include <xen/iocap.h>
>>> #include <xen/irq.h>
>>> #include <xen/softirq.h>
>>> #include <xen/domain.h>
>>> @@ -5458,6 +5459,36 @@ int hvm_copy_context_and_params(struct domain *dst,
>>> struct domain *src)
>>> return rc;
>>> }
>>>
>>> +bool __ro_after_init opt_dom0_pf_fixup;
>>> +int hvm_hwdom_fixup_p2m(paddr_t addr)
>>
>> The placement here looks odd to me. Why not as static function in emulate.c?
>> Or alternatively why not as p2m_hwdom_fixup() in mm/p2m.c?
>
> I don't have a strong opinion, if you are fine with it a static
> function in emulate.c might be the best then.
I'd be fine with either of the suggested options. mm/p2m.c is perhaps
the more logical home for such a function, yet the option of having it
static is quite appealing, too. Hence why I came to think of that one
first.
>>> +{
>>> + unsigned long gfn = paddr_to_pfn(addr);
>>> + struct domain *currd = current->domain;
>>> + p2m_type_t type;
>>> + mfn_t mfn;
>>> + int rc;
>>> +
>>> + ASSERT(is_hardware_domain(currd));
>>> + ASSERT(!altp2m_active(currd));
>>> +
>>> + /*
>>> + * Fixups are only applied for MMIO holes, and rely on the hardware
>>> domain
>>> + * having identity mappings for non RAM regions (gfn == mfn).
>>> + */
>>> + if ( !iomem_access_permitted(currd, gfn, gfn) ||
>>> + !is_memory_hole(_mfn(gfn), _mfn(gfn)) )
>>> + return -EPERM;
>>> +
>>> + mfn = get_gfn(currd, gfn, &type);
>>> + if ( !mfn_eq(mfn, INVALID_MFN) || !p2m_is_hole(type) )
>>> + rc = mfn_eq(mfn, _mfn(gfn)) ? 0 : -EEXIST;
>>
>> I understand this is to cover the case where two vCPU-s access the same GFN
>> at about the same time. However, the "success" log message at the call site
>> being debug-only means we may be silently hiding bugs in release builds, if
>> e.g. we get here despite the GFN having had an identity mapping already for
>> ages.
>
> Possibly, but what would be your suggestion to fix this? I will think
> about it, but I can't immediately see a solution that's not simply to
> make the message printed by the caller to be gprintk() instead of
> gdprintk() so catch such bugs. Would you agree to that?
My thinking was that it might be best to propagate a distinguishable error
code (perhaps -EEXIST, with its present use then replaced) out of the function,
and make the choice of gprintk() vs gdprintk() depend on that. Accompanied by a
comment explaining things a little.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |