[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/2] x86/dom0: attempt to fixup p2m page-faults for PVH dom0


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 17 Feb 2025 11:27:52 +0100
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>, Community Manager <community.manager@xxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 17 Feb 2025 10:27:54 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 17.02.2025 11:20, Roger Pau Monné wrote:
> On Mon, Feb 17, 2025 at 09:44:28AM +0100, Jan Beulich wrote:
>> On 17.02.2025 09:25, Roger Pau Monné wrote:
>>> On Fri, Feb 14, 2025 at 02:07:05PM +0100, Jan Beulich wrote:
>>>> On 14.02.2025 13:38, Roger Pau Monné wrote:
>>>>> On Fri, Feb 14, 2025 at 12:53:01PM +0100, Jan Beulich wrote:
>>>>>> On 14.02.2025 10:29, Roger Pau Monne wrote:
>>>>>>> +{
>>>>>>> +    unsigned long gfn = paddr_to_pfn(addr);
>>>>>>> +    struct domain *currd = current->domain;
>>>>>>> +    p2m_type_t type;
>>>>>>> +    mfn_t mfn;
>>>>>>> +    int rc;
>>>>>>> +
>>>>>>> +    ASSERT(is_hardware_domain(currd));
>>>>>>> +    ASSERT(!altp2m_active(currd));
>>>>>>> +
>>>>>>> +    /*
>>>>>>> +     * Fixups are only applied for MMIO holes, and rely on the 
>>>>>>> hardware domain
>>>>>>> +     * having identity mappings for non RAM regions (gfn == mfn).
>>>>>>> +     */
>>>>>>> +    if ( !iomem_access_permitted(currd, gfn, gfn) ||
>>>>>>> +         !is_memory_hole(_mfn(gfn), _mfn(gfn)) )
>>>>>>> +        return -EPERM;
>>>>>>> +
>>>>>>> +    mfn = get_gfn(currd, gfn, &type);
>>>>>>> +    if ( !mfn_eq(mfn, INVALID_MFN) || !p2m_is_hole(type) )
>>>>>>> +        rc = mfn_eq(mfn, _mfn(gfn)) ? 0 : -EEXIST;
>>>>>>
>>>>>> I understand this is to cover the case where two vCPU-s access the same 
>>>>>> GFN
>>>>>> at about the same time. However, the "success" log message at the call 
>>>>>> site
>>>>>> being debug-only means we may be silently hiding bugs in release builds, 
>>>>>> if
>>>>>> e.g. we get here despite the GFN having had an identity mapping already 
>>>>>> for
>>>>>> ages.
>>>>>
>>>>> Possibly, but what would be your suggestion to fix this?  I will think
>>>>> about it, but I can't immediately see a solution that's not simply to
>>>>> make the message printed by the caller to be gprintk() instead of
>>>>> gdprintk() so catch such bugs.  Would you agree to that?
>>>>
>>>> My thinking was that it might be best to propagate a distinguishable error
>>>> code (perhaps -EEXIST, with its present use then replaced) out of the 
>>>> function,
>>>> and make the choice of gprintk() vs gdprintk() depend on that. Accompanied 
>>>> by a
>>>> comment explaining things a little.
>>>
>>> I think it would be easier if I just made those gprintk() instead of
>>> gdprintk(), all with severity XENLOG_DEBUG except for the one that
>>> reports the failure of the fixup function that is XENLOG_WARNING.
>>> Would you be OK with that?
>>
>> Hmm. Okay-ish at best. Even if debug+guest-level messages are suppressed by
>> default, I think it wouldn't be nice if many of them might appear in release
>> builds with guest_loglevel=all. What I find difficult is to predict how high
>> the chances are to see any of them (and then possibly multiple times).
> 
> I think getting those messages even in non-debug builds might be
> helpful for debugging purposes.  Sometimes it's difficult for users to
> switch to a debug build of Xen if not provided by their upstream.
> 
> FWIW, on my Intel NUC I see three of those:
> 
> (XEN) [    5.418855] arch/x86/hvm/emulate.c:391:d0v0 fixup p2m mapping for 
> page fedc7 added
> (XEN) [    5.474574] arch/x86/hvm/emulate.c:391:d0v0 fixup p2m mapping for 
> page fd6a0 added
> (XEN) [    8.712784] arch/x86/hvm/emulate.c:391:d0v2 fixup p2m mapping for 
> page fe410 added

For my understanding: Did Xen with a PVH Dom0 not work on the NUC before? Or
else how come it survived without this fixing up of mappings?

> Would you be fine with this approach:
> 
> bool __ro_after_init opt_dom0_pf_fixup;
> static int hwdom_fixup_p2m(paddr_t addr)
> {
>     unsigned long gfn = paddr_to_pfn(addr);
>     struct domain *currd = current->domain;
>     p2m_type_t type;
>     mfn_t mfn;
>     int rc;
> 
>     ASSERT(is_hardware_domain(currd));
>     ASSERT(!altp2m_active(currd));
> 
>     /*
>      * Fixups are only applied for MMIO holes, and rely on the hardware domain
>      * having identity mappings for non RAM regions (gfn == mfn).
>      */
>     if ( !iomem_access_permitted(currd, gfn, gfn) ||
>          !is_memory_hole(_mfn(gfn), _mfn(gfn)) )
>         return -EPERM;
> 
>     mfn = get_gfn(currd, gfn, &type);
>     if ( !mfn_eq(mfn, INVALID_MFN) || !p2m_is_hole(type) )
>         rc = mfn_eq(mfn, _mfn(gfn)) ? -EEXIST : -ENOTEMPTY;
>     else
>         rc = set_mmio_p2m_entry(currd, _gfn(gfn), _mfn(gfn), 0);
>     put_gfn(currd, gfn);
> 
>     return rc;
> }
> [...]
>                     int inner_rc = hwdom_fixup_p2m(addr);
> 
>                     if ( !inner_rc || inner_rc == -EEXIST )
>                     {
>                         gdprintk(XENLOG_DEBUG,
>                                  "fixup p2m mapping for page %lx %s\n",
>                                  paddr_to_pfn(addr),
>                                  !inner_rc ? "added" : "already present");

As before, I think the "already present" message wants to be present also in
release build logs. As opposed to the "added" one. Yet at the same time, if
e.g. you and Andrew agree on the shape above, I won't stand in the way.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.