
Re: [PATCH v3] x86/hvmloader: select xenpci MMIO BAR UC or WB MTRR cache attribute



On Thu, Jun 05, 2025 at 06:16:59PM +0200, Roger Pau Monne wrote:
> The Xen PCI device (vendor ID 0x5853) exposed to x86 HVM guests doesn't
> have the functionality of a traditional PCI device.  The exposed MMIO BAR
> is used by some guests (including Linux) as a safe place to map foreign
> memory, including the grant table itself.
> 
> Traditionally BARs from devices have the uncacheable (UC) cache attribute
> from the MTRR, to ensure correct functionality of such devices.  hvmloader
> mimics this behavior and sets the MTRR attributes of both the low and high
> PCI MMIO windows (where BARs of PCI devices reside) as UC in MTRR.
> 
> This however causes performance issues for users of the Xen PCI device BAR,
> as for the purposes of mapping remote memory there's no need to use the UC
> attribute.  On Intel systems this is worked around by using iPAT, that
> allows the hypervisor to force the effective cache attribute of a p2m entry
> regardless of the guest PAT value.  AMD however doesn't have an equivalent
> of iPAT, and guest PAT values are always considered.
> 
> Linux commit:
> 
> 41925b105e34 xen: replace xen_remap() with memremap()
> 
> Attempted to mitigate this by forcing mappings of the grant-table to use
> the write-back (WB) cache attribute.  However Linux memremap() takes MTRRs
> into account to calculate which PAT type to use, and seeing the MTRR cache
> attribute for the region being UC the PAT also ends up as UC, regardless of
> the caller having requested WB.
> 
> As a workaround to allow current Linux to map the grant-table as WB using
> memremap() introduce an xl.cfg option (xenpci_bar_uc=0) that can be used to
> select whether the Xen PCI device BAR will have the UC attribute in MTRR.
> Such workaround in hvmloader should also be paired with a fix for Linux so
> it attempts to change the MTRR of the Xen PCI device BAR to WB by itself.
> 
> Overall, the long term solution would be to provide the guest with a safe
> range in the guest physical address space where mappings to foreign pages
> can be created.
> 
> Some vif throughput performance figures provided by Anthoine from 8 vCPU,
> 4GB RAM HVM guest(s) running on AMD hardware:
> 
> Without this patch:
> vm -> dom0: 1.1Gb/s
> vm -> vm:   5.0Gb/s
> 
> With the patch:
> vm -> dom0: 4.5Gb/s
> vm -> vm:   7.0Gb/s
> 
> Reported-by: Anthoine Bourgeois <anthoine.bourgeois@xxxxxxxxxx>
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> ---
> Changes since v2:
>  - Add default value in xl.cfg.
>  - List xenstore path in the pandoc file.
>  - Adjust comment in hvmloader.
>  - Fix commit message MIO -> MMIO.
> 
> Changes since v1:
>  - Leave the xenpci BAR as UC by default.
>  - Introduce an option to not set it as UC.
> ---
>  docs/man/xl.cfg.5.pod.in                |  8 ++++
>  docs/misc/xenstore-paths.pandoc         |  5 +++
>  tools/firmware/hvmloader/config.h       |  2 +-
>  tools/firmware/hvmloader/pci.c          | 49 ++++++++++++++++++++++++-
>  tools/firmware/hvmloader/util.c         |  2 +-
>  tools/include/libxl.h                   |  9 +++++
>  tools/libs/light/libxl_create.c         |  1 +
>  tools/libs/light/libxl_dom.c            |  9 +++++
>  tools/libs/light/libxl_types.idl        |  1 +
>  tools/xl/xl_parse.c                     |  2 +
>  xen/include/public/hvm/hvm_xs_strings.h |  2 +
>  11 files changed, 86 insertions(+), 4 deletions(-)
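
To illustrate the memremap() behaviour described in the commit message, here
is a deliberately simplified model in C.  It is not the kernel code: the enum,
the function names and the reduction to three cache types are all assumptions
made for the sketch.  It only captures the point that a WB request is honoured
solely when the MTRR type covering the range is also WB.

```c
#include <assert.h>

/* Hypothetical, reduced set of cache types; not the kernel's real enums. */
enum cache_type { CT_UC, CT_WC, CT_WB };

/*
 * Simplified model: a WB request only yields WB if the MTRR type for the
 * range is WB as well, otherwise the mapping ends up uncached.  Any other
 * requested type is passed through unchanged.
 */
static enum cache_type effective_type(enum cache_type requested,
                                      enum cache_type mtrr)
{
    if (requested == CT_WB && mtrr != CT_WB)
        return CT_UC;

    return requested;
}

/* Usage example covering the two cases relevant to the xenpci BAR. */
static void check_model(void)
{
    /* BAR covered by a UC MTRR: the WB request is downgraded. */
    assert(effective_type(CT_WB, CT_UC) == CT_UC);

    /* BAR covered by a WB MTRR: the WB request is honoured. */
    assert(effective_type(CT_WB, CT_WB) == CT_WB);
}
```

Under this model, leaving the BAR out of the UC MTRR range (xenpci_bar_uc=0)
is what lets the guest's WB request for the grant table take effect.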

I've noticed this is missing a changelog entry.  I propose the
following:

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 1ee2f42e7405..23215a8cc1e6 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -15,6 +15,9 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
  - On x86:
    - Restrict the cache flushing done as a result of guest physical memory map
      manipulations and memory type changes.
+   - Allow controlling the MTRR cache attribute of the Xen PCI device BAR
+     for HVM guests, to improve performance of guests using it to map the grant
+     table or foreign memory.
 
 ### Added
  - On x86:

I can fold this into the patch if Oleksii and others agree.

Thanks, Roger.



 

