Re: [PATCH v3] x86/hvmloader: select xenpci MMIO BAR UC or WB MTRR cache attribute
On Thu, Jun 05, 2025 at 06:16:59PM +0200, Roger Pau Monne wrote:
> The Xen PCI device (vendor ID 0x5853) exposed to x86 HVM guests doesn't
> have the functionality of a traditional PCI device. The exposed MMIO BAR
> is used by some guests (including Linux) as a safe place to map foreign
> memory, including the grant table itself.
>
> Traditionally BARs from devices have the uncacheable (UC) cache attribute
> from the MTRR, to ensure correct functionality of such devices. hvmloader
> mimics this behavior and sets the MTRR attributes of both the low and high
> PCI MMIO windows (where BARs of PCI devices reside) as UC in MTRR.
>
> This however causes performance issues for users of the Xen PCI device BAR,
> as for the purposes of mapping remote memory there's no need to use the UC
> attribute. On Intel systems this is worked around by using iPAT, that
> allows the hypervisor to force the effective cache attribute of a p2m entry
> regardless of the guest PAT value. AMD however doesn't have an equivalent
> of iPAT, and guest PAT values are always considered.
>
> Linux commit:
>
> 41925b105e34 xen: replace xen_remap() with memremap()
>
> Attempted to mitigate this by forcing mappings of the grant-table to use
> the write-back (WB) cache attribute. However Linux memremap() takes MTRRs
> into account to calculate which PAT type to use, and seeing the MTRR cache
> attribute for the region being UC the PAT also ends up as UC, regardless of
> the caller having requested WB.
>
> As a workaround to allow current Linux to map the grant-table as WB using
> memremap() introduce an xl.cfg option (xenpci_bar_uc=0) that can be used to
> select whether the Xen PCI device BAR will have the UC attribute in MTRR.
> Such workaround in hvmloader should also be paired with a fix for Linux so
> it attempts to change the MTRR of the Xen PCI device BAR to WB by itself.
>
> Overall, the long term solution would be to provide the guest with a safe
> range in the guest physical address space where mappings to foreign pages
> can be created.
>
> Some vif throughput performance figures provided by Anthoine from a 8
> vCPUs, 4GB of RAM HVM guest(s) running on AMD hardware:
>
> Without this patch:
> vm -> dom0: 1.1Gb/s
> vm -> vm: 5.0Gb/s
>
> With the patch:
> vm -> dom0: 4.5Gb/s
> vm -> vm: 7.0Gb/s
>
> Reported-by: Anthoine Bourgeois <anthoine.bourgeois@xxxxxxxxxx>
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> ---
> Changes since v2:
> - Add default value in xl.cfg.
> - List xenstore path in the pandoc file.
> - Adjust comment in hvmloader.
> - Fix commit message MIO -> MMIO.
>
> Changes since v1:
> - Leave the xenpci BAR as UC by default.
> - Introduce an option to not set it as UC.
> ---
> docs/man/xl.cfg.5.pod.in                |  8 ++++
> docs/misc/xenstore-paths.pandoc         |  5 +++
> tools/firmware/hvmloader/config.h       |  2 +-
> tools/firmware/hvmloader/pci.c          | 49 ++++++++++++++++++++++++-
> tools/firmware/hvmloader/util.c         |  2 +-
> tools/include/libxl.h                   |  9 +++++
> tools/libs/light/libxl_create.c         |  1 +
> tools/libs/light/libxl_dom.c            |  9 +++++
> tools/libs/light/libxl_types.idl        |  1 +
> tools/xl/xl_parse.c                     |  2 +
> xen/include/public/hvm/hvm_xs_strings.h |  2 +
> 11 files changed, 86 insertions(+), 4 deletions(-)

I've noticed this is missing a changelog entry, I propose the following:

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 1ee2f42e7405..23215a8cc1e6 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -15,6 +15,9 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
  - On x86:
    - Restrict the cache flushing done as a result of guest physical memory map
      manipulations and memory type changes.
+   - Allow controlling the MTRR cache attribute of the Xen PCI device BAR
+     for HVM guests, to improve performance of guests using it to map the grant
+     table or foreign memory.
 
 ### Added
  - On x86:

I can fold into the patch if Oleksii and others agree.

Thanks, Roger.
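
For readers following the quoted commit message, the interaction it describes between the MTRR type of the BAR and a WB request made through memremap() can be sketched roughly as below. This is a simplified model under stated assumptions, not the Linux or hvmloader code: the enum, the function names and the two-value type set are hypothetical, and the only rule encoded is the one the message states (a WB request over a range whose MTRR type is UC ends up as UC).

```c
/*
 * Illustrative model only: names are hypothetical and do not exist in
 * Linux or Xen. Real memory typing has more types than UC and WB.
 */
enum cache_type { CACHE_UC, CACHE_WB };

/*
 * Cache type a mapping ends up with, given the MTRR type covering the
 * range and the type the caller asked for. A WB request is only
 * honoured when the MTRR type is also WB; otherwise it degrades to UC.
 */
static enum cache_type resulting_type(enum cache_type mtrr,
                                      enum cache_type requested)
{
    if (requested == CACHE_WB && mtrr != CACHE_WB)
        return CACHE_UC;
    return requested;
}

/* Human-readable name, for logging or debugging output. */
static const char *cache_type_name(enum cache_type t)
{
    return t == CACHE_WB ? "WB" : "UC";
}
```

With the default (BAR covered by a UC MTRR) `resulting_type(CACHE_UC, CACHE_WB)` yields UC, which is the slow case from the figures above. With `xenpci_bar_uc=0` the BAR is not marked UC in MTRR, which in this model corresponds to `resulting_type(CACHE_WB, CACHE_WB)`, and the WB request goes through.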