[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3] x86/hvmloader: select xenpci MMIO BAR UC or WB MTRR cache attribute
On Fri, Jun 06, 2025 at 01:00:19PM +0000, Tu Dinh wrote: > Hi Roger, > > On 05/06/2025 18:20, Roger Pau Monne wrote: > > The Xen PCI device (vendor ID 0x5853) exposed to x86 HVM guests doesn't > > have the functionality of a traditional PCI device. The exposed MMIO BAR > > is used by some guests (including Linux) as a safe place to map foreign > > memory, including the grant table itself. > > > > Traditionally BARs from devices have the uncacheable (UC) cache attribute > > from the MTRR, to ensure correct functionality of such devices. hvmloader > > mimics this behavior and sets the MTRR attributes of both the low and high > > PCI MMIO windows (where BARs of PCI devices reside) as UC in MTRR. > > > > This however causes performance issues for users of the Xen PCI device BAR, > > as for the purposes of mapping remote memory there's no need to use the UC > > attribute. On Intel systems this is worked around by using iPAT, that > > allows the hypervisor to force the effective cache attribute of a p2m entry > > regardless of the guest PAT value. AMD however doesn't have an equivalent > > of iPAT, and guest PAT values are always considered. > > > > Linux commit: > > > > 41925b105e34 xen: replace xen_remap() with memremap() > > > > Attempted to mitigate this by forcing mappings of the grant-table to use > > the write-back (WB) cache attribute. However Linux memremap() takes MTRRs > > into account to calculate which PAT type to use, and seeing the MTRR cache > > attribute for the region being UC the PAT also ends up as UC, regardless of > > the caller having requested WB. > > > > As a workaround to allow current Linux to map the grant-table as WB using > > memremap() introduce an xl.cfg option (xenpci_bar_uc=0) that can be used to > > select whether the Xen PCI device BAR will have the UC attribute in MTRR. > > Such workaround in hvmloader should also be paired with a fix for Linux so > > it attempts to change the MTRR of the Xen PCI device BAR to WB by itself. > > > > Overall, the long term solution would be to provide the guest with a safe > > range in the guest physical address space where mappings to foreign pages > > can be created. > > > > Some vif throughput performance figures provided by Anthoine from a 8 > > vCPUs, 4GB of RAM HVM guest(s) running on AMD hardware: > > > > Without this patch: > > vm -> dom0: 1.1Gb/s > > vm -> vm: 5.0Gb/s > > > > With the patch: > > vm -> dom0: 4.5Gb/s > > vm -> vm: 7.0Gb/s > > > > Reported-by: Anthoine Bourgeois <anthoine.bourgeois@xxxxxxxxxx> > > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > > --- > > Changes since v2: > > - Add default value in xl.cfg. > > - List xenstore path in the pandoc file. > > - Adjust comment in hvmloader. > > - Fix commit message MIO -> MMIO. > > > > Changes since v1: > > - Leave the xenpci BAR as UC by default. > > - Introduce an option to not set it as UC. > > --- > > docs/man/xl.cfg.5.pod.in | 8 ++++ > > docs/misc/xenstore-paths.pandoc | 5 +++ > > tools/firmware/hvmloader/config.h | 2 +- > > tools/firmware/hvmloader/pci.c | 49 ++++++++++++++++++++++++- > > tools/firmware/hvmloader/util.c | 2 +- > > tools/include/libxl.h | 9 +++++ > > tools/libs/light/libxl_create.c | 1 + > > tools/libs/light/libxl_dom.c | 9 +++++ > > tools/libs/light/libxl_types.idl | 1 + > > tools/xl/xl_parse.c | 2 + > > xen/include/public/hvm/hvm_xs_strings.h | 2 + > > 11 files changed, 86 insertions(+), 4 deletions(-) > > > > diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in > > index c388899306c2..ddbff6fffc16 100644 > > --- a/docs/man/xl.cfg.5.pod.in > > +++ b/docs/man/xl.cfg.5.pod.in > > @@ -2351,6 +2351,14 @@ Windows > > L<https://xenproject.org/windows-pv-drivers/>. > > Setting B<xen_platform_pci=0> with the default device_model "qemu-xen" > > requires at least QEMU 1.6. > > > > + > > +=item B<xenpci_bar_uc=BOOLEAN> > > + > > +B<x86 only:> Select whether the memory BAR of the Xen PCI device should > > have > > +uncacheable (UC) cache attribute set in MTRR. > > + > > +Default is B<true>. > > + > > =item B<viridian=[ "GROUP", "GROUP", ...]> or B<viridian=BOOLEAN> > > > > The groups of Microsoft Hyper-V (AKA viridian) compatible enlightenments > > diff --git a/docs/misc/xenstore-paths.pandoc > > b/docs/misc/xenstore-paths.pandoc > > index 01a340fafcbe..073bed91eec1 100644 > > --- a/docs/misc/xenstore-paths.pandoc > > +++ b/docs/misc/xenstore-paths.pandoc > > @@ -234,6 +234,11 @@ These xenstore values are used to override some of the > > default string > > values in the SMBIOS table constructed in hvmloader. See the SMBIOS > > table specification at http://www.dmtf.org/standards/smbios/ > > > > +#### ~/hvmloader/pci/xenpci-bar-uc = ("1"|"0") [HVM,INTERNAL] > > + > > +Select whether the Xen PCI device MMIO BAR will have the uncacheable cache > > +attribute set in the MTRRs by hvmloader. > > + > > #### ~/bios-strings/oem-* = STRING [HVM,INTERNAL] > > > > 1 to 99 OEM strings can be set in xenstore using values of the form > > diff --git a/tools/firmware/hvmloader/config.h > > b/tools/firmware/hvmloader/config.h > > index 6e1da137d779..c159db30eea9 100644 > > --- a/tools/firmware/hvmloader/config.h > > +++ b/tools/firmware/hvmloader/config.h > > @@ -58,7 +58,7 @@ extern uint32_t *cpu_to_apicid; > > #define ACPI_TIS_HDR_ADDRESS 0xFED40F00UL > > > > extern uint32_t pci_mem_start; > > -extern const uint32_t pci_mem_end; > > +extern uint32_t pci_mem_end; > > extern uint64_t pci_hi_mem_start, pci_hi_mem_end; > > > > extern bool acpi_enabled; > > diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c > > index cc67b18c0361..747f6cfb6794 100644 > > --- a/tools/firmware/hvmloader/pci.c > > +++ b/tools/firmware/hvmloader/pci.c > > @@ -30,7 +30,7 @@ > > #include <xen/hvm/e820.h> > > > > uint32_t pci_mem_start = HVM_BELOW_4G_MMIO_START; > > -const uint32_t pci_mem_end = RESERVED_MEMBASE; > > +uint32_t pci_mem_end = RESERVED_MEMBASE; > > uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0; > > > > /* > > @@ -116,6 +116,8 @@ void pci_setup(void) > > * experience the memory relocation bug described below. > > */ > > bool allow_memory_relocate = 1; > > + /* Select the MTRR cache attribute of the xenpci device BAR. */ > > + bool xenpci_bar_uc = false; > > Since this is meant to be a workaround, I wonder if it makes more sense > to flip the setting (`xenpci_bar_wb`) and make it 0 by default? I originally didn't want to go that route, because while it's true that the default MTRR type is set to WB, and so any memory not covered by a MTRR range will default to that memory type I got the impression this was inferring too much. Overall my intention would be for inverting the default long term, and libxl setting build_info->u.hvm.xenpci_bar_uc = false by default, which then makes all the naming nicer IMO. > It also > simplifies the logic for both hvmloader and the consumer (no need for > double negatives). I don't think there are double negatives? That would happen if the variable was named xenpci_bar_no_uc or similar? Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |