Xen project Mailing List

Re: [Xen-devel] [Bug] Intel RMRR support with upstream Qemu

To: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>

From: Alexey G <x1917x@xxxxxxxxx>

Date: Wed, 26 Jul 2017 02:40:51 +1000

Cc: "Zhang, Xiong Y" <xiong.y.zhang@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>

Delivery-date: Tue, 25 Jul 2017 16:41:09 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Mon, 24 Jul 2017 21:39:08 +0100
Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> wrote:

> > But, the problem is that overall MMIO hole(s) requirements are not known
> > exactly at the time the HVM domain being created. Some PCI devices will
> > be emulated, some will be merely passed through and yet there will be
> > some RMRR ranges. libxl can't know all this stuff - some comes from the
> > host, some comes from DM. So actual MMIO requirements are known to
> > hvmloader at the PCI bus enumeration time.
> >
>
> IMO hvmloader shouldn't really be allowed to relocate memory under any
> conditions. As Andrew said it's much easier to provision the hole
> statically in libxl during domain construction process and it doesn't
> really compromise any functionality. Having one more entity responsible
> for guest memory layout only makes things more convoluted.

If moving most tasks of hvmloader to libxl is a planned feature in Citrix,
please let it be discussed on xen-devel first as it may affect many
people... and not all of them might be happy. :)

(tons of IMO and TLDR ahead, be warned)

Moving PCI BAR allocation from the guest side to libxl is a controversial
step. This may in fact be the architecturally wrong way. There are
properties and areas of responsibility. Among the primary responsibilities
of the guest's firmware are PCI BAR allocation and MMIO hole sizing. That's
the guest's territory. The guest relocates PCI BARs (and it is not only the
BIOS that is able to do this), and the guest firmware relocates the MMIO
hole base for them. If it were a real system, all tasks like PCI BAR
allocation, remapping part of RAM above 4G etc. would be done by the system
BIOS. In our case some of the SeaBIOS/OVMF responsibilities were offloaded
to hvmloader, like PCI BAR allocation, sizing the MMIO hole(s) for them and
generating ACPI tables. And that's ok, as hvmloader can be considered
merely a 'supplemental' firmware that performs some tasks of SeaBIOS/OVMF
before passing control to them.
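
To make the "remapping part of RAM above 4G" step concrete, here is a minimal arithmetic sketch. It is not hvmloader code: it assumes a single low MMIO hole that ends exactly at 4 GiB, and `split_ram` with its parameters is a name invented for this illustration.

```python
GIB = 1 << 30
FOUR_GIB = 4 * GIB


def split_ram(total_ram, hole_size):
    """Split guest RAM around a low MMIO hole that ends at 4 GiB.

    Hypothetical helper for illustration only.
    Returns (low_ram, hole_base, high_ram), all in bytes:
      low_ram   - RAM that stays below the hole, starting at address 0
      hole_base - first address of the MMIO hole
      high_ram  - RAM that no longer fits below the hole and has to be
                  remapped to start at 4 GiB
    """
    if not 0 < hole_size < FOUR_GIB:
        raise ValueError("hole size must be between 0 and 4 GiB")
    hole_base = FOUR_GIB - hole_size
    low_ram = min(total_ram, hole_base)
    high_ram = total_ram - low_ram
    return low_ram, hole_base, high_ram


if __name__ == "__main__":
    low, base, high = split_ram(6 * GIB, 1 * GIB)
    print(f"low RAM   : 0x0 .. {low:#x}")
    print(f"MMIO hole : {base:#x} .. {FOUR_GIB:#x}")
    print(f"high RAM  : {FOUR_GIB:#x} .. {FOUR_GIB + high:#x}")
```

Under these assumptions, growing the hole lowers `hole_base`, so any RAM that used to sit between the new and the old base has to move above 4 GiB. That move is the relocation being argued about in this thread.
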
Offloading these tasks to hvmloader has some architectural logic at least
and doesn't look bad.

On the other hand, moving the PCI hole calculation to libxl just to let
Xen/libxl know what the MMIO size value is might be a bad idea.
Aside from some code duplication, straying too far from the real HW paths,
or breaking existing (or future) interfaces, this might have some other
negative consequences. E.g., who will initialize the guest's ACPI tables if
only libxl knows the memory layout? Some new interface between libxl and
hvmloader, just to let the latter know what values to write to the ACPI
tables being created? Or will libxl initialize the guest's ACPI tables as
well (another of the guest's internal tasks)? Similar concerns apply to the
construction of the guest's final E820 map.

Another thing is that handling ioreq/PT MMIO ranges is somewhat a property
of the device model (at least for now). Right now it's QEMU that traps PCI
BAR accesses and tells Xen how to handle specific ranges of MMIO space. If
QEMU already tells Xen which ranges should be passed through or trapped --
it can tell it the current overall MMIO limits as well... or handle these
limits itself -- if the MMIO hole range check is all that is required to
avoid misuse of MMIO space, this check can be easily implemented in QEMU,
provided that QEMU knows what the memory/MMIO layout is. There is a lot of
implementation freedom in where to place restrictions and checks, Xen or
QEMU.

Strictly speaking, the MMIO hole itself can be considered a property of the
emulated machine and may have implementation differences for different
emulated chipsets. For example, the real i440 NB has no notion of a high
MMIO hole at all.

We already have a sort of an interface between hvmloader and QEMU --
hvmloader has to do basic initialization of some of the emulated chipset's
registers (and this depends on the machine).
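
The MMIO hole range check mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not QEMU or Xen code: the low hole is taken to run from a known base up to 4 GiB, the high hole is optional, and `in_mmio_hole` and all of its parameters are hypothetical names.

```python
FOUR_GIB = 1 << 32


def in_mmio_hole(addr, size, low_base, high_base=None, high_end=None):
    """Return True if [addr, addr + size) lies entirely inside one MMIO hole.

    Hypothetical helper for illustration only.
    low_base  - start of the low hole, which is assumed to end at 4 GiB
    high_base - start of the optional high hole (above 4 GiB)
    high_end  - end (exclusive) of the optional high hole
    """
    if size <= 0:
        return False
    end = addr + size
    # Low hole: [low_base, 4 GiB)
    if low_base <= addr and end <= FOUR_GIB:
        return True
    # High hole, only if the emulated machine has one
    if high_base is not None and high_end is not None:
        return high_base <= addr and end <= high_end
    return False


if __name__ == "__main__":
    low_base = 0xC0000000
    print(in_mmio_hole(0xE0000000, 0x01000000, low_base))  # inside the low hole
    print(in_mmio_hole(0x80000000, 0x00001000, low_base))  # overlaps guest RAM
```

A range that straddles the 4 GiB boundary is rejected, since it lies fully inside neither hole. Passing no high-hole limits models a machine that has no high MMIO hole at all.
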
Providing additional handling for a few other registers (TOM/TOLUD/etc.)
will cost almost nothing, and the purpose of these registers will actually
match their usage in real HW. This way we can use an existing available
interface and don't stray too far from the real HW ways.

I want to try this approach for the Q35 bringup patches for Xen I'm
currently working on. I'll send these patches as an RFC and will be glad to
receive some constructive criticism.

> > libxl can be taught to retrieve all missing info from QEMU, but this way
> > will require to perform all grunt work of PCI BARs allocation in libxl
> > itself - in order to calculate the real MMIO hole(s) size, one needs to
> > take into account all PCI BARs sizes and their alignment requirements
> > diversity + existing gaps due to RMRR ranges... basically, libxl will
> > need to do most of hvmloader/pci.c's job.
> >
>
> The algorithm implemented in hvmloader for that is not complicated and
> can be moved to libxl easily. What we can do is to provision a hole big
> enough to include all the initially assigned PCI devices. We can also
> account for emulated MMIO regions if necessary. But, to be honest, it
> doesn't really matter since if there is no enough space in lower MMIO
> hole for some BARs - they can be easily relocated to upper MMIO
> hole by hvmloader or the guest itself (dynamically).
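
To illustrate why BAR sizes, alignment and RMRR gaps all feed into the hole size, here is a simplified placement sketch. It is not the algorithm from hvmloader/pci.c. It assumes power-of-two BAR sizes with natural alignment, a low hole ending at 4 GiB, and reserved (RMRR-like) ranges given as (start, end) pairs. `place_bars` and `align_up` are hypothetical names.

```python
FOUR_GIB = 1 << 32


def align_up(value, alignment):
    """Round value up to a multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def place_bars(bar_sizes, hole_base, reserved=(), hole_end=FOUR_GIB):
    """Place BARs in [hole_base, hole_end), largest first.

    Hypothetical helper for illustration only.
    reserved is an iterable of (start, end) ranges that must be skipped.
    Returns (placed, unplaced): placed is a list of (size, address),
    unplaced is a list of sizes that did not fit in this hole.
    """
    reserved = sorted(reserved)
    placed, unplaced = [], []
    cursor = hole_base
    for size in sorted(bar_sizes, reverse=True):
        if size <= 0 or size & (size - 1):
            raise ValueError("BAR sizes must be powers of two")
        addr = align_up(cursor, size)
        # Step past every reserved range the candidate overlaps.
        moved = True
        while moved:
            moved = False
            for r_start, r_end in reserved:
                if addr < r_end and r_start < addr + size:
                    addr = align_up(r_end, size)
                    moved = True
        if addr + size <= hole_end:
            placed.append((size, addr))
            cursor = addr + size  # space skipped before addr is not reused
        else:
            unplaced.append(size)
    return placed, unplaced


if __name__ == "__main__":
    sizes = [0x10000000, 0x01000000, 0x1000]   # 256 MiB, 16 MiB, 4 KiB
    rmrr = [(0xF0000000, 0xF0100000)]          # made-up 1 MiB reserved range
    placed, unplaced = place_bars(sizes, 0xE0000000, rmrr)
    for size, addr in placed:
        print(f"BAR {size:#x} at {addr:#x}")
    print("did not fit:", [hex(s) for s in unplaced])
```

In this sketch a single 1 MiB reserved range pushes the 16 MiB BAR up to the next 16 MiB boundary, so the space actually consumed is larger than the sum of the BAR sizes. BARs returned in `unplaced` would be the candidates for the upper MMIO hole that the quoted text refers to.
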
