Re: [Xen-devel] [Bug] Intel RMRR support with upstream Qemu

> On 24/07/17 17:42, Alexey G wrote:
> > Hi,
> >
> > On Mon, 24 Jul 2017 10:53:16 +0100
> > Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> wrote:
> >>> [Zhang, Xiong Y] Thanks for your suggestion.
> >>> Indeed, if I set mmio_hole >= 4G - RMRR_Base, this could fix my issue.
> >>> I still have two questions about this; could you help me?
> >>> 1) If hvmloader does low memory relocation, hvmloader and QEMU will see
> >>> different guest memory layouts, so QEMU RAM may overlap with MMIO.
> >>> Does Xen have a plan to fix this?
> >>
> >> hvmloader doesn't do memory relocation - this ability is turned off by
> >> default. The reason for the issue is that libxl initially sets the size
> >> of lower MMIO hole (based on the RMRR regions present and their size)
> >> and doesn't communicate it to QEMU using 'max-ram-below-4g' argument.
> >>
> >> When you set the 'mmio_hole' size parameter you basically force libxl to
> >> pass this argument to QEMU.
> >>
> >> That means the proper fix would be to make libxl pass this argument
> >> to QEMU in case there are RMRR regions present.
> >
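
[Editor's sketch] The relationship described above can be illustrated as follows. This is a minimal sketch, not libxl's code: it assumes the lower MMIO hole always ends at 4 GiB, so the value handed to QEMU's 'max-ram-below-4g' machine property is simply 4 GiB minus the hole size. The function name is hypothetical.

```python
FOUR_GIB = 1 << 32


def max_ram_below_4g_arg(mmio_hole_bytes):
    """Build the 'max-ram-below-4g' property for a given lower MMIO hole size.

    Assumption: the hole occupies [4 GiB - mmio_hole_bytes, 4 GiB), so guest
    RAM below 4 GiB must stop where the hole starts.
    """
    if not 0 < mmio_hole_bytes < FOUR_GIB:
        raise ValueError("hole size must be between 0 and 4 GiB")
    return f"max-ram-below-4g={FOUR_GIB - mmio_hole_bytes}"


# Example: an RMRR at 0x80000000 needs a hole of at least 4G - RMRR_Base = 2 GiB.
rmrr_base = 0x80000000
print(max_ram_below_4g_arg(FOUR_GIB - rmrr_base))
```

The resulting string would be appended to the machine type on the QEMU command line (for example after `-machine pc,`); how libxl assembles the rest of that argument is outside this sketch.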
> > I tend to disagree a bit.
> > What we lack actually is some way to perform a 'dynamic' physmem
> > relocation, when a guest domain is running already. Right now it works only
> > in the 'static' way - i.e. if memory layout was known for both QEMU and
> > hvmloader before starting a guest domain and with no means of arbitrarily
> > changing this layout at runtime when hvmloader runs.
> >
> > But, the problem is that overall MMIO hole(s) requirements are not known
> > exactly at the time the HVM domain is being created. Some PCI devices will
> > be emulated, some will be merely passed through and yet there will be some
> > RMRR ranges. libxl can't know all this stuff - some comes from the host,
> > some comes from DM. So actual MMIO requirements are known to hvmloader at
> > the PCI bus enumeration time.
> >
> IMO hvmloader shouldn't really be allowed to relocate memory under any
> conditions. As Andrew said it's much easier to provision the hole
> statically in libxl during domain construction process and it doesn't
> really compromise any functionality. Having one more entity responsible
> for guest memory layout only makes things more convoluted.
> > libxl can be taught to retrieve all missing info from QEMU, but this way
> > will require to perform all grunt work of PCI BARs allocation in libxl
> > itself - in order to calculate the real MMIO hole(s) size, one needs to
> > take into account all PCI BARs sizes and their alignment requirements
> > diversity + existing gaps due to RMRR ranges... basically, libxl will
> > need to do most of hvmloader/pci.c's job.
> >
> The algorithm implemented in hvmloader for that is not complicated and
> can be moved to libxl easily. What we can do is to provision a hole big
> enough to include all the initially assigned PCI devices. We can also
> account for emulated MMIO regions if necessary. But, to be honest, it
> doesn't really matter since if there is not enough space in lower MMIO
> hole for some BARs - they can be easily relocated to upper MMIO
> hole by hvmloader or the guest itself (dynamically).
> Igor
[Zhang, Xiong Y] Yes, if we could supply a big enough MMIO hole and not allow
hvmloader to relocate memory, things would be easier. But how could we supply a
big enough MMIO hole?
a. Statically set the base address of the MMIO hole to 2G/3G.
b. Like hvmloader, probe all the PCI devices and calculate the MMIO size. But
this runs prior to QEMU, so how do we probe the PCI devices?
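
[Editor's sketch] To make option (b) concrete, here is a rough illustration of sizing the lower MMIO hole once BAR sizes and RMRR ranges are known. It is not hvmloader's pci.c algorithm. It assumes the hole ends exactly at 4 GiB (ignoring the fixed regions just below 4 GiB that a real platform reserves), that every BAR is a power of two and naturally aligned, and that BARs are placed largest-first from the top down, skipping RMRR ranges. All names are hypothetical.

```python
FOUR_GIB = 1 << 32


def size_lower_mmio_hole(bar_sizes, rmrr_ranges=()):
    """Return (hole_size, placements) for a hole that ends at 4 GiB.

    bar_sizes:   BAR sizes in bytes, each a power of two.
    rmrr_ranges: (base, end) pairs below 4 GiB, end exclusive.
    """
    top = FOUR_GIB
    placements = []
    for size in sorted(bar_sizes, reverse=True):
        if size <= 0 or size & (size - 1):
            raise ValueError("BAR sizes must be powers of two")
        # Highest naturally aligned slot below the previous placement.
        base = (top - size) & ~(size - 1)
        moved = True
        while moved:
            moved = False
            for r_base, r_end in rmrr_ranges:
                if base < r_end and r_base < base + size:
                    # Slot overlaps an RMRR: drop below it, keeping alignment.
                    base = (r_base - size) & ~(size - 1)
                    moved = True
        if base < 0:
            raise ValueError("BARs do not fit below 4 GiB")
        placements.append((base, size))
        top = base
    # The hole must also start at or below the lowest RMRR base.
    lowest = min([top] + [r_base for r_base, _ in rmrr_ranges])
    return FOUR_GIB - lowest, placements


hole, placed = size_lower_mmio_hole(
    [16 << 20, 256 << 20, 64 << 10],      # three BARs: 16 MiB, 256 MiB, 64 KiB
    [(0xEF000000, 0xEF800000)],           # one 8 MiB RMRR inside the hole
)
print(hex(hole), [(hex(b), hex(s)) for b, s in placed])
```

With no BARs and a single RMRR, the result reduces to the 4G - RMRR_Base rule mentioned earlier in the thread. The open question in (b), namely where the BAR sizes come from before QEMU is running, is not addressed by this sketch.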

> > My 2kop opinion here is that we don't need to move all PCI BAR allocation to
> > libxl, or invent some new QMP-interfaces, or introduce new hypercalls or
> > else. A simple and somewhat good solution would be to implement this missing
> > hvmloader <-> QEMU interface in the same manner as it is done in real
> > hardware.
> >
> > When we move some part of guest memory in 4GB range to address space above
> > 4GB via XENMEM_add_to_physmap, we basically perform what chipset's
> > 'remap' (aka reclaim) does. So we can implement this interface between
> > hvmloader and QEMU via providing custom emulation for MCH's
> > remap/TOLUD/TOUUD stuff in QEMU if xen_enabled().
> >
> > In this way hvmloader will calculate MMIO hole sizes as usual, relocate
> > some guest RAM above 4GB base and communicate this information to QEMU via
> > emulated host bridge registers -- so then QEMU will sync its memory layout
> > info to actual physmap's.
> >
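
[Editor's sketch] The remap idea quoted above can be illustrated with a small calculation. This is a simplified model, not an emulation of real MCH registers (which have their own granularity and separate remap base/limit fields): it assumes TOLUD marks the start of the lower MMIO hole, that any RAM which would have landed at or above TOLUD is moved to start at 4 GiB, and that TOUUD is the end of that relocated RAM. The function name and the returned dictionary are hypothetical.

```python
FOUR_GIB = 1 << 32


def remap_layout(ram_bytes, tolud):
    """Split guest RAM around the lower MMIO hole, chipset-remap style.

    ram_bytes: total guest RAM in bytes.
    tolud:     start of the lower MMIO hole (top of low usable RAM).
    """
    if not 0 < tolud <= FOUR_GIB:
        raise ValueError("TOLUD must lie in (0, 4 GiB]")
    low = min(ram_bytes, tolud)        # RAM that stays below the hole
    high = ram_bytes - low             # RAM relocated above 4 GiB
    return {
        "low_ram": (0, low),
        "high_ram": (FOUR_GIB, FOUR_GIB + high) if high else None,
        "touud": FOUR_GIB + high,
    }


# A 4 GiB guest whose hole was enlarged to start at 2 GiB:
print(remap_layout(4 << 30, 0x80000000))
```

In the scheme proposed in the thread, hvmloader would pick the TOLUD-like value after enumerating PCI devices, and QEMU would derive the same split from the emulated host bridge registers so that both sides agree on the physmap.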