
Re: [PATCH 00/17] Q35 initial support for HVM guests



On Tue, 28 Apr 2026 09:48:41 +0200
Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:

>On Fri, Mar 13, 2026 at 04:35:01PM +0000, Thierry Escande wrote:
>> This series introduces initial Q35 chipset support for HVM guests,
>> based on the patchset at [1] by Alexey Gerasimenko.
>> 
>> Basic support means that this patchset makes it possible to start an
>> HVM guest with a Q35 chipset emulated by QEMU, and implements access
>> to the PCIe extended configuration space for devices emulated by QEMU.
>> 
>> Support for PCIe device passthrough is not implemented yet. It is
>> planned, but it requires modifications in the hypervisor and the
>> firmware, mainly to support multiple PCI buses.
>
>Why do you need multi-bus support to expose PCIe capabilities?  I'm
>not seeing the relation between those two.  You could still expose a
>single bus on the MCFG table.

The problem with the PCIe bus is that it's very "topological" by design
- and it always wants a valid hierarchy.

Each PCIe device advertises itself (via the Device/Port Type field in
its PCIe Capabilities structure) as either a chipset-integrated device
or a regular PCIe endpoint device, the latter being the most common
case. There are more types IIRC, but these are the ones we mostly deal
with - both for PT devices and QEMU-emulated ones.
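As a concrete sketch (illustrative code, not Xen/QEMU source): the
Device/Port Type sits in bits 7:4 of the PCI Express Capabilities
register, the 16-bit word at offset 2 of the PCIe capability, so
classifying a device looks roughly like:

```c
/* Decode the Device/Port Type from the PCI Express Capabilities
 * register (bits 7:4).  Encodings per the PCIe Base Specification. */
#include <stdint.h>

#define PCI_EXP_TYPE_ENDPOINT   0x0  /* regular PCI Express Endpoint */
#define PCI_EXP_TYPE_ROOT_PORT  0x4  /* Root Port of a Root Complex */
#define PCI_EXP_TYPE_RC_END     0x9  /* Root Complex Integrated Endpoint */

static unsigned pcie_port_type(uint16_t exp_caps)
{
    return (exp_caps >> 4) & 0xf;    /* Device/Port Type, bits 7:4 */
}
```

For example, a capabilities word of 0x0042 (capability version 2,
type 0100b) decodes to a Root Port, i.e. a device with a secondary bus
behind it, while 0x0092 (type 1001b) is an integrated endpoint that
needs no parent port.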

But being a PCIe endpoint means that the device must have a parent
device. It can be located below a PCIe switch or, in the simplest and
most common case, below a PCIe Root Port device.

In both cases the 'parent' is technically a PCI-PCI bridge, with the
PCIe endpoint device located on its secondary bus.

As the original Q35 patch series was written mostly with PCIe device
passthrough in mind, this is the main complication: to properly place a
passed-through device on the PCIe bus, we need an emulated/real/hybrid
Root Port device.

A much lengthier description is in this patch message:
https://lists.xenproject.org/archives/html/xen-devel/2018-03/msg01197.html

To summarize, we need this 'valid PCIe topology' business just to keep
the Windows kernel (the pci.sys driver, specifically) from discarding
our PT device when it checks the PCIe bus hierarchy above it.

This limitation was found/confirmed via debugging - luckily, pci.sys
had symbols, and the failing function had a very telling name -
something like pcieCheckTopology or similar.

Emulating a "chipset-integrated device" in the PT device's PCIe
Capabilities was a simple hack which allowed us to bypass the
requirement to have a valid PCIe hierarchy with multiple buses. But the
proper future direction is, I guess, implementing emulation of Root
Ports or PCIe switches.
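The hack essentially boils down to rewriting the Device/Port Type field
(bits 7:4 of the PCI Express Capabilities register) to 1001b, i.e. Root
Complex Integrated Endpoint, so no parent port is expected. A
hypothetical sketch (the helper name is mine, not from any patch):

```c
/* Rewrite the Device/Port Type in a copy of a passed-through device's
 * PCI Express Capabilities word so the guest sees a Root Complex
 * Integrated Endpoint instead of a regular endpoint.  Illustrative
 * only - real code would patch the emulated config-space view. */
#include <stdint.h>

static uint16_t force_rc_integrated(uint16_t exp_caps)
{
    exp_caps &= (uint16_t)~0x00f0;   /* clear Device/Port Type bits 7:4 */
    exp_caps |= 0x9 << 4;            /* 1001b = RC Integrated Endpoint */
    return exp_caps;
}
```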

>> The PCIe MMCONFIG area is configured by hvmloader and its base
>> address and size are set in Xen using a new pair of hypercalls
>> HVMOP_get|set_ecam_space.
>
>I guess I will see how that looks in the series, but the setting
>of the ECAM region would better be done by the toolstack.  Setting it
>in hvmloader is possibly not the best placement, because it doesn't
>run for PVH guests (and we will want ECAM support for PVH at some
>point), and there's also a vague plan/intention to get rid of
>hvmloader even for HVM guests eventually.

This is a situation where the difference between HVM and PVH might be
very problematic, I'm afraid. HVM guests assume full freedom over the
IO/MMIO resource setup inside their sandboxed environment.

It's not just Windows reallocating PCI BARs to its liking; this also
extends to the emulated chipset's resources. In the worst case, we
could even see MMCONFIG reinitialization implemented in Intel's Q35
drivers installed inside an HVM guest. Fortunately, I don't remember
that being the case, but in theory a Q35 driver could do things like
this.
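For reference, ECAM gives every PCI function a 4 KiB configuration
window, so the address of any register is a fixed function of the base
address advertised in MCFG (set here via HVMOP_set_ecam_space). A
sketch, with the base address being an illustrative assumption:

```c
/* ECAM/MMCONFIG address of a config register:
 * base + (bus << 20) + (dev << 15) + (func << 12) + reg,
 * i.e. 1 MiB per bus, 32 KiB per device, 4 KiB per function. */
#include <stdint.h>

#define ECAM_BASE 0xe0000000u  /* hypothetical MMCONFIG base address */

static uint64_t ecam_addr(unsigned bus, unsigned dev, unsigned func,
                          unsigned reg)
{
    return ECAM_BASE + ((uint64_t)bus << 20) + ((uint64_t)dev << 15) +
           ((uint64_t)func << 12) + reg;
}
```

This is also why registers above 0xff (the extended configuration
space the cover letter mentions) are only reachable through ECAM and
not through the legacy 0xcf8/0xcfc ports.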
