
[Xen-devel] Some questions regarding QEMU, UEFI, PCI/VGA Passthrough, and other things

While I am not a developer myself (I was never any good at reading and writing 
code), there are several capabilities of Xen and its supporting Software whose 
progress I'm always interested in, more out of curiosity than anything else. 
However, documentation usually lags well behind what is currently implemented 
in code, and sometimes you catch a mail here with some useful data regarding a 
topic but never hear about it again, either because progress went unreported or 
because the whole topic was inconclusive. So, this mail is pretty much a 
compilation of small questions about things I came across that didn't pop up 
again later, but that may serve as brainstorming material for someone, which 
is why I believe it to be more useful for xen-devel than xen-users.

Because as a VGA Passthrough user I'm currently forced to use 
qemu-xen-traditional (though I hear about some users having success with 
qemu-xen in Xen 4.4, I myself didn't have any luck with it), I'm stuck with 
an old QEMU version. However, looking at the changelogs of the latest versions, 
I always see some interesting features which, as far as I know, Xen doesn't 
currently incorporate.

1a - One of the things that newer QEMU versions seem to be capable of is 
emulating the much newer Intel Q35 Chipset, instead of only the current 
440FX from the P5 Pentium era. Some data on Q35 emulation here:

I'm aware that newer doesn't necessarily mean better, especially because the 
practical advantages of Q35 vs 440FX aren't very clear. There are several newly 
emulated features like an AHCI Controller and a PCIe Bus, which sound 
interesting on paper, but I don't know if they add any useful feature or 
increase performance/compatibility. Some comments I read on the matter 
wrongly stated that Q35 would be needed to do PCIe Passthrough, but this is 
already possible on 440FX, though I don't know about the low-level 
implementation differences. I think most of the point of Q35 is to make the 
VM look closer to real Hardware, instead of looking like an obviously 
emulated platform.
In the case of the AHCI Controller, I suppose that the OS would need to include 
Drivers for the controller at installation time, which, if I recall correctly, 
both Windows Vista/7/8 and Linux should have, though for a Windows XP install 
the Q35 AHCI Controller Drivers would probably need to be slipstreamed into an 
install ISO with nLite for it to work.
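For reference, in standalone QEMU (not qemu-xen-traditional, which predates 
this) the emulated chipset is just a machine-type switch; a sketch:

```shell
# Default machine type "pc" is the 440FX-based platform:
qemu-system-x86_64 -machine pc -m 2048 -drive file=disk.img,format=raw

# "q35" instead emulates an ICH9-era platform, with the built-in AHCI
# controller and a PCIe root complex replacing the legacy PCI/IDE pieces:
qemu-system-x86_64 -machine q35 -m 2048 -drive file=disk.img,format=raw
```

disk.img is a placeholder; the point is only that the chipset choice is made 
at the device-model command line, so exposing it through Xen would presumably 
be a toolstack option.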

1b - Another experimental feature that recently popped up in QEMU is IOMMU 
emulation. Info here:

The usefulness of IOMMU emulation seems to be that you can do PCI Passthrough 
in a Nested Virtualization environment. At first sight this looked a bit 
useless, because using a DomU to do PCI Passthrough with an emulated IOMMU 
sounds like far too much overhead if you can simply emulate that device in the 
nested DomU. However, I also read about the possibility of Xen using Hardware 
virtualization for Dom0 instead of it being Paravirtualized. In that case, 
would it be possible to provide the IOMMU emulation layer to Dom0 so you could 
do PCI Passthrough on platforms without proper support for it? It seems a 
rather interesting idea.
I think it would also be useful as a standardized debug platform for IOMMU 
virtualization and passthrough, because some years ago missing or malformed 
ACPI DMAR/IVRS tables were all over the place, and getting IOMMU 
virtualization working was pretty much random luck, at the mercy of the 
Motherboard maker's goodwill to fix their BIOSes.

   UEFI for DomUs
I managed to get this one working, but it seems some clarification is needed 
here and there.

2a - As far as I know, if you add --enable-ovmf to ./configure before 
building Xen, it downloads and builds some extra code from an OVMF repository 
which Xen maintains, though I don't know if it's a snapshot of whatever the 
edk2 repository had at that time, or if it includes custom patches for the 
OVMF Firmware to work on Xen. Xen also has another ./configure option, 
--with-system-ovmf, which is supposed to be used to specify a path to an OVMF 
Firmware binary. However, when I tried that option some months ago, I never 
managed to get it working, either using a package with a precompiled ovmf.bin 
from the Arch Linux User Repository, or using another package with the source 
to compile it myself. Both binaries worked with standalone QEMU, though.
Besides the parameter itself being quite hidden, there is absolutely no info 
on whether the provided OVMF binary has to comply with any special 
requirements, be it custom patches for OVMF so it works with Xen, whether it 
has to be a binary that only includes TianoCore, or the unified one that 
includes the NVRAM in a single file. In Arch Linux, for the Xen 4.4 package, 
the maintainer decided that the way to go for including OVMF support in Xen 
was to use --enable-ovmf, because at least it was possible to get it working 
with some available patches. However, for both download and build times, it 
would be better to simply distribute a working binary. Any ideas why 
--with-system-ovmf didn't work for us?
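To make the two build options concrete, this is the difference as I understand 
it (the path in the second command is just a placeholder; what exactly the 
binary there must contain is precisely the open question):

```shell
# Option 1: let the Xen build fetch and compile its own OVMF tree
./configure --enable-ovmf

# Option 2: point the build at a prebuilt OVMF image instead
# (hypothetical path; unclear whether a plain edk2 build qualifies)
./configure --with-system-ovmf=/usr/share/ovmf/ovmf.bin
```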

2b - On successful Xen builds with OVMF support, something I looked for is the 
actual ovmf.bin file. So far, the only thing I noticed is that hvmloader is 
2 MiB bigger than on non-OVMF builds. Is there any reason why OVMF is built 
into hvmloader, instead of what happens with the other Firmware binaries, 
which usually sit in a directory as standalone files?

2c - Something I'm aware of is that an OVMF binary can come in two formats: a 
unified binary that has both OVMF and NVRAM, or an OVMF binary with a separate 
NVRAM (1.87 MiB + 128 KiB respectively). According to what I read about using 
OVMF with QEMU, it seems that when using a unified binary you need one per VM, 
because the NVRAM content is different. I suppose that with the second option 
you have one OVMF Firmware binary and one 128 KiB NVRAM per UEFI VM. How does 
Xen handle this? If I recall correctly, I heard that it is currently volatile 
(NVRAM contents aren't saved on DomU shutdown).
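For comparison, this is roughly how the split layout is used with standalone 
QEMU (paths are hypothetical; OVMF_CODE.fd / OVMF_VARS.fd are the file names a 
split edk2 build produces): one read-only code image shared by all VMs, plus 
one writable NVRAM copy per VM.

```shell
# Make a private NVRAM copy for this VM, then boot with both pflash images:
cp /usr/share/ovmf/OVMF_VARS.fd /var/lib/ovmf/myvm_VARS.fd
qemu-system-x86_64 \
    -drive if=pflash,format=raw,readonly,file=/usr/share/ovmf/OVMF_CODE.fd \
    -drive if=pflash,format=raw,file=/var/lib/ovmf/myvm_VARS.fd
```

Something equivalent on the Xen side would presumably be what makes the NVRAM 
persist across DomU shutdowns.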

2d - Is there any recorded experience or info regarding how a UEFI DomU would 
behave with something like, say, Windows 8 with Fast Boot, or other UEFI 
features aimed at native systems? This is more of a "what if..." scenario than 
something I could really use.

   PCI/VGA Passthrough
It was four years ago that I learned about IOMMU virtualization making gaming 
in a VM possible via VGA Passthrough (the first time I heard about it was 
through some of Teo En Ming's videos on YouTube), something which was quite 
experimental back then. Even now, the only other Hypervisor or VMM that can 
compete with Xen in this area is QEMU with KVM VFIO, which also has decent VGA 
Passthrough capabilities. While I'm aware that Xen is pretty much enterprise 
oriented, it was also the first to allow a power user to build a system with 
Xen as Hypervisor and everything else virtualized, getting nearly all the 
functionality of running native with the flexibility that virtualization 
offers, at the cost of some overhead, quirks and complexity of usage. It's a 
pain to configure the first time, but if you manage to get it working, it's 
wonderful. So far, this feature has created a small niche of power users that 
use either Xen or QEMU KVM VFIO for virtualized gaming, and I consider VGA 
Passthrough quite a major feature because it is what allows such setups in 
the first place.

3a - In some of the Threads of the original guides I read on how to use Xen 
for VGA Passthrough, you usually see the author and other users saying that 
they didn't manage to get VGA Passthrough working on newer versions. This 
usually affected people who were migrating from the xm to the xl toolstack, 
but also between some Xen versions (I reported a regression in Xen 4.4 vs a 
fully working 4.3). Passthrough compatibility used to be a Hardware-related 
pain because it was extremely Motherboard and BIOS dependent, in an era where 
consumer Motherboard makers didn't pay attention to the IOMMU, but at least on 
the Intel Haswell platform IOMMU support is starting to become more mainstream.
Considering that PCI/VGA Passthrough compatibility with a system, or 
regressions of it between Xen versions, is pretty much hit-or-miss, would it 
be possible to do something to get this feature under control? It seems this 
isn't deeply tested, or at least not with many variables involved (hard to do, 
because there are A LOT of them). I believe it should be possible to have a 
few systems at hand which are known to work and are representative of a 
Hardware platform, tested against regressions with different Video Cards, but 
it sounds extremely time consuming to switch cards, reboot, test with 
different DomU OSes/Drivers, etc. At the moment, once you get a 
Computer/Distribution/Kernel/Xen/Toolstack/DomU OS/Drivers combination that 
works, you simply stick to it, so many early adopters of VGA Passthrough are 
still using extremely outdated versions. Even worse, for users of versions 
like 4.2 with xm, if they want to upgrade to 4.4 with xl and figure out why it 
doesn't work, it will be a royal pain to figure out which patch broke 
compatibility for them, so those early adopters are pretty much out of luck if 
they have to go through years' worth of code and version testing.
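If someone did want to hunt such a regression down, the standard tool would be 
a bisect over the Xen tree, something like the sketch below (tag names assumed 
from xen.git's usual RELEASE-x.y.z convention). The painful part is that every 
step needs a rebuild, a reboot and a manual passthrough test:

```shell
git clone git://xenbits.xen.org/xen.git
cd xen
git bisect start
git bisect bad  RELEASE-4.4.0    # passthrough broken here
git bisect good RELEASE-4.3.2    # last known working release
# For each commit git checks out: build Xen, reboot into it, test
# passthrough, then record the result and repeat until the offending
# commit is identified:
git bisect good    # or: git bisect bad
```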

3b - Does someone know what the actual difference is between Intel platforms 
regarding VT-d support? As far as I know, the VT-d specification allows for 
multiple "DMA Remapping Engines", of which a Haswell Processor has two: one 
for its Integrated PCIe Controller and another for the Integrated GPU. You 
also have Chipsets, some of which, according to Intel ARK, support VT-d (which 
I believe should take the form of a third DMA Remapping Engine), like the Q87 
and C226, and those that don't, like the H87 and Z87. Based on working samples 
I have been led to believe that a Processor supporting VT-d will provide the 
IOMMU capabilities for the devices connected to its own PCIe Slots regardless 
of what Chipset you're using (that's the reason why you can do Passthrough 
with only Processor VT-d support). I would believe the same holds true for a 
VT-d Chipset with a non-VT-d Processor, though I didn't see any working 
example of that. When I was researching this one year ago, Supermicro support 
said this to me:

Since Z87 chipset does not support VT-d,  onboard LAN will not support it 
either because it is connected to PCH PCIe port.  One workaround is to use a 
VT-d enabled PCIe device and plug it into CPU based PCIe-port on board.  Along 
with a VT-d enabled CPU the above workaround should work per Intel.

Based on this, there seems to be a not-very-well-documented quirk. The most 
common configuration for VGA Passthrough users is a VT-d supporting Processor 
on a consumer Motherboard, so basically, if you have a VT-d supporting 
Processor like a Core i7 4790K, you can do Passthrough of the devices 
connected to the Processor's PCIe Slots, and also of the ones connected to the 
Chipset if you apply that workaround (I don't know what "VT-d enabled PCIe 
device" means exactly). I recall seeing some VMware ESXi users commenting that 
they couldn't pass through the integrated NIC, even though a RAID Controller 
connected to the Processor could be passed through in such setups. I don't 
have a link at hand on the matter, but I believe it relevant to the question.
Considering that with this workaround you would be using the Processor's DMA 
Remapping Engine for Chipset devices, is there any potential bottleneck or 
performance degradation there?

3c - There is a feature that enhances VT-d called ACS (Access Control 
Services), related to IOMMU group isolation. This feature seems to be excluded 
from consumer platforms, and support for it already seems to be on the Xen 
wishlist based on comments. Info here:

A curious thing is that if I check /sys/kernel/iommu_groups/ as stated in the 
blog, I find the directory empty (this is on Dom0, with a DomU with 2 
passed-through devices). I suppose it may be VFIO exclusive or something. The 
point is, after some googling I couldn't find a way to check the IOMMU groups, 
though Xen doesn't seem to manage them anyway. I think it would be useful to 
get a layout of the IOMMU groups, to at least identify whether passthrough 
issues could be related to that. Can anyone imagine current scenarios where 
this may break something or limit possible passthrough, tell me why my IOMMU 
groups listing is empty, and how to get such a list?
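For whoever wants to check their own machine, this is roughly how the listing 
from the blog can be scripted; the sysfs layout is the one the kernel's VFIO 
documentation describes. As far as I understand it, the directory only gets 
populated when the host kernel's own IOMMU driver is active (e.g. booting bare 
Linux with intel_iommu=on), which would explain it being empty under Xen, 
where the Hypervisor rather than the Dom0 kernel owns the IOMMU.

```shell
# List every device and the IOMMU group it belongs to, based on the
# /sys/kernel/iommu_groups/<group>/devices/<BDF> sysfs layout.
list_iommu_groups() {
    base="${1:-/sys/kernel/iommu_groups}"
    for dev in "$base"/*/devices/*; do
        [ -e "$dev" ] || continue        # directory empty: no IOMMU driver active
        group=$(basename "$(dirname "$(dirname "$dev")")")
        printf 'group %s: %s\n' "$group" "$(basename "$dev")"
    done
}
list_iommu_groups
```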

3d - The new SATA Express and M.2 connectors combine SATA and some PCI Express 
lanes on the same connector. Depending on the implementation, the PCI Express 
lanes could come from either the Chipset or the Processor. Considering that 
some people like to pass through the entire SATA Controller, how does that 
interact with this frankenstein connector, with the PCIe lanes coming from 
elsewhere?

   Miscellaneous Virtualization stuff

4a - There are several instances where Software tries to check whether it is 
running in a virtualized environment or not. Examples I recall having read 
about are some malware, which tries to hide if it detects that it is running 
virtualized (because that means it is not your exploitable Average Joe 
computer), or, according to some comments I read, some Drivers like NVIDIA's, 
to force you to use a Quadro for VGA Passthrough instead of a consumer 
GeForce. Is the goal of virtualization to reproduce in a VM the exact behavior 
of a system running native, or just to be functionally equivalent? I ask 
because, as more Software appears that makes a distinction between native and 
VM, it seems that in the end VMs will be forced to look and behave like a 
native system to maintain compatibility. While currently such Software is 
pretty much a specific niche, there is the possibility that it becomes a trend 
with the growing popularity of the cloud.
For example, one of the things that pretty much tells the whole story is the 
440FX Chipset, because if you see that Chipset running anything but a P5 
Pentium, you know you're running either emulated or virtualized. Also, if I 
use an application like CPU-Z, it says that the BIOS Brand is Xen, Version 
4.3.3, which also makes the system's status as a VM obvious. I think that, 
based on the rare but existing pieces of Software that attempt to check 
whether they are running in a VM or not to decide their behavior, at some 
point in time a part of the virtualization segment will be playing a catch-up 
game to mask being a VM from these types of applications. I suppose a possible 
endgame for this topic would be a VM that tries to represent as accurately as 
possible the PCI Layout of a commercial Chipset (which I believe was one of 
the aims of QEMU's Q35 emulation), while deliberately lying about and/or 
masking the Processor CPUID data, BIOS vendor, and other recognizable things, 
to try to match what you would expect from a native system of that Hardware 
generation.
This point could be questionable, because making a perfect VM that is 
indistinguishable from a native system could harm some vendors that may rely 
on identifying whether they are running in a VM or not to enforce licensing 
and the like.

4b - The only feature which I feel Xen is missing from a home user 
perspective is sound. As far as I know you can currently tell QEMU to 
emulate a Sound Card in a DomU, but there is no way to easily get the sound 
out of a DomU like other VMMs do. The solutions I saw relied on either 
multiple passed-through Sound Cards, or a PulseAudio Server adding massive 
sound latency. While Xen is enterprise oriented, where sound is unneeded, I 
recall hearing that this feature was being considered, but I haven't seen any 
mention of it for months. How hard or complex would it be to add sound support 
to Xen? Is the way to do it decided? Could it take the form of using Dom0 
Drivers for the Sound Card to act as a mixer, plus some PV Drivers for the 
DomU like the ones currently available for the NIC and storage?
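For completeness, the emulation half already exists as a one-line domain 
config option (the exact list of valid card models depends on the QEMU 
version backing the device model); the missing piece is everything after the 
emulated card, i.e. routing the audio out of the DomU with reasonable latency:

```
# xl.cfg fragment: ask the device model to emulate a sound card in the DomU.
# Model names depend on the QEMU in use (e.g. "sb16", "es1370", "ac97").
soundhw = "ac97"
```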

I hope someone finds my questions interesting enough to answer.

Xen-devel mailing list


