Re: [Xen-users] Some iommu questions(mostly about intel's vt-d)


  • To: xen-users@xxxxxxxxxxxxx
  • From: Gordan Bobic <gordan@xxxxxxxxxx>
  • Date: Tue, 24 Jun 2014 10:15:54 +0100
  • Delivery-date: Tue, 24 Jun 2014 09:16:50 +0000
  • List-id: Xen user discussion <xen-users.lists.xen.org>

On 2014-06-23 20:16, Mihail Ivanov wrote:

I am looking to use Xen with both PV and HVM guests.
I want to pass-through most of my devices to DomU's.
My main concern so far is what hardware should I choose?

If you want things that "just work", get whatever is Citrix
certified for XenServer. With anything else you are rolling
dice, and are liable to run into various hardware, firmware,
or software bugs that you probably don't want to waste days
of your time working around or writing your own patches to
fix.

I've decided on the CPU - Intel i7-4770.
The RAM will most likely be 32GB - two kits of Geil 2x8GB 1600-2166 MHz.

If you are serious about what you are doing and want something
for actual productive purposes, get something that supports ECC
RAM. IMO, hardware that doesn't support ECC RAM is faulty by
design.

Have a read through this before dismissing that view:
http://www.zdnet.com/blog/storage/dram-error-rates-nightmare-on-dimm-street/638

Especially if you are looking at overclocking grade kit. Otherwise
you are liable to be chasing your tail on stability issues for
days/weeks/months looking for a problem that is anywhere but where
you think it is.

About the mobos - what I know so far is that ASUS don't officially
support VT-d, but Asrock say they do.

I have recently gained the enlightenment to not touch anything made
by Asus with a barge pole. Although their hardware is not bad per
se, their customer service and warranty support are worse than
useless. If you buy anything made by Asus make sure you buy it
via a retailer well known for their customer service reputation,
because that is going to be your only point of resolution for
anything made by Asus.

So I am thinking of getting an Asrock. Thing is - so far I've read
about people using the Z87 chipset,
but only one example of Z97. Also on Intel's website they are saying
that Z87 has VT-d, but nothing about the Z97.

IMO you are looking in the whole wrong ballpark for your hardware
with consumer grade OC-ing kit.

Regardless, the key part is making sure there are no components
on the motherboard that will blow you out of the water when it
comes to VT-d. Nvidia NF200 PCIe bridges are one such item.

Also I've read about the GPU pass-through, so I've decided to use AMD
since nVidia has no support for it
(unless I change the ID of my vga to a quadro or some other
professional vga).

Good luck with that. Results with ATI cards have been patchy at
best and seem to vary wildly with different releases of kernel
and Xen. Nvidia based solutions generally "just work". If you
don't want to do any soldering, get a GTX480 and soft-strap it
into a Quadro 6000. Or if you are on a budget,
GTX470 -> Quadro 5000 or
GTS450 (GF106 based only) -> Quadro 2000

Not having an FLR isn't an issue, correct?

No, it is not a necessary requirement.

And now my main concern - which of these devices can be used with
pass-through:
the integrated sound card
the integrated NIC
the sata controllers(expensive boards have two - one from the chipset
and another one)
the usb controllers

All of the above. On my system I am passing through the integrated
Intel HD audio, two USB ports and a GPU to one VM, and a GPU and
a whole USB3 controller to another VM.

I don't pass through whole SATA controllers because I prefer to
have my VMs backed by ZFS volumes with deduplication (on SSD,
so the performance isn't utterly crippling, and the deduplication
ratio is pretty much the number of VMs)

My motherboard has two Marvell NICs, but I have them bonded and
bridged to the virtual VM NICs, with PV drivers. This produces
reasonably good results for me.

All of those should be PCI devices, correct?

You can pass individual USB devices through, but when I tried to
do it that way things were a little more quirky, so I just stuck
with passing through PCIe USB devices corresponding to the
ports.

Also, in a Wikipedia article, they say that some mobos don't have
support for VT-d on, for example, PCIe x8 or mini slots?
How come only one or two slots can't work?

Nvidia NF200 (or similarly broken) PCIe bridges. They are used to
multiplex out the PCIe lanes to more lanes than the root PCIe hub
has. At a glance this is pointless because you still only have so
much bandwidth at the root hub. But NF200 traffic seems to outright
bypass the root hub for DMA transfers, which means it can avoid
some of the bottleneck. The bad part is that the traffic that
doesn't go via the root hub cannot be subject to VT-d translation,
which will cause PCI I/O memory overwriting and crash the host (and,
depending on how unlucky you are, potentially trash your data if it
happens to overwrite the I/O memory used by your disk controller).

Now, having said all that - I have an EVGA SR-2 on which _ALL_ of
the PCIe slots are behind NF200 bridges, and I have it working
reasonably well with a custom bodge patch. With a bit of luck
this is no longer going to be required when a Xen release arrives
with the recently developed feature to limit memory below 4GB.

The trick is to ensure that the PCI I/O memory is not overlapped
by any memory in the VM. If you have a PCI device with I/O memory
mapped at, say, 2.5GB, and it is the first block of I/O memory
on the host, then having a VM with up to 2.5GB of RAM will work
fine. Any more than that, and the domU will overwrite the PCI
I/O memory of the device, and crash the PCI device and almost
certainly the whole host with it. The patches mentioned ensure
the memory hole in domU is such that it covers all of the host
I/O memory to prevent it from getting clobbered by the domU
memory.
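
To make the arithmetic above concrete, here is a rough sketch of
the check. It is not part of the patches mentioned; the function
names and the sample address are made up for illustration, and the
2.5GB figure is just the example from the paragraph above.

```bash
#!/usr/bin/env bash
# Sketch only: checks whether a domU's RAM would reach the lowest PCI I/O
# memory address on the host. Names and sample values are illustrative.

# Print the largest domU memory (in MiB) that stays below a given address.
# $1: lowest host PCI I/O memory address, in hex (0xA0000000 is 2.5GB)
max_safe_mem_mb() {
    echo $(( $1 / 1024 / 1024 ))
}

# Succeed if a domU with $1 MiB of RAM stays below the address in $2.
mem_is_safe() {
    [ "$1" -le "$(max_safe_mem_mb "$2")" ]
}

lowest_mmio=0xA0000000   # assumed: first block of PCI I/O memory at 2.5GB

for mem in 2048 2560 4096; do
    if mem_is_safe "$mem" "$lowest_mmio"; then
        echo "memory=${mem}: below the PCI I/O memory, should be fine"
    else
        echo "memory=${mem}: would overlap PCI I/O memory without a memory hole"
    fi
done
```

On a real host the lowest address would have to be read off the
actual PCI device mappings rather than hard-coded like this.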

This _can_ be fixed and worked around - but if your time isn't
worthless, just avoiding anything with NF200 bridges is probably
a much saner proposition.

Unfortunately, my only experience is with hardware that is
broken in this way. It was a fun project but in retrospect,
if you assign any value to your time you will probably be
much better off buying something that has been extensively
tested and proven to work. That may mean getting hardware
that is a generation or two behind, or getting something
certified by a vendor. Do not underestimate just how buggy
hardware is these days, the moment you start to stray from
the most common and basic of uses.

Also, can I pass through just one USB device, or do I have to do it
with the whole controller (as a PCI device, I guess)?

You can pass USB device "functions" individually - or at least
it works on my machines:

# lspci | grep USB
00:1a.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
00:1a.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1d.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1

You can pass each of those independently to different VMs,
i.e. you don't have to restrict yourself to passing all
00:1a.* devices to one VM. On my board two of the physical
ports correspond to one of those IDs, so I pass that through
with a mouse and keyboard attached to one of the domUs, and
use the other ports/devices for similar things in dom0 and
on other domUs.
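
As a rough sketch of how that might look in practice (assuming the
xl toolstack; the helper names are made up for illustration, and
which addresses map to which physical ports will differ per board):

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of Xen or pciutils.

# Read `lspci` output on stdin and print the address (BB:DD.F) of each
# USB controller function.
usb_bdfs() {
    awk '/USB controller/ { print $1 }'
}

# Build an xl domU config line from the addresses given as arguments.
pci_cfg_line() {
    local out="" bdf
    for bdf in "$@"; do
        out="${out:+$out, }'${bdf}'"
    done
    echo "pci = [ ${out} ]"
}

# Sample input: two of the eight functions from the lspci listing above.
sample='00:1a.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1d.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1'

printf '%s\n' "$sample" | usb_bdfs
pci_cfg_line 00:1a.0 00:1d.7

# On a real host the input would come from `lspci` itself, and each device
# would first have to be made assignable in dom0, for example with:
#   xl pci-assignable-add 00:1a.0
```

The generated pci line would go into the config of whichever domU
should own those ports; different functions can go to different
domUs.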

Gordan
