Re: [Xen-users] ATI VGA Passthrough / Xen 4.2 / Linux 3.8.10
On 05/10/2013 08:42 PM, Casey DeLorme wrote:
2) Have you tried disabling IRQ balancing
(noirqbalance kernel parameter + disable irqbalance service)?
No clue what that is. Can you provide any direction? I'd be happy to test.
In your boot loader, find the kernel and xen lines and add:
On the xen line:
noirqbalance
On the dom0 kernel line:
noirqbalance
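For illustration, this is roughly where the parameters would land in a
GRUB menu entry - a sketch only, since paths, kernel versions and the
exact GRUB syntax will differ on your system:

    title Xen 4.2 / Linux 3.8.10
        root (hd0,0)
        kernel /boot/xen.gz dom0_mem=2048M noirqbalance
        module /boot/vmlinuz-3.8.10 root=/dev/sda2 ro noirqbalance
        module /boot/initrd-3.8.10.img

And then stop the userspace daemon as well (exact commands depend on
your distro's init system):

    # stop irqbalance now and keep it from starting at boot
    service irqbalance stop
    chkconfig irqbalance off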
How would disabling IRQ balancing help fix the problem? Just curious; as
I understand it, that tool is used to balance interrupt requests across
CPUs, like a scheduler of sorts.
I am purely guessing here, but could it be that strange things happen
when the VM runs on a CPU other than the one handling the interrupts for
the hardware passed to it, possibly more so when the two are not just
different cores but different sockets? It's possible that disabling the
rotation of interrupt handling between cores alleviates an issue with
IRQ routing.
But take this with a bucket of salt - I am _purely_ guessing here.
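One thing worth checking before and after the change: /proc/interrupts
in dom0 shows per-CPU interrupt counts, so you can see whether the
card's IRQ is actually hopping between CPUs:

    # per-CPU counts for every IRQ in dom0; find the line for your GPU
    # (match the IRQ number lspci reports for the device) and see
    # which CPU columns the counts grow in over time
    cat /proc/interrupts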
3) Are you assigning > 4GB of RAM to the guest? I found a post in the
archive last night mentioning that there's an outstanding qemu issue
with > 4GB of RAM given to the guest. I didn't get around to re-trying
the VM with 3.5GB yet.
Yes sir. It's got 8 GB + 1 GB for the standard video adapter. Not sure
if that's improper, but it boots just fine with a single card, and the
5850 I plugged in for a short while seemed well behaved. Here's a copy
of my vm config file: http://pastebin.com/bX0ayA0u
I think reducing the guest RAM to 3.5GB is worth a shot, along with
only passing a single GPU device.
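Something along the lines of a cut-down version of your current config,
i.e. (purely illustrative - the name, disk and BDF below are
placeholders for whatever is in your pastebin):

    # /etc/xen/win7-test.cfg - illustrative sketch only
    builder = 'hvm'
    name    = 'win7-test'
    memory  = 3584                # 3.5GB, under the suspected 4GB qemu issue
    vcpus   = 2
    disk    = [ 'phy:/dev/vg0/win7,hda,w' ]
    pci     = [ '0a:00.0' ]       # one GPU function only, no second card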
If I recall correctly, the RAM limit is specific to PV guests or older
versions of Xen. I have run Windows with 4, 6, 8, and 16GB of RAM
without ever encountering this problem, and this includes tests with the
xl toolstack on Xen 4.1.2.
Including VGA passthrough on those guests?
The only single GPU cards I have are the Radeon 5850s in the AMD box I
have. I'm just a little reticent to tear the thing apart though, because
it gets used a lot. I think my next step is to look for a video card
that properly supports FLR,
As far as I can tell, for all the talk of it - there is NO SUCH THING.
Somebody on the list posted lspci -vvv from their ATI FirePro card which
shows it has no FLR, and I have just got a Quadro 2000, which also lacks
FLR.
The only vague mention I have seen of FLR on GPUs is on the Intel GPU
built into the very latest generation of Core i CPUs. And even if that
is true, it's not all that useful for gaming.
Heh. The crappiest GPU that would ever be in my system is the most
compatible? Good grief. :P
I'm not sure about compatible, but it seems to have a feature that
the others don't - then again, take that with a pinch of salt - I
don't have one, and I tend not to believe such things until somebody
shows me the lspci dump that proves it.
Where did you find mention of the newer integrated graphics supporting
FLR? I have an Ivy Bridge 3770 with an HD4000, but when I ran lspci -vv
and -vvv I did not see FLReset+; then again, maybe I did something
incorrectly, as I did not see any mention of FLReset anywhere. If the
Ivy Bridge integrated GPU has FLReset I would totally want to test it.
It may not be a powerful chip compared to modern discrete cards, and it
won't prove that the lack of FLR is the cause of our AMD/nVidia
problems, but it would show the effect the presence of FLR has.
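For reference, the way I look for it (the BDF here is just an example -
substitute the one lspci shows for your GPU, and run it as root so the
capabilities are visible):

    # FLR is advertised in the PCIe device capabilities
    lspci -vvv -s 00:02.0 | grep -i flreset
    # "FLReset+" in the DevCap line means FLR is supported,
    # "FLReset-" means it is not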
I came across a post on a forum or a mailing list after googling
something like:

"GPU" "FLReset+"

and then trawling through a few hundred pages to find one that actually
lists lspci output referring to a GPU.
Having said that, I have also found references to people claiming that
FirePro and Quadro cards have FLR, which is quite clearly not the case.
So let's not assume that it's true just because somebody on the internet
said so. :)
2) My motherboard's PCIe slots are behind NF200 PCIe bridges (yes, EVGA
have decided in their infinite wisdom to put all 7 PCIe slots behind
NF200s, none are directly attached to the Intel NB).

I'm so sorry :P. NF200 has probably caused a lot of xen tinkerers to
utter a few dozen cuss words apiece.
I can believe that. What is the solution, though?

The thing that drives me really nuts about the issues I'm seeing (which
may or may not be specifically related to the NF200) is that it is so
intermittent. It works well enough to boot up and work with a gaming
type load for a few minutes. Then something happens that causes the VGA
card to require a reset, and it all falls apart.
My solution was to buy another motherboard; I had no luck at all passing
the devices behind the NF200, and similar to your situation, all but one
PCIe slot on that board was behind that bridge.
Did you not manage to get it working at all? Or was it just intermittent
like in my case? I can typically get about 5 minutes of gaming out of my
ATI card before it all goes wrong.
Ironically, I was thinking about an Asus Sabertooth with an 8-core AMD,
but opted to go for broke and get a couple of 6-core Xeons and an EVGA
SR-2. It turns out, a solution that is 4x more expensive isn't actually
better... :(
I was unable to get it working at all. The NF200 simply threw errors
that 100% prevented me from passing the device. I think it was missing a
number of specific features required for passthrough, and I vaguely
remember running lspci -vvv to verify what was missing. Perhaps not all
NF200s are created equal?
The only logged issue I had with the NF200s was the lack of ACS, the
check for which can be disabled as I mentioned earlier in this thread
(at least if you are using the xm stack). After I disabled that, PCI
passthrough has been working OK. It's just VGA passthrough BSOD-ing
after some minutes that is causing me problems.
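For completeness, the knob I mean is the strict device assignment check
in /etc/xen/xend-config.sxp - if I remember right it looks like the
below, but do verify the exact option name against the sample config
shipped with your version before relying on it:

    # /etc/xen/xend-config.sxp - relax the FLR/ACS checks when
    # assigning devices; restart xend for it to take effect
    (pci-dev-assign-strict-check no)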
In reading up on the wiki, there does indeed seem to be a lot more info
regarding the use of xl and PCI Passthrough today than the last time I
looked. It seems that these types of configuration options are set on a
domain-by-domain basis, or even per device; the docs say that things
like VPCI vs direct PASS mapping of slot layout(?) are actually
configured at the device level, either in your DomU config file (like:
pci = ['0:d:0.0,pci-just-forking-work-damn-you']) or via xl (like: xl
pci-attach 1 0:d:0.0 pci-just-forking-work-damn-you).
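For what it's worth, the real per-device options look something like the
below - permissive and msitranslate are the ones I have seen documented
in xl.cfg, though whether either helps with this particular problem is
anyone's guess:

    # in the domU config file:
    pci = [ '0a:00.0,permissive=1,msitranslate=1' ]

    # or hotplug at runtime (domain id 1, same BDF):
    xl pci-attach 1 0a:00.0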
Hmm... I honestly don't think the xl way will succeed where xm is
unstable, but I might give it a shot.
You'd still likely require all the "hacks" you're currently using, but
they'll all move to different places I'm guessing... if the toolstack
itself doesn't have any bearing on this (which is my suspicion) then you
don't want to go doing all the extra work for nothing, of course!
Exactly. And right now, from what I have read (somebody point me to
something that says otherwise), more people seem to have reported
success with the xm stack than with xl (but that could just be because
the xl stack is much more recent).
I would go as far as to say that most of those reports came from people
who used the packaged Xen, and until very recently the packaged Xen was
4.0 or 4.1 where xm is still the default toolstack.
Which I don't find to be in any way an encouragement to even attempt to
do this using the xl tool stack at the moment. :)
Gordan
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users