
Re: [Xen-users] ATI VGA Passthrough / Xen 4.2 / Linux 3.8.10





On Fri, May 10, 2013 at 3:42 PM, Casey DeLorme <cdelorme@xxxxxxxxx> wrote:

    2) Have you tried disabling IRQ balancing
    (noirqbalance kernel parameter + disable irqbalance service)?


No clue what that is.  Can you provide any direction?  I'd be happy to
test.

In your boot loader, find the kernel and xen lines and add:

On the xen line:
noirqbalance

On the dom0 kernel line:
noirqbalance
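
For anyone scripting this, here is a minimal Python sketch of appending the parameter to a boot line without adding it twice. The helper names and the sample command lines are made up for illustration; the only thing taken from this thread is the noirqbalance parameter itself.

```python
def has_boot_param(cmdline, param):
    """Return True if param appears as a whole word in the command line."""
    return param in cmdline.split()


def add_boot_param(cmdline, param):
    """Return cmdline with param appended, unless it is already present."""
    tokens = cmdline.split()
    if param not in tokens:
        tokens.append(param)
    return " ".join(tokens)


# Hypothetical boot lines, not taken from anyone's real boot loader config.
xen_line = "dom0_mem=4096M"
dom0_line = "root=/dev/sda1 ro"

xen_line = add_boot_param(xen_line, "noirqbalance")
dom0_line = add_boot_param(dom0_line, "noirqbalance")
print(xen_line)
print(dom0_line)
```

After a reboot, has_boot_param can be pointed at the contents of /proc/cmdline in dom0 to confirm the kernel actually saw the parameter.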


How would disabling IRQ balancing help fix the problem?  Just curious; as I understand it, that tool balances interrupt requests across CPUs, like a scheduler of sorts.
 

    3) Are you assigning > 4GB of RAM to the guest? I found a post
    in the archive last night mentioning that there's an outstanding qemu
    issue with > 4GB of RAM given to the guest. I didn't get around to
    re-trying the VM with 3.5GB yet.


Yes sir.  It's got 8 GB + 1 GB for the standard video adapter.  Not sure
if that's improper, but it boots just fine with a single card, and the
5850 I plugged in for a short while seemed well behaved.  Here's a copy
of my vm config file: http://pastebin.com/bX0ayA0u

I think reducing the guest RAM to 3.5GB is worth a shot, along with only passing a single GPU device.
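
To check quickly whether a config is over the 4GB mark before retrying, something like the following sketch would do. It assumes the config uses a plain "memory = N" line in MB, as xm/xl domain configs normally do; the function names and the sample config text are hypothetical, and the sample is not the pastebin file. Whether the >4GB issue even applies to this setup is still an open question in this thread.

```python
import re


def guest_memory_mb(config_text):
    """Return the value of the first 'memory = N' line in MB, or None."""
    for line in config_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        match = re.match(r"\s*memory\s*=\s*[\"']?(\d+)[\"']?\s*$", line)
        if match:
            return int(match.group(1))
    return None


def exceeds_4gb(config_text):
    """True if the guest is given more than 4096 MB of RAM."""
    mem = guest_memory_mb(config_text)
    return mem is not None and mem > 4096


# Hypothetical config, for illustration only.
sample = 'name = "win7"\nbuilder = "hvm"\nmemory = 8192\n'
print(guest_memory_mb(sample), exceeds_4gb(sample))
```

For the 3.5GB test, the memory line would read 3584 (3.5 x 1024 MB).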


If I recall correctly, the RAM limit is specific to PV guests or older versions of Xen.  I have run Windows with 4, 6, 8, and 16GB of RAM without ever encountering this problem, and this includes tests with the xl toolstack on Xen 4.1.2.
 

        The only single GPU cards I have are the Radeon 5850s in the AMD
        box I have.  I'm just a little reticent to tear the thing apart
        though cause it gets used a lot.  I think my next step is to look
        for a video card that properly supports FLR,


    As far as I can tell, for all the talk of it - there is NO SUCH THING.
    Somebody on the list posted lspci -vvv from their ATI FirePro card
    which shows it has no FLR, and I have just got a Quadro 2000, which also
    lacks FLR.

    The only vague mention I have seen of FLR on GPUs is on the Intel GPU on
    the very latest generation of Core i CPUs (the built in one). And even
    if that is true it's not all that useful for gaming.


Heh.  The crappiest GPU that would ever be in my system is the most
compatible?  Good grief. :P

I'm not sure about compatible, but it seems to have a feature that the others don't - then again, take that with a pinch of salt - I don't have one, and I tend not to believe such things until somebody shows me the lspci dump that proves it.


Where did you find mention of the newer integrated graphics supporting FLR?  I have an Ivy Bridge 3770 with an HD 4000, but when I ran lspci -vv and -vvv I did not see FLReset+.  Maybe I did something incorrectly, as I did not see any mention of FLReset anywhere.

That one got me too for a minute.  You gotta run the lspci -vv[v] as root in order to see that detail.

Doing a

sudo lspci -vv

gets me this; note the DevCap field at the end of the listing (this is not the full output for the device, just enough to show the flag):

0e:00.0 Display controller: ATI Technologies Inc Device 671d
Subsystem: ATI Technologies Inc Device 1b2a
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 7
Region 0: Memory at d0000000 (64-bit, prefetchable) [size=256M]
Region 2: Memory at fbcc0000 (64-bit, non-prefetchable) [size=128K]
Region 4: I/O ports at be00 [size=256]
[virtual] Expansion ROM at fbc00000 [disabled] [size=128K]
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- 

I'm still configuring and responding to Gordan, but figured you could use a quick answer just in case you weren't aware of the root req for reading pci features.
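
If you want to check every device in one go, here is a rough Python sketch that scans lspci -vv text for the FLReset flag under DevCap. The function name is my own, and the parsing rests on assumptions: it expects device header lines to start with the bus address, and it only looks between "DevCap:" and "DevCtl:", because lspci can also print an FLReset bit under DevCtl that does not indicate capability. The sample text is trimmed from the dump above.

```python
import re

# Device header, e.g. "0e:00.0 Display controller: ..." (optional PCI domain).
DEVICE_RE = re.compile(
    r"^((?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s")
FLR_RE = re.compile(r"\bFLReset([+-])")


def flr_support(lspci_text):
    """Map each device address to True (FLReset+) or False (FLReset-)."""
    result = {}
    device = None
    in_devcap = False
    for raw in lspci_text.splitlines():
        header = DEVICE_RE.match(raw)
        if header:
            device = header.group(1)
            in_devcap = False
            continue
        line = raw.strip()
        if line.startswith("DevCap:"):
            in_devcap = True
        elif line.startswith("DevCtl:"):
            in_devcap = False
        if device and in_devcap and device not in result:
            flag = FLR_RE.search(line)
            if flag:
                result[device] = flag.group(1) == "+"
    return result


sample = """0e:00.0 Display controller: ATI Technologies Inc Device 671d
Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
"""
print(flr_support(sample))
```

Devices that do not report the flag at all (or any device, when lspci is run without root) simply do not show up in the result.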
 
If the Ivybridge integrated has FLReset I would totally want to test it.  It may not be a powerful chip compared to modern discrete cards, and it won't prove that the lack of FLR is the cause of our AMD/nVidia problems, but it would show the effect the presence of FLR has.
 

                          2) My motherboard's PCIe slots are behind NF200 PCIe
                          bridges (yes, EVGA have decided in their infinite
                          wisdom to put all 7 PCIe slots behind NF200s, none
                          are directly attached to the Intel NB).

                         I'm so sorry :P. NF200 has probably caused a lot of
                         xen tinkerers to utter a few dozen cuss words a
                         piece.

                         I can believe that. What is the solution, though?

                         The thing that drives me really nuts about the issues
                         I'm seeing (which may or may not be specifically
                         related to the NF200) is that it is so intermittent.
                         It works well enough to boot up and work with a
                         gaming type load for a few minutes. Then something
                         happens that causes the VGA card to require a reset,
                         and it all falls apart.

                       My solution was to buy another motherboard, I had no
                       luck at all passing the devices behind the NF200, and
                       similar to your situation all but one PCIe slot on that
                       board was behind that bridge.


                   Did you not manage to get it working at all? Or was it
                   just intermittent like in my case? I can typically get
                   about 5 minutes of gaming out of my ATI card before it
                   all goes wrong.

                   Ironically, I was thinking about an Asus Sabertooth with
                   an 8-core AMD, but opted to go for broke and get a couple
                   of 6-core Xeons and an EVGA SR-2. It turns out, a solution
                   that is 4x more expensive isn't actually better... :(


                I was unable to get it working at all.  The NF200 simply
                threw errors that 100% prevented me from passing the device.
                I think it was missing a number of specific features required
                for passthrough, and I vaguely remember running lspci -vvv to
                verify what was missing.  Perhaps not all NF200's are created
                equal?


            The only logged issue I had with the NF200s was the lack of ACS,
            which can be disabled as I mentioned on this thread (at least if
            you are using the xm stack). After I disabled that PCI
            passthrough has been working OK. It's just VGA passthrough
            BSOD-ing after some minutes that is causing me problems.


        In reading up on the wiki, there does indeed seem to be a lot more
        info regarding the use of xl and PCI Passthrough today than the last
        time I looked.  It seems that these types of configuration options
        are set on a domain-by-domain basis, or even by device; docs say that
        things like VPCI vs direct PASS mapping of slot layout(?) is actually
        configured at the device level either in your DomU config file
        (like: pci = ['0:d:0.0, pci-just-forking-work-damn-__you]) or via xl
        (like: xl pci-attach 1 0:d:0.0 pci-just-forking-work-damn-__you).
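
Joke option aside, the two forms mentioned above can be generated from a list of device addresses. This is only a sketch under my own assumptions: the function names are invented, the addresses are expanded to the full DDDD:BB:DD.F form, and no per-device options are added, since which options actually help is exactly what is unclear in this thread.

```python
import re

BDF_RE = re.compile(
    r"(?:([0-9a-fA-F]{1,4}):)?([0-9a-fA-F]{1,2}):([0-9a-fA-F]{1,2})\.([0-7])")


def normalize_bdf(bdf):
    """Expand 'bus:dev.func' or 'domain:bus:dev.func' to 'dddd:bb:dd.f'."""
    match = BDF_RE.fullmatch(bdf.strip())
    if not match:
        raise ValueError("not a PCI address: %r" % bdf)
    domain, bus, dev, func = match.groups()
    return "%04x:%02x:%02x.%x" % (
        int(domain or "0", 16), int(bus, 16), int(dev, 16), int(func))


def pci_config_line(bdfs):
    """Build a 'pci = [...]' line for a DomU config file."""
    quoted = ", ".join("'%s'" % normalize_bdf(b) for b in bdfs)
    return "pci = [%s]" % quoted


def pci_attach_args(domain, bdf):
    """Build the argument list for hot-plugging one device with xl."""
    return ["xl", "pci-attach", str(domain), normalize_bdf(bdf)]


print(pci_config_line(["0:d:0.0"]))
print(" ".join(pci_attach_args(1, "0:d:0.0")))
```

The argument list from pci_attach_args can be handed to subprocess.run in dom0 if you want to script the hot-plug rather than type it.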



    Hmm... I honestly don't think the xl way will succeed where xm is
    unstable, but I might give it a shot.


You'd still likely require all the "hacks" you're currently using, but
they'll all move to different places I'm guessing... if the toolstack
itself doesn't have any bearing on this (which is my suspicion) then you
don't want to go doing all the extra work for nothing, of course!

Exactly. And right now, from what I have read (somebody point me to something that says otherwise), more people seem to have reported success with the xm stack than with xl (but that could just be due to the xl stack being much more recent).

I would go as far as to say that most of those reports came from people who used the packaged Xen, and until very recently the packaged Xen was 4.0 or 4.1 where xm is still the default toolstack.

I'd put money on it :P

-Andrew
 
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users
