
Re: [Xen-devel] Status of FLR in Xen 4.4



Hi Gordan,

I tried your patch on my dom0 kernel and I think it helped somewhat: I can now reboot the domUs without crashing the whole host. However, the Linux domU still gets a black screen, and the Windows 7 domU only gets as far as a black screen with a (movable) cursor, but no further. The improvement might only be a coincidence, though; I have to double-check this.

I tried some other stuff, too:

1) After domU shutdown, rebind both functions to the dom0 drivers, do a sysfs reset and re-add them to the assignable devices -> crashes dom0
2) After domU shutdown, rebind both functions to the dom0 drivers and re-add them to the assignable devices -> dom0 sometimes crashes when the domU using the devices comes up, sometimes not, but no success either way
3) A sysfs reset of the devices from within the domU seems to be passed through to dom0 (see the commands in the qemu log) but has no effect

Also, I analysed your code and compared it to the Python tools of xm; it is the same approach and I don't see any obvious differences. Then I tried to replicate the secondary bus reset on the command line for testing purposes via

printf '\x40' | dd of=/sys/devices/pci0000\:00/0000\:00\:0b.0/config bs=1 seek=$((0x3e)) count=1 conv=notrunc

but I think I got the endianness or the offset slightly wrong, because after that xl refuses to give the device to the domU (00:0b.0 is the bus of my two-function VGA card, which I have assigned to my domU) and dom0 later crashes.
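For reference, offset 0x3e is the Bridge Control register of a PCI-to-PCI bridge, and 0x40 (bit 6) is its Secondary Bus Reset bit, so the offset and value above look plausible. What the one-byte write does not do is preserve the other bits of that byte or clear the reset bit again afterwards, so the bus stays held in reset. Below is a minimal read-modify-write sketch of a reset pulse. The function names, the `SBR_HOLD`/`SBR_SETTLE` variables and the one-second delays are assumptions made for illustration, not something taken from the patch or from xm, and the sketch is untested on real hardware.

```shell
#!/bin/sh
# Sketch only: pulse Secondary Bus Reset on a PCI bridge through its sysfs
# config file. All names here are hypothetical helpers for illustration.

BRIDGE_CTL=$((0x3e))   # Bridge Control register offset (low byte)
SBR_BIT=$((0x40))      # Secondary Bus Reset is bit 6

# Print the low byte of Bridge Control as a decimal number.
bridge_ctl_read() {
    dd if="$1" bs=1 skip="$BRIDGE_CTL" count=1 2>/dev/null | od -An -tu1 | tr -d ' \n'
}

# Write one byte (decimal value in $2) to Bridge Control without truncating.
bridge_ctl_write() {
    printf "\\$(printf '%03o' "$2")" |
        dd of="$1" bs=1 seek="$BRIDGE_CTL" count=1 conv=notrunc 2>/dev/null
}

# Set the reset bit while keeping the other bits as they were.
sbr_assert() {
    bridge_ctl_write "$1" $(( $(bridge_ctl_read "$1") | $SBR_BIT ))
}

# Clear the reset bit again, keeping the other bits.
sbr_deassert() {
    bridge_ctl_write "$1" $(( $(bridge_ctl_read "$1") & ~$SBR_BIT & 0xff ))
}

# Assert, hold, deassert, then give the devices time to come back.
# The delay values are assumptions, chosen generously.
sbr_pulse() {
    sbr_assert "$1"
    sleep "${SBR_HOLD:-1}"
    sbr_deassert "$1"
    sleep "${SBR_SETTLE:-1}"
}

# Example (as root, with the devices behind the bridge unused):
# sbr_pulse /sys/devices/pci0000:00/0000:00:0b.0/config
```

Note that a bus reset also wipes the config space (BARs, command register) of the devices behind the bridge, so their state would have to be saved beforehand and restored afterwards; this sketch does not do that.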

So I'm a little lost at that point and would welcome some suggestions.

Does FLR work for any of you with VGA cards?


2013/9/26 Gordan Bobic <gordan@xxxxxxxxxx>
On 09/26/2013 07:41 PM, Matthias wrote:
Hi,

thanks for your answers. The cards are an AMD HD 5750 and an HD 5400, both
with dual functions (due to audio capabilities), both co-assigned to
their respective domU and both not capable of FLR according to the
lspci -vvv output.

Also, @Ross, I'm running a 3.8.2 kernel, so this should be fine, but I
assume that the 'official' command where xl asks dom0 to do the reset
does not work (if I have understood David correctly): since the card is
dual function, no bus reset is actually executed, which causes the
misbehaviour, whereas xm does a bus reset, so it works in this specific
case.

I'm currently recompiling the kernel to see if your patch works, David.

Also, just to understand it better, is the secondary bus reset the thing
which you can manually invoke via /sys/bus/pci/devices/.../reset ?

So as a workaround, would the following work in principle?

xl pci-assignable-remove 0X:00.0
xl pci-assignable-remove 0X:00.1
echo "1" > /sys/bus/pci/devices/0X:00.0/reset
echo "1" > /sys/bus/pci/devices/0X:00.1/reset

This bit is up to the driver to implement. Since pciback is a placeholder rather than a driver that knows about the hardware the reset node won't be there.
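Whether the node exists can be checked before writing to it. Here is a minimal sketch, assuming the device address is passed in as an argument; `try_sysfs_reset` and the `SYSFS_PCI` override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch only: attempt a sysfs reset, but only if the reset node is present.
# SYSFS_PCI is a hypothetical override of the sysfs device directory.

try_sysfs_reset() {
    node="${SYSFS_PCI:-/sys/bus/pci/devices}/$1/reset"
    if [ -w "$node" ]; then
        echo 1 > "$node"
    else
        echo "no writable reset node for $1" >&2
        return 1
    fi
}

# Example, using the placeholder addresses from the mail above:
# try_sysfs_reset 0X:00.0 && try_sysfs_reset 0X:00.1
```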

You could try to do something with setpci to force the registers between D0 and D3 power states in a vague hope that might do something, but I doubt it.
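As a rough sketch of what that could look like: the power state lives in bits 1:0 of the PMCSR register, at offset 4 of the PCI Power Management capability (0 = D0, 3 = D3hot). The helper below only computes the new register value; `pmcsr_with_state` is a hypothetical name, and the setpci invocations in the comments are an untested assumption about how the value would be applied.

```shell
#!/bin/sh
# Sketch only: compute a PMCSR value with a new power state in bits 1:0,
# leaving the remaining bits of the 16-bit register unchanged.
#   $1 = current PMCSR value (decimal or 0x-prefixed hex)
#   $2 = target power state, 0 (D0) .. 3 (D3hot)
pmcsr_with_state() {
    printf '0x%04x\n' $(( ($1 & ~0x3 & 0xffff) | ($2 & 0x3) ))
}

# Possible use with setpci (assumed, untested; setpci prints and takes hex
# values without the 0x prefix):
#   cur=0x$(setpci -s 0X:00.0 CAP_PM+4.w)
#   setpci -s 0X:00.0 CAP_PM+4.w=$(pmcsr_with_state "$cur" 3 | sed 's/^0x//')
#   sleep 1
#   setpci -s 0X:00.0 CAP_PM+4.w=$(pmcsr_with_state "$cur" 0 | sed 's/^0x//')
```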

The reason NVIDIA cards work OK is that the domU driver knows how to reinitialize the hardware and acts accordingly. If the manufacturer won't implement a standard function to reset the hardware, then it is up to their drivers to handle the situation.

If (on Windows domUs) ejecting the card before shutdown/reboot of the domU works, you could probably write some PowerShell magic that does that on shutdown/reboot as a reasonable workaround.

Gordan
