Re: [Xen-devel] Xen pci-passthrough problem with pci-detach and pci-assignable-remove
Tuesday, April 1, 2014, 6:13:09 PM, you wrote:

> On Thu, Feb 20, 2014 at 05:18:46PM +0100, Sander Eikelenboom wrote:
>>
>> Thursday, February 20, 2014, 9:53:59 AM, you wrote:
>>
>> > Friday, January 24, 2014, 6:48:06 PM, you wrote:
>>
>> >> On Fri, Jan 24, 2014 at 02:36:02PM +0100, Sander Eikelenboom wrote:
>> >>>
>> >>> Friday, January 10, 2014, 6:38:10 PM, you wrote:
>> >>>
>> >>> >> > Wow. You just walked in a pile of bugs didn't you? And on Friday
>> >>> >> > nonetheless.
>> >>> >>
>> >>> >> As usual ;-)
>> >>>
>> >>> > Ha!
>> >>> > ..snip..
>> >>> >> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45
>> >>> >> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9
>> >>> >> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5
>> >>> >> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25
>> >>> >> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e
>> >>> >>
>> >>> >> > Yeah, that bug my RFC patchset (the one that does the slot/bus
>> >>> >> > reset) should also fix.
>> >>> >> > I totally forgot about it !
>> >>> >>
>> >>> >> Got a link to that patchset ?
>> >>>
>> >>> > https://lkml.org/lkml/2013/12/13/315
>> >>>
>> >>> >> I at least could give it a spin .. you never know when fortune is on
>> >>> >> your side :-)
>> >>>
>> >>> > It is also at this git tree:
>> >>>
>> >>> > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the
>> >>> > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely
>> >>> > want to merge it in your current Linus tree.
>> >>>
>> >>> > Thank you!
>> >>>
>> >>> Hi Konrad,
>> >>>
>> >>> Just got time to test this some more, when merging this branch *except*
>> >>> the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd)
>> >>> seems to help with my problem, i'm now capable of using:
>> >>> - xl pci-detach
>> >>> - xl pci-assignable-remove
>> >>> - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind
>> >>>
>> >>> to remove a pci device from a running HVM guest and rebinding it to a
>> >>> driver in dom0 without those nasty stacktraces :-)
>> >>> So the first 4 seem to be an improvement.
>> >>>
>> >>> That last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to
>> >>> give troubles of its own.
>>
>> >> Could you email me your lspci output and also which devices you
>> >> move/switch etc?
>>
>> > Hi Konrad,
>>
>> > At the moment i found some time to figure out what goes wrong with the xl
>> > pci-detach and xl pci-assignable-remove, i have been
>> > able to narrow it down a bit:
>>
>> > The problem only occurs when you:
>> > - passthrough 2 (or more?) pci devices assigned to a guest ..
>> > - and only remove 1 of those devices with "xl pci-detach" followed by a
>> >   "xl pci-assignable-remove"
>> > - when you first detach both devices with "xl pci-detach" before doing the
>> >   "xl pci-assignable-remove" it works ok.
>>
>> > In my case i'm passing through 2 devices (02:00.0 and 00:19.0)
>>
>> > I added some printk's and what i found out is that:
>> > - after doing the pci-detach of 02:00.0, it doesn't call
>> >   pcistub_put_pci_dev for that device ...
>> > - but when i subsequently pci-detach the second (and last) device 00:19.0
>> >   .. it does call it for both 02:00.0 and 00:19.0 ...
>> > - so somehow that call for the first detached device gets deferred .. but
>> >   since they are different devices and not functions of the same device i
>> >   don't see any reason for it to wait until all other devices would have
>> >   been detached ...
>>
>> > I tried to capture the console output but somehow that didn't work out,
>> > so i attached a screenshot of what happens when:
>> > - doing a xl pci-list for the guest
>> > - doing a xl pci-assignable-list
>>
>> > - doing the xl pci-detach for 02:00.0
>>
>> > - doing a xl pci-list for the guest
>> > - doing a xl pci-assignable-list
>>
>> > - waiting some time ...
>>
>> > - doing the xl pci-detach for 00:19.0
>>
>> > - doing a xl pci-list for the guest
>> > - doing a xl pci-assignable-list
>>
>> > There you can see this strange sequence of events :-)
>>
>> > But i haven't been able to spot the culprit
>>
>> Enabled some extra debugging and added some more printk's .. (see new
>> screenshot)
>>
>> From what it seems .. the frontend state for the first device isn't changed
>> on the first pci-detach ..
>>
>> Is the signaling on pci-detach the guest's (pcifront) responsibility or the
>> toolstack's (libxl) ?

> It usually is pcifront. And in the screenshot I see:

> .. frontend is gone! unregister device

> which should trigger the process. And it does look to do that.

> Hm, I am wondering what the toolstack is waiting for.

> Time to debug.

Ok thx :-)

>>
>> > attached: screenshot.jpg

> and thanks for the screenshot (didn't have copy-n-paste option handy :-))

Well i didn't have KVM/SOL working on the intel NUC .. busy with that today ..
it's a nifty little machine .. just got to get the AMT/vPro stuff working.
So although second best, the screenshot was all i had at that moment.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
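
For reference, the ordering reported to work in this thread (pci-detach every passed-through device first, and only then pci-assignable-remove) can be written down as a small dry-run helper. This is only a sketch and not part of the original report: it prints the commands and never executes them. The function name build_detach_plan, the guest name "guest" and the dom0 driver name "e1000e" are hypothetical placeholders; the two BDFs are the ones mentioned above.

```python
from typing import Dict, List


def build_detach_plan(guest: str, bdfs: List[str],
                      dom0_driver: Dict[str, str]) -> List[str]:
    """Return the shell commands, as strings, in the order reported to work.

    Nothing is executed here; the caller decides what to do with the plan.
    """
    plan: List[str] = []

    # Step 1: detach *all* passed-through devices from the guest first.
    for bdf in bdfs:
        plan.append(f"xl pci-detach {guest} {bdf}")

    # Step 2: only then take each device out of the assignable pool and,
    # if a dom0 driver was given for it, rebind it through sysfs.
    for bdf in bdfs:
        plan.append(f"xl pci-assignable-remove {bdf}")
        driver = dom0_driver.get(bdf)
        if driver is not None:
            # Assumption: the BDF is written as given in the thread; sysfs
            # normally expects the full form with PCI domain (0000:02:00.0).
            plan.append(f'echo "{bdf}" > /sys/bus/pci/drivers/{driver}/bind')

    return plan


if __name__ == "__main__":
    # Hypothetical guest and driver names, BDFs taken from the report.
    for command in build_detach_plan("guest", ["02:00.0", "00:19.0"],
                                     {"00:19.0": "e1000e"}):
        print(command)
```

Keeping the two phases in separate loops makes the difference from the failing case explicit: in the failing case a pci-assignable-remove is issued while another device is still attached to the guest.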