Re: [Xen-devel] [BUG] pci-passthrough on dom0 kernel versions above 3.8 crashes dom0
Let me know if I can do anything to assist.
Thanks
On 04/10/13 11:05, Jan Beulich wrote:
On 04.10.13 at 09:44, Kristoffer Egefelt <kristoffer@xxxxxxx> wrote:
Hi,
I'm trying to pass through a NIC (Intel X520 with the ixgbevf driver) to a domU, but this has not worked with any kernel newer than 3.8.
The dom0 kernel seems to cause the problem. The Xen version, domU kernel version and driver version seem to be unrelated to this bug: passthrough works as long as the dom0 kernel is 3.8. I tried kernel versions 3.9, 3.10 and 3.11, and all show the same bug pattern when used as dom0.
The BUG appears in the domU on xl pci-attach. On xl pci-detach, dom0 panics.
I have attached logs from a working setup (kernel 3.8) and from a non-working setup (kernel 3.11), and also the kernel config for 3.11.
In short, this is what domU logs after pci attach:
BUG: unable to handle kernel paging request at ffffc9000030200c
IP: [<ffffffff81205812>] __msix_mask_irq+0x21/0x24
PGD 75a40067 PUD 75a41067 PMD 75b44067 PTE 8010000000000464
Oops: 0003 [#1] SMP
Modules linked in: ixgbevf(+) xen_pcifront nfnetlink_log nfnetlink ipt_ULOG x_tables x86_pkg_temp_thermal thermal_sys coretemp crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper microcode ext4 crc16 jbd2 mbcache xen_blkfront
CPU: 0 PID: 2122 Comm: modprobe Not tainted 3.11.3-kernel-v1.0.0.21+ #1
Are you certain this is kernel (rather than hypervisor) version dependent? Iirc this is a manifestation of a guest kernel not being permitted to write to the MSI-X mask bit.
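As a cross-check on the oops above, the sketch below decodes the page-fault error code and the low bits of the faulting address. The error-code bits are the architectural x86 definitions and the MSI-X entry layout (16-byte entries, Vector Control at offset 12, mask bit 0) is from the PCI specification. The helper names are hypothetical, and the assumption that the MSI-X table starts at a page boundary in the mapping is mine, not something the log states.

```python
# x86 page-fault error code bits (architectural definitions).
PF_PROT = 1 << 0   # set: protection violation on a present page
PF_WRITE = 1 << 1  # set: the faulting access was a write
PF_USER = 1 << 2   # set: the fault was taken in user mode

# MSI-X table entry layout per the PCI specification.
MSIX_ENTRY_SIZE = 16
MSIX_ENTRY_VECTOR_CTRL = 12     # byte offset of Vector Control in an entry
MSIX_ENTRY_CTRL_MASKBIT = 0x1   # bit 0 of Vector Control is the mask bit


def decode_fault(error_code):
    """Hypothetical helper: split an Oops error code into its flags."""
    return {
        "protection": bool(error_code & PF_PROT),
        "write": bool(error_code & PF_WRITE),
        "user": bool(error_code & PF_USER),
    }


def msix_field(fault_addr, page_size=4096):
    """Hypothetical helper: (entry index, byte offset within the entry).

    ASSUMPTION: the MSI-X table begins at the start of the mapped page.
    """
    offset = fault_addr % page_size
    return offset // MSIX_ENTRY_SIZE, offset % MSIX_ENTRY_SIZE


if __name__ == "__main__":
    print(decode_fault(0x0003))
    print(msix_field(0xFFFFC9000030200C))
```

Under that assumption, "Oops: 0003" reads as a kernel-mode write that hit a protection violation, and the address falls on the Vector Control word of entry 0, which is consistent with a refused write to the MSI-X mask bit.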
And this is dom0 on pci detach:
(XEN) Assertion '_raw_spin_is_locked(lock)' failed at /usr/src/xen/xen/include/asm/spinlock.h:16402
(XEN) ----[ Xen-4.4-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 1
(XEN) RIP: e008:[<ffff82d0801258ef>] _spin_unlock_irqrestore+0x27/0x32
(XEN) RFLAGS: 0000000000010202 CONTEXT: hypervisor
(XEN) rax: 0000000000000001 rbx: ffff83201ba07724 rcx: 0000000000000001
(XEN) rdx: ffff83201bb97020 rsi: 0000000000000286 rdi: ffff83201ba07724
(XEN) rbp: ffff83203ffcfdd8 rsp: ffff83203ffcfdd8 r8: ffff8141002000e0
(XEN) r9: 000000000000001c r10: 0000000000000082 r11: 0000000000000001
(XEN) r12: 0000000000000000 r13: ffff8320e13c8240 r14: ffff880148047df4
(XEN) r15: 0000000000000286 cr0: 0000000080050033 cr4: 00000000000426f0
(XEN) cr3: 000000206f3ff000 cr2: 00007fa5ec560c49
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff83203ffcfdd8:
(XEN) ffff83203ffcfe68 ffff82d080166a4f ffff83203ffcfe18 0000000280118988
(XEN) 0000000000000cfe 0000000000000cfe ffff832015d3b8a0 ffff8320e13c83f0
(XEN) ffff832015d3b880 0000000000000001 00000000fee00678 0000000000000000
(XEN) ffff83200000f800 000000000000001b ffff8300bcef5000 ffffffffffffffed
(XEN) ffff880148047df4 ffffffff814530e0 ffff83203ffcfef8 ffff82d08017dee4
(XEN) ffff832000000002 0000000000000008 ffff83203ffcfef8 ffff82d000a0fb00
(XEN) 0000000000000000 ffffffff93010000 ffff82d0802e8000 ffff83203ffc80ef
(XEN) 82d080222c00b948 c390ef66d1ffffff ffff83203ffcfef8 ffff8300bcef5000
(XEN) ffff880145951868 ffff880145bb2a60 ffff880148047f50 ffffffff814530e0
(XEN) 00007cdfc00300c7 ffff82d08022213b ffffffff8100142a 0000000000000021
(XEN) ffffffff814530e0 ffff880148047f50 000000000000c002 0000000000009300
(XEN) ffff88013faf1a80 ffff880145951000 0000000000000202 0000000000000093
(XEN) ffff880148047df4 0000000000000002 0000000000000021 ffffffff8100142a
(XEN) 0000000000000000 ffff880148047df4 000000000000001b 0001010000000000
(XEN) ffffffff8100142a 000000000000e033 0000000000000202 ffff880148047dc8
(XEN) 000000000000e02b 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000001 ffff8300bcef5000 0000004f9b885e00
(XEN) 0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff82d0801258ef>] _spin_unlock_irqrestore+0x27/0x32
(XEN) [<ffff82d080166a4f>] pci_restore_msi_state+0x1c9/0x2f0
(XEN) [<ffff82d08017dee4>] do_physdev_op+0xe4f/0x114f
(XEN) [<ffff82d08022213b>] syscall_enter+0xeb/0x145
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) Assertion '_raw_spin_is_locked(lock)' failed at /usr/src/xen/xen/include/asm/spinlock.h:16402
(XEN) ****************************************
(XEN)
(XEN) Manual reset required ('noreboot' specified)
This, otoh, is clearly a hypervisor bug. Afaict the patch below should help.
But - this code is supposed to be executed on host S3 resume only (i.e. there might also be some kernel flaw involved here).
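For readers unfamiliar with the assertion in the trace: a debug build checks on unlock that the lock is actually held, so an unlock on a path that never took the lock trips the check. The toy model below illustrates that class of bug only. It is not Xen's code, it makes no claim about where the fault in pci_restore_msi_state() actually lies, and every name in it (DebugSpinLock, restore_state_buggy, entry_valid) is hypothetical.

```python
class DebugSpinLock:
    """Toy single-threaded stand-in for a debug-build spinlock."""

    def __init__(self):
        self._locked = False

    def lock(self):
        if self._locked:
            raise AssertionError("lock already held")
        self._locked = True

    def unlock(self):
        # Mirrors the kind of check seen in the trace: unlocking a lock
        # that is not held is a programming error.
        if not self._locked:
            raise AssertionError("Assertion 'is_locked(lock)' failed")
        self._locked = False


def restore_state_buggy(lock, entry_valid):
    """Hypothetical restore path with unbalanced locking."""
    if entry_valid:
        lock.lock()
    # BUG (illustrative): the unlock runs even when the lock was never taken.
    lock.unlock()


def restore_state_fixed(lock, entry_valid):
    """Same path with the unlock paired to the lock."""
    if not entry_valid:
        return
    lock.lock()
    lock.unlock()


if __name__ == "__main__":
    restore_state_fixed(DebugSpinLock(), entry_valid=False)  # returns quietly
    try:
        restore_state_buggy(DebugSpinLock(), entry_valid=False)
    except AssertionError as exc:
        print("debug check fired:", exc)
```

In a non-debug build the same unbalanced unlock would go unnoticed at that point, which is why the problem only shows up as a panic with debug=y.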
It's called from pci_restore_state() which is called from pciback when a device is released. This doesn't seem unreasonable to me.
David
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel