Re: [Xen-devel] Issue with pv_ops Kernel 2.6.31.6 and Xen
Sorry, this is a duplicate of
http://lists.xensource.com/archives/html/xen-devel/2010-01/msg00855.html

I thought that mail had not reached the mailing list, so I reposted it...

Marcial Rion wrote:
> Hi
>
> First of all I have to state that I am neither a Kernel nor a Xen
> developer. Nevertheless, while trying to use Kernel 2.6.31.6 from
> git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git as a Dom0
> Kernel, I discovered an issue and, after searching the Internet for a
> long time, I probably also found the cause. However, I won't be able to
> fix it by myself :-(, so I am trying to share my knowledge with this
> list, in the hope that the issue might get fixed sometime :-)...
> I will try to give you all information that seems relevant to me;
> however, if it turns out I failed to give enough details about my system
> (configuration), log files or anything else, I will be glad to provide
> this information. Furthermore, I would also be happy to support
> "testing" of potential patches if this is required. I post to this list
> as this has been suggested at
> http://wiki.xensource.com/xenwiki/XenParavirtOps (bottom of page). If I
> am wrong, please give me a short hint so I won't bother you any longer...
>
> Now, let's get into it...
>
> About my system:
> I am running Gentoo (10.0, server profile) on an Asus P2B-D motherboard
> (PIIX4 chipset) with two PIII 500 MHz CPUs and 1G of RAM. The system
> also has 3 PCI network interfaces with the Realtek RTL8139 chip
> (8139too Kernel driver). The network interface to be used is eth0 (I
> already tried whether using another interface as eth0 would change
> anything - without success :-( ).
>
> The issue I have:
> While Xen pv_ops Kernel 2.6.31.6 runs perfectly on bare metal, it fails
> to get network connectivity when run on top of Xen 3.4.1 (Gentoo default
> installation).
> Though the system seems to come up correctly at first sight and the
> network interface is available (I can ping it locally), access to the
> network fails (I cannot ping other systems on the network, nor vice
> versa).
>
> What I discovered so far:
> Consulting the boot messages within "dmesg", I discovered that the
> ACPI SCI (System Control Interrupt) handler cannot be installed when the
> Kernel runs on top of Xen, while this error does not occur on bare metal.
>
> With XEN:
> *********
> bio: create slab <bio-0> at 0
> ACPI: SCI (IRQ20) allocation failed
> ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler 20090521 evevent-161
> ACPI: Unable to start the ACPI Interpreter
> ------------[ cut here ]------------
> WARNING: at lib/kobject.c:595 kobject_put+0x27/0x3c()
> Hardware name: System Name
> kobject: '<NULL>' (cf805ea0): is not initialized, yet kobject_put() is being called.
> Modules linked in:
> Pid: 1, comm: swapper Tainted: G W 2.6.31.6 #14
> Call Trace:
> [<c043a2db>] warn_slowpath_common+0x60/0x90
> [<c043a33f>] warn_slowpath_fmt+0x24/0x27
> [<c05588cb>] kobject_put+0x27/0x3c
> [<c049e502>] kmem_cache_destroy+0x105/0x11b
> [<c058adc8>] acpi_os_delete_cache+0x8/0xc
> [<c05a6fe6>] acpi_ut_delete_caches+0xd/0x6b
> [<c05a77f7>] acpi_ut_subsystem_shutdown+0x87/0x90
> [<c0904837>] ? acpi_init+0x0/0x263
> [<c05a8067>] acpi_terminate+0x8/0x14
> [<c09049cb>] acpi_init+0x194/0x263
> [<c05f0e66>] ? __class_create+0x44/0x5e
> [<c09021c5>] ? fbmem_init+0x0/0x78
> [<c0904837>] ? acpi_init+0x0/0x263
> [<c0403051>] do_one_initcall+0x4c/0x13a
> [<c08e030d>] kernel_init+0x12c/0x17d
> [<c08e01e1>] ? kernel_init+0x0/0x17d
> [<c040ad17>] kernel_thread_helper+0x7/0x10
> ---[ end trace 4eaa2a86a8e2da23 ]---
> ------------[ cut here ]------------
> WARNING: at lib/kobject.c:595 kobject_put+0x27/0x3c()
> Hardware name: System Name
> kobject: '<NULL>' (cf805f60): is not initialized, yet kobject_put() is being called.
> Modules linked in:
> Pid: 1, comm: swapper Tainted: G W 2.6.31.6 #14
> Call Trace:
> [<c043a2db>] warn_slowpath_common+0x60/0x90
> [<c043a33f>] warn_slowpath_fmt+0x24/0x27
> [<c05588cb>] kobject_put+0x27/0x3c
> [<c049e502>] kmem_cache_destroy+0x105/0x11b
> [<c058adc8>] acpi_os_delete_cache+0x8/0xc
> [<c05a700e>] acpi_ut_delete_caches+0x35/0x6b
> [<c05a77f7>] acpi_ut_subsystem_shutdown+0x87/0x90
> [<c0904837>] ? acpi_init+0x0/0x263
> [<c05a8067>] acpi_terminate+0x8/0x14
> [<c09049cb>] acpi_init+0x194/0x263
> [<c05f0e66>] ? __class_create+0x44/0x5e
> [<c09021c5>] ? fbmem_init+0x0/0x78
> [<c0904837>] ? acpi_init+0x0/0x263
> [<c0403051>] do_one_initcall+0x4c/0x13a
> [<c08e030d>] kernel_init+0x12c/0x17d
> [<c08e01e1>] ? kernel_init+0x0/0x17d
> [<c040ad17>] kernel_thread_helper+0x7/0x10
> ---[ end trace 4eaa2a86a8e2da24 ]---
> sync cpu 0 get result ffffffff max_id 0
> Failed to sync pcpu 0
> xenbus_probe_backend_init bus registered ok
>
>
> Without Xen:
> ************
> bio: create slab <bio-0> at 0
> ACPI: EC: Look up EC in DSDT
> ACPI: Interpreter enabled
> ACPI: (supports S0 S5)
> ACPI: Using IOAPIC for interrupt routing
> ACPI: No dock devices found.
> ACPI: PCI Root Bridge [PCI0] (0000:00)
> pci 0000:00:00.0: reg 10 32bit mmio: [0xf8000000-0xfbffffff]
> pci 0000:00:04.1: reg 20 io port: [0xb800-0xb80f]
> pci 0000:00:04.2: reg 20 io port: [0xb400-0xb41f]
> * Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
> * this clock source is slow. Consider trying other clock sources
> pci 0000:00:04.3: quirk: region e400-e43f claimed by PIIX4 ACPI
> pci 0000:00:04.3: quirk: region e800-e80f claimed by PIIX4 SMB
> pci 0000:00:04.3: PIIX4 devres B PIO at 0290-0297
> pci 0000:00:09.0: reg 10 io port: [0xb000-0xb0ff]
> pci 0000:00:09.0: reg 14 32bit mmio: [0xde800000-0xde8000ff]
> pci 0000:00:09.0: reg 30 32bit mmio: [0x000000-0x00ffff]
> pci 0000:00:0a.0: reg 10 io port: [0xa800-0xa8ff]
> pci 0000:00:0a.0: reg 14 32bit mmio: [0xde000000-0xde0000ff]
> pci 0000:00:0a.0: supports D1 D2
> pci 0000:00:0a.0: PME# supported from D1 D2 D3hot
> pci 0000:00:0a.0: PME# disabled
> pci 0000:00:0b.0: reg 10 io port: [0xa400-0xa4ff]
> pci 0000:00:0b.0: reg 14 32bit mmio: [0xdd800000-0xdd8000ff]
> pci 0000:00:0b.0: supports D1 D2
> pci 0000:00:0b.0: PME# supported from D1 D2 D3hot
> pci 0000:00:0b.0: PME# disabled
> pci 0000:01:00.0: reg 10 32bit mmio: [0xe0000000-0xe3ffffff]
> pci 0000:01:00.0: reg 14 32bit mmio: [0xdf800000-0xdf87ffff]
> pci 0000:01:00.0: reg 18 io port: [0xd800-0xd8ff]
> pci 0000:01:00.0: reg 30 32bit mmio: [0xdf7e0000-0xdf7fffff]
> pci 0000:01:00.0: supports D1 D2
> pci 0000:00:01.0: bridge io port: [0xd000-0xdfff]
> pci 0000:00:01.0: bridge 32bit mmio: [0xf4000000-0xf40fffff]
> pci 0000:00:01.0: bridge 32bit mmio pref: [0xdf700000-0xe3ffffff]
> pci_bus 0000:00: on NUMA node 0
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
> ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
> ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 *12 14 15)
> ACPI: PCI Interrupt Link [LNKD] (IRQs 3 *4 5 6 7 9 10 11 12 14 15)
> xenbus_probe_backend_init bus registered ok
>
>
> Corresponding to this error, the /proc/interrupts tables also differed:
>
> With XEN:
> *********
>            CPU0       CPU1
>   1:        426          0   xen-pirq-ioapic-edge  i8042
>   3:          0          0   xen-pirq-ioapic-edge  uhci_hcd:usb1
>   4:          2          0   xen-pirq-ioapic-edge  serial
>   8:          2          0   xen-pirq-ioapic-edge  rtc0
>  12:          0          0   xen-pirq-ioapic-edge  eth0
>  14:       4319          0   xen-pirq-ioapic-edge  ide0
>  15:         42          0   xen-pirq-ioapic-edge  ide1
> 411:          0          0   xen-dyn-event         xenbus
> 412:          0        703   xen-dyn-ipi           callfuncsingle1
> 413:          0          0   xen-dyn-virq          debug1
> 414:          0          0   xen-dyn-ipi           callfunc1
> 415:          0      45622   xen-dyn-ipi           resched1
> 416:          0        311   xen-dyn-ipi           spinlock1
> 417:          0     153289   xen-dyn-virq          timer1
> 418:        550          0   xen-dyn-ipi           callfuncsingle0
> 419:          0          0   xen-dyn-virq          debug0
> 420:          0          0   xen-dyn-ipi           callfunc0
> 421:      18071          0   xen-dyn-ipi           resched0
> 422:        661          0   xen-dyn-ipi           spinlock0
> 423:     277476          0   xen-dyn-virq          timer0
> NMI:          0          0   Non-maskable interrupts
> LOC:          0          0   Local timer interrupts
> SPU:          0          0   Spurious interrupts
> CNT:          0          0   Performance counter interrupts
> PND:          0          0   Performance pending work
> RES:      18071      45622   Rescheduling interrupts
> CAL:        550        703   Function call interrupts
> TLB:          0          0   TLB shootdowns
> TRM:          0          0   Thermal event interrupts
> THR:          0          0   Threshold APIC interrupts
> MCE:          0          0   Machine check exceptions
> MCP:        132        132   Machine check polls
> ERR:          0
> MIS:          0
>
>
> Without XEN:
> ************
>            CPU0       CPU1
>   0:         46          0   IO-APIC-edge      timer
>   1:       2567       4239   IO-APIC-edge      i8042
>   6:          3          0   IO-APIC-edge      floppy
>   8:          1          1   IO-APIC-edge      rtc0
>  14:      28604      27089   IO-APIC-edge      ide0
>  15:          0          0   IO-APIC-edge      ide1
>  18:       1942       1978   IO-APIC-fasteoi   eth0
>  20:          0          0   IO-APIC-fasteoi   acpi
> NMI:          0          0   Non-maskable interrupts
> LOC:    1097380    1052641   Local timer interrupts
> SPU:          0          0   Spurious interrupts
> CNT:          0          0   Performance counter interrupts
> PND:          0          0   Performance pending work
> RES:     105211     107135   Rescheduling interrupts
> CAL:         16         20   Function call interrupts
> TLB:       4542       4509   TLB shootdowns
> TRM:          0          0   Thermal event interrupts
> THR:          0          0   Threshold APIC interrupts
> MCE:          0          0   Machine check exceptions
> MCP:        289        289   Machine check polls
> ERR:          0
> MIS:          0
>
>
> Searching the Internet, I came across several messages (e.g.
> http://www.mail-archive.com/kvm@xxxxxxxxxxxxxxx/msg26601.html)
> mentioning that on motherboards with the PIIX4 chipset the SCI interrupt
> is hardwired to IRQ 9. However, on my system it is assigned IRQ 20 on
> bare metal, and the attempt to assign IRQ 20 fails on top of Xen (see the
> dmesg extract above for Xen -> ACPI: SCI (IRQ20) allocation failed).
>
> As I started wondering whether it would work with IRQ 9, and having no
> knowledge of ACPI and interrupt handling in the Kernel, I crudely patched
> the code of <Kernel-DIR>/drivers/acpi/osl.c in the following manner:
>
> osl.c:391
> *********
> acpi_status
> acpi_os_install_interrupt_handler(u32 gsi, acpi_osd_handler handler,
>                                   void *context)
> {
>         unsigned int irq;
>
>         acpi_irq_stats_init();
>
>         /*
>          * Ignore the GSI from the core, and use the value in our copy of the
>          * FADT. It may not be the same if an interrupt source override exists
>          * for the SCI.
>          */
>         gsi = acpi_gbl_FADT.sci_interrupt;
>         if (acpi_gsi_to_irq(gsi, &irq) < 0) {
>                 printk(KERN_ERR PREFIX "SCI (ACPI GSI %d) not registered\n",
>                        gsi);
>                 return AE_OK;
>         }
> +       irq = 9;        /* test hack: force the SCI onto IRQ 9 */
>         acpi_irq_handler = handler;
>         acpi_irq_context = context;
>         if (request_irq(irq, acpi_irq, IRQF_SHARED, "acpi", acpi_irq)) {
>                 printk(KERN_ERR PREFIX "SCI (IRQ%d) allocation failed\n", irq);
>                 return AE_NOT_ACQUIRED;
>         }
>         acpi_irq_irq = irq;
>
>         return AE_OK;
> }
>
>
> As you can see, I just "overwrote" the IRQ number determined by the
> system with IRQ 9, recompiled the Kernel and discovered(!) that
> networking was now working, even within Xen (btw: it was still working
> on bare metal).
>
> Now I don't know why it works with SCI mapped to IRQ 20 on bare metal
> while SCI is supposed to be hardwired to IRQ 9, but the fact that it
> works in both cases with IRQ 9 suggests to me there is something "wrong"
> or at least different when pv_ops Kernel 2.6.31.6 is run on top of Xen.
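A minimal sketch for anyone who wants to repeat the /proc/interrupts comparison above without reading the tables by eye. The helper name irq_for_device and its interface are hypothetical, made up for this illustration; it is plain userspace C, not kernel code. It scans /proc/interrupts-style text (unquoted, as printed by the kernel) and returns the IRQ number of the first numbered line that mentions a device name, or -1 if there is none. Matching is a simple substring test, so treat it as a quick check only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only, not kernel code).
 * Scans /proc/interrupts-style text and returns the IRQ number of the
 * first numbered line whose tail contains `dev`, or -1 if none does.
 * Summary lines such as "NMI:" or "LOC:" are skipped because their
 * label is not a number. Matching is a plain substring test.
 */
int irq_for_device(const char *text, const char *dev)
{
    const char *line = text;

    assert(text != NULL && dev != NULL);

    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);
        char buf[256];
        size_t n = len < sizeof(buf) ? len : sizeof(buf) - 1;
        char *start, *colon, *num_end;
        long irq;

        /* Work on a NUL-terminated copy of the current line. */
        memcpy(buf, line, n);
        buf[n] = '\0';

        /* Skip the leading padding in front of the IRQ label. */
        start = buf;
        while (isspace((unsigned char)*start))
            start++;

        colon = strchr(start, ':');
        if (colon != NULL) {
            irq = strtol(start, &num_end, 10);
            /* Accept only labels that are entirely numeric, e.g. " 12:". */
            if (num_end != start && num_end == colon &&
                strstr(colon + 1, dev) != NULL)
                return (int)irq;
        }

        /* Advance to the next line, or to the end of the text. */
        line = eol ? eol + 1 : line + len;
    }
    return -1;
}
```

Fed the two listings quoted above, this returns 12 for "eth0" under Xen and 18 on bare metal, and -1 versus 20 for "acpi", which fits the "SCI (IRQ20) allocation failed" message in the Xen dmesg output.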
> So perhaps someone could have a look at it at some point, because that's
> where my knowledge stops...
>
> Thanks & regards,
> Marcial
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel