Re: [Xen-devel] cpuidle and un-eoid interrupts at the local apic
Hello again,

for the last two weeks there was no crash with pinning (dom0_vcpus_pin) and dom0 restricted to 1 CPU. But yesterday it crashed again. So I changed the command line again to:

iommu=no-intremap noirqbalance com1=115200,8n1,0xe050,0 console=com1,vga mem=1024G dom0_max_vcpus=4 dom0_mem=752M,max:752M watchdog_timeout=300 lowmem_emergency_pool=1M crashkernel=64M@32M cpuid_mask_xsave_eax=0

And today the server crashed again and produced a lot of debugging messages, see attached. A "..." in the logfiles means that the message above it was repeated very often.

My summary so far:
- With only 1 CPU attached to dom0 the server was stable for 2 weeks; the crash there did not really show any IRQ problems, see crash20130903.txt. You can find Andrew's ideas on this in http://forums.citrix.com/thread.jspa?messageID=1760771#1760771
- With more than 1 CPU and irqbalance the server produced the crashes I've already posted before
- Without irqbalance it crashes with some other fancy output, see crash20130904.txt

Next step is to change the network card.

Zhang, any update from your side? Or do the others have any idea? Could "ioapic_ack=old" help somewhere?

Best regards
Thimo

On 27.08.2013 03:03, Zhang, Yang Z wrote:
> Zhang, Yang Z wrote on 2013-08-23:
>> Thimo Eichstädt wrote on 2013-08-23:
>>> Hello Yang,
>>>
>>> any update from your side? Did your expert have any idea? Possible hardware problem?
>> Sorry, no update on this. I am still waiting for the answer from the hardware team.
> Hi Thimo,
> I remember that the CPU is always in an idle state when this issue happens. So can you try disabling the C states in Xen to see if it helps?
>>> Best regards
>>> Thimo
>>>
>>> On 20.08.2013 10:50, Zhang, Yang Z wrote:
>>>> Jan Beulich wrote on 2013-08-20:
>>>>> On 20.08.13 at 07:43, Thimo Eichstädt<thimoe@xxxxxxxxxx> wrote:
>>>>>> (XEN) **Pending EOI error
>>>>>> (XEN) irq 29, vector 0x21
>>>>>> (XEN) s[0] irq 30, vec 0x31, ready 0, ISR 00000001, TMR 00000000, IRR 00000000
>>>>>> (XEN) All LAPIC state:
>>>>>> (XEN) [vector] ISR TMR IRR
>>>>>> (XEN) [1f:00] 00000000 00000000 00000000
>>>>>> (XEN) [3f:20] 00020002 00000000 00000000
>>>>> It ought to be plain impossible to receive an interrupt at vector 0x21 while the ISR bit for vector 0x31 is still set. Intel folks - any input on this?
>>>> I have no idea about this. But I will forward the information to some experts internally for help.
>>>>> Jan
>>>> Best regards,
>>>> Yang
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@xxxxxxxxxxxxx
>>>> http://lists.xen.org/xen-devel
>> Best regards,
>> Yang
> Best regards,
> Yang

Attachment: crash20130904.txt
Attachment: crash20130903.txt
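To make Jan's remark about the LAPIC dump concrete, here is a minimal Python sketch that decodes the `[3f:20] 00020002` ISR row and applies a simplified priority check. The helper names (`set_vectors`, `priority_class`, `can_be_dispatched`) are made up for this illustration and are not Xen functions. It assumes that bit n of a row labelled `[hi:lo]` stands for vector lo + n, and it ignores the task priority register. The dump by itself only shows that both bits are set; the check below models the order Jan describes, where 0x31 is already in service when 0x21 arrives.

```python
# Illustrative sketch only; these helpers are hypothetical, not Xen code.

def set_vectors(base, word):
    """Return the vectors whose bits are set in one 32-bit LAPIC register row.

    Assumption: bit n of the row starting at vector `base` means vector base + n.
    """
    return [base + bit for bit in range(32) if word & (1 << bit)]


def priority_class(vector):
    """The priority class of a vector is its upper four bits."""
    return vector >> 4


def can_be_dispatched(vector, in_service):
    """Simplified rule (TPR ignored): a new interrupt is dispatched only if its
    priority class is higher than that of the highest vector already in service."""
    if not in_service:
        return True
    return priority_class(vector) > priority_class(max(in_service))


# ISR row from the log: (XEN) [3f:20] 00020002
isr = set_vectors(0x20, 0x00020002)
print([hex(v) for v in isr])  # ['0x21', '0x31']

# With 0x31 (class 3) in service, 0x21 (class 2) should not be dispatched.
print(can_be_dispatched(0x21, [0x31]))  # False
```

Under this model the reverse nesting would be legal (0x31 may preempt an in-service 0x21), which is why the order reported in the "Pending EOI error" message matters.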