Re: [Xen-devel] cpuidle and un-eoid interrupts at the local apic
Thimo E. wrote on 2013-09-05:
> Hello again,
>
> for the last two weeks there was no crash with dom0_vcpus_pin and
> dom0 restricted to 1 cpu. But yesterday it crashed again. So I changed
> the command line again to:
>
> iommu=no-intremap noirqbalance com1=115200,8n1,0xe050,0
> console=com1,vga mem=1024G dom0_max_vcpus=4 dom0_mem=752M,max:752M
> watchdog_timeout=300 lowmem_emergency_pool=1M crashkernel=64M@32M
> cpuid_mask_xsave_eax=0
>
> And today the server crashed again and produced a lot of debugging
> messages, see attached. The "..." in the logfiles means that the
> message above the dots was repeated very often.
>
> My summary so far:
> - With only 1 cpu attached to dom0 the server was stable for 2 weeks;
>   the crash there did not really show any irq problems, see
>   crash20130903.txt. You can find Andrew's ideas on this in
>   http://forums.citrix.com/thread.jspa?messageID=1760771#1760771
> - With more than 1 cpu and irqbalance the server produced the crashes
>   I've already posted before
> - Without irqbalance it crashes with some other fancy output, see
>   crash20130904.txt
>
> Next step is to change the network card.
>
> Zhang, any update from your side? Or do the others have any idea?

Our hardware guys said they are not aware of any such issue with this CPU.
We are now trying to find the same platform to reproduce it.

> Could "ioapic_ack=old" help somewhere?
>
> Best regards
> Thimo
> On 27.08.2013 03:03, Zhang, Yang Z wrote:
>> Zhang, Yang Z wrote on 2013-08-23:
>>> Thimo Eichstädt wrote on 2013-08-23:
>>>> Hello Yang,
>>>>
>>>> any update from your side? Did your expert have any idea?
>>>> Possible hardware problem?
>>> Sorry, no update on this. I am still waiting for the answer from the
>>> hardware team.
>> Hi Thimo,
>>
>> I remember that the CPU is always in an idle state when this issue
>> happens. So can you try disabling C-states in Xen to see if it helps?
>>
>>>> Best regards
>>>> Thimo
>>>> On 20.08.2013 10:50, Zhang, Yang Z wrote:
>>>>> Jan Beulich wrote on 2013-08-20:
>>>>>>>>> On 20.08.13 at 07:43, Thimo Eichstädt<thimoe@xxxxxxxxxx> wrote:
>>>>>>> (XEN) **Pending EOI error
>>>>>>> (XEN) irq 29, vector 0x21
>>>>>>> (XEN) s[0] irq 30, vec 0x31, ready 0, ISR 00000001, TMR 00000000, IRR 00000000
>>>>>>> (XEN) All LAPIC state:
>>>>>>> (XEN) [vector] ISR TMR IRR
>>>>>>> (XEN) [1f:00] 00000000 00000000 00000000
>>>>>>> (XEN) [3f:20] 00020002 00000000 00000000
>>>>>> It ought to be plain impossible to receive an interrupt at vector
>>>>>> 0x21 while the ISR bit for vector 0x31 is still set.
>>>>>>
>>>>>> Intel folks - any input on this?
>>>>> I have no idea about this. But I will forward the information to
>>>>> some experts internally for help.
>>>>>
>>>>>> Jan
>>>>> Best regards,
>>>>> Yang
>>>>>
>>>>> _______________________________________________
>>>>> Xen-devel mailing list
>>>>> Xen-devel@xxxxxxxxxxxxx
>>>>> http://lists.xen.org/xen-devel
>>>
>>> Best regards,
>>> Yang
>>
>> Best regards,
>> Yang

Best regards,
Yang

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
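
For readers following the quoted LAPIC dump: the sketch below decodes the ISR word shown for vectors [3f:20] and illustrates why Jan calls the situation impossible. It is only an illustration, not Xen code. The helper names (set_vectors, priority_class, can_dispatch) are made up for this note, and the model assumes the task priority register is 0 and considers only the in-service vectors.

```python
def set_vectors(base_vector, word):
    """Return the vectors whose bits are set in one 32-bit LAPIC register word.

    base_vector is the vector number of bit 0 (0x20 for the [3f:20] row).
    """
    return [base_vector + bit for bit in range(32) if word & (1 << bit)]


def priority_class(vector):
    """Priority class of a vector: its upper four bits."""
    return vector >> 4


def can_dispatch(vector, in_service):
    """Simplified model (assumes TPR = 0): a new vector is dispatched only if
    its priority class is strictly higher than that of every vector whose
    ISR bit is still set."""
    return all(priority_class(vector) > priority_class(v) for v in in_service)


# ISR value from the dump line "(XEN) [3f:20] 00020002 ..."
in_service = set_vectors(0x20, 0x00020002)
print([hex(v) for v in in_service])

# With 0x31 (class 3) still in service, 0x21 (class 2) should be held back.
print(can_dispatch(0x21, [0x31]))
```

Bits 1 and 17 of the word are set, which corresponds to vectors 0x21 and 0x31 both being in service. Under the simplified model, vector 0x21 has the lower priority class and should not have been delivered while 0x31 was awaiting its EOI, which matches the observation in the thread.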