
Re: [Xen-devel] cpuidle and un-eoid interrupts at the local apic



Hello again,

I've disabled the internal network card and used another one, but the problem still exists. I had two crashes within 5 minutes, which is frustrating. So (assuming that disabling the internal card in the BIOS actually works) the internal NIC is not the source of the problem.

Every time the pending EOI error occurs I see the mysterious interrupt >>29<<. Only the vectors change. Below is a summary of the last 5 crashes.

My questions:
- How can I see which hardware device int 29 belongs to? I can't find int 29 in /proc/interrupts, in lspci -vv, in the kernel dmesg or in the Xen dmesg.
- Andrew, what does "domain-list=0:276" in your output mean, and why is it always 0:276 for interrupt 29? Is it the VM number?

1)
(XEN)   irq 29, vector 0x21
(XEN) IRQ: 29 affinity:4 vec:21 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:276(----),

2)
(XEN)   irq 29, vector 0x26
(XEN) IRQ: 29 affinity:8 vec:26 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:276(----),

3)
(XEN)   irq 29, vector 0x31
(XEN) IRQ: 29 affinity:2 vec:24 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:276(----),

4)
(XEN)   irq 29, vector 0x2e
(XEN) IRQ: 29 affinity:8 vec:7e type=PCI-MSI status=00000010 in-flight=0 domain-list=0:276(----),

5)
(XEN)   irq 29, vector 0x3b
(XEN) IRQ: 29 affinity:2 vec:3b type=PCI-MSI status=00000010 in-flight=0 domain-list=0:276(----),
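
To compare the five excerpts side by side, a small parser can help. The Python sketch below is only an illustration under stated assumptions: it assumes that `affinity` is a hexadecimal CPU bitmask and that each `domain-list` entry has the form `domain:pirq`, neither of which has been confirmed in this thread. The function names (`cpus_from_mask`, `parse_crash`, `summarize`, `mismatched`) are made up for this example.

```python
import re

# The five excerpts above, copied verbatim as
# (pending-EOI line, IRQ dump line) pairs.
TAIL = " type=PCI-MSI status=00000010 in-flight=0 domain-list=0:276(----),"
CRASHES = [
    ("(XEN)   irq 29, vector 0x21", "(XEN) IRQ: 29 affinity:4 vec:21" + TAIL),
    ("(XEN)   irq 29, vector 0x26", "(XEN) IRQ: 29 affinity:8 vec:26" + TAIL),
    ("(XEN)   irq 29, vector 0x31", "(XEN) IRQ: 29 affinity:2 vec:24" + TAIL),
    ("(XEN)   irq 29, vector 0x2e", "(XEN) IRQ: 29 affinity:8 vec:7e" + TAIL),
    ("(XEN)   irq 29, vector 0x3b", "(XEN) IRQ: 29 affinity:2 vec:3b" + TAIL),
]

PENDING_RE = re.compile(r"irq\s+(\d+), vector 0x([0-9a-fA-F]+)")
DUMP_RE = re.compile(
    r"IRQ:\s*(\d+) affinity:([0-9a-fA-F]+) vec:([0-9a-fA-F]+) "
    r".*domain-list=(\d+):\s*(\d+)"
)


def cpus_from_mask(mask_hex):
    """Return the CPU numbers set in a hex bitmask (ASSUMED meaning of 'affinity')."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def parse_crash(pending_line, dump_line):
    """Extract the fields of one excerpt into a dict."""
    pending = PENDING_RE.search(pending_line)
    dump = DUMP_RE.search(dump_line)
    if pending is None or dump is None:
        raise ValueError("unexpected log line format")
    return {
        "irq": int(dump.group(1)),
        "pending_vector": int(pending.group(2), 16),
        "dump_vector": int(dump.group(3), 16),
        "cpus": cpus_from_mask(dump.group(2)),
        # ASSUMPTION: domain-list entries are "domain:pirq".
        "domain": int(dump.group(4)),
        "pirq": int(dump.group(5)),
    }


def summarize(crashes):
    """Parse every (pending line, dump line) pair."""
    return [parse_crash(pending, dump) for pending, dump in crashes]


def mismatched(rows):
    """1-based numbers of excerpts whose two lines report different vectors."""
    return [n for n, row in enumerate(rows, 1)
            if row["pending_vector"] != row["dump_vector"]]


if __name__ == "__main__":
    rows = summarize(CRASHES)
    for n, row in enumerate(rows, 1):
        print(n, "cpus", row["cpus"],
              "pending vector", hex(row["pending_vector"]),
              "dump vector", hex(row["dump_vector"]))
    print("excerpts with differing vectors:", mismatched(rows))
```

Run on the data as posted, this flags excerpts 3 and 4, where the vector in the first line (0x31, 0x2e) differs from the `vec:` field in the dump line (24, 7e). If the `domain:pirq` reading is right, 276 rather than 29 would be the number to look for in dom0's /proc/interrupts, but that remains an assumption.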



Best regards
  Thimo



On 14.08.2013 at 11:52, Andrew Cooper wrote:
On 14/08/13 03:53, Zhang, Yang Z wrote:
Andrew Cooper wrote on 2013-08-12:

On the XenServer hardware where we have seen this issue, the
problematic interrupt was from:

00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-LM (rev 02)
        Subsystem: Intel Corporation Device 0000
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 1275
        Region 0: Memory at c2700000 (32-bit, non-prefetchable) [size=128K]
        Region 1: Memory at c273e000 (32-bit, non-prefetchable) [size=4K]
        Region 2: I/O ports at 7080 [size=32]
        Capabilities: [c8] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee00318  Data: 0000
        Capabilities: [e0] PCI Advanced Features
                AFCap: TP+ FLR+
                AFCtrl: FLR-
                AFStatus: TP-
        Kernel driver in use: e1000e
        Kernel modules: e1000e

I am still attempting to reproduce the issue, but we haven't seen it
again since my email at the root of this thread.

Did you see the issue on other HSW machines without this NIC? Also, Thimo, have you tried to pin the vcpu and stop irqbalance in dom0?

We do not have any Haswell hardware without this NIC.

~Andrew





 

