RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
Xiantao,

I'm sorry, I forgot to mention that I did apply your two patches, but they didn't have any effect (interrupts are still lost after changing smp_affinity, and I still get the "No handler for irq vector" message). I added a dprintk in msi_set_mask_bit() and realized that MSI does not have a mask bit (MSIX does). My PCI device uses MSI, not MSIX. I placed my dprintk inside the condition below and it never triggered.

    switch (entry->msi_attrib.type) {
    case PCI_CAP_ID_MSI:
        if (entry->msi_attrib.maskbit) {

While debugging this problem, I thought about the potential problem of an interrupt firing between the writes for the MSI message address and the MSI message data. I noticed that pci_conf_write() uses spin_lock_irqsave() to disable interrupts before issuing the "out" instruction, but the writes for the address and data are two separate pci_conf_write() calls. To me, it would be safer to write the address and data in a single call, preceded by spin_lock_irqsave(). This way, when interrupts are enabled again, the address and data have both been updated.

Dante

-----Original Message-----
From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
Sent: Thursday, October 22, 2009 2:42 AM
To: Zhang, Xiantao; Jan Beulich
Cc: He, Qing; xen-devel@xxxxxxxxxxxxxxxxxxx; Cinco, Dante
Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

On 22/10/2009 09:41, "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx> wrote:

>> Hmm, then I don't understand which case your patch was a fix for: I
>> understood that it addresses an issue when the affinity of an
>> interrupt gets changed (requiring a re-write of the address/data
>> pair). If the hypervisor can deal with it without masking, then why
>> did you add it?
>
> Hmm, sorry, seems I misunderstood your question. If the MSI doesn't
> support the mask bit (clearing the MSI enable bit doesn't help in this
> case), the issue may still exist. Just checked the Linux side; it seems
> it doesn't perform a mask operation when programming MSI, but I don't
> know why Linux doesn't have such issues. Actually, we do see
> inconsistent interrupt messages from the device without this patch, and
> after applying the patch, the issue is gone. May need further
> investigation into why Linux doesn't need the mask operation.

Linux is quite careful about when it will reprogram vector/affinity info, isn't it? Doesn't it mark such an update pending and only flush it through during the next interrupt delivery, or something like that? Do we need some of the upstream Linux patches for this?

 -- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel