RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
Dante,

If the device doesn't support the MSI mask bit, the second patch should
have no effect on it. I am working on backporting more IRQ migration
logic from Linux, which should ensure that the address and vector are
both written to the device before new interrupts fire. But as I
mentioned before, if you want to solve the guest affinity setting issue,
you have to apply the first patch I sent out
(fix-irq-affinity-msi3.patch). :-)

Xiantao

Cinco, Dante wrote:
> Xiantao,
>
> I'm sorry I forgot to mention that I did apply your two patches but
> they didn't have any effect (interrupts are still lost after changing
> smp_affinity, and the "No handler for irq vector" message still
> appears). I added a dprintk in msi_set_mask_bit() and realized that
> MSI does not have a mask bit (MSIX does). My PCI device uses MSI, not
> MSIX. I placed my dprintk inside the condition below and it never
> triggered.
>
>     switch (entry->msi_attrib.type) {
>     case PCI_CAP_ID_MSI:
>         if (entry->msi_attrib.maskbit) {
>
> While debugging this problem, I thought about the potential problem
> of an interrupt firing between the writes for the MSI message address
> and the MSI message data. I noticed that pci_conf_write() uses
> spin_lock_irqsave() to disable interrupts before issuing the "out"
> instruction, but the writes for the address and data are two separate
> pci_conf_write() calls. To me, it would be safer to write the address
> and data in a single call preceded by spin_lock_irqsave(). This way,
> when interrupts are enabled again, the address and data have both
> been updated.
>
> Dante
>
> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> Sent: Thursday, October 22, 2009 2:42 AM
> To: Zhang, Xiantao; Jan Beulich
> Cc: He, Qing; xen-devel@xxxxxxxxxxxxxxxxxxx; Cinco, Dante
> Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
> > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
>
> On 22/10/2009 09:41, "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx> wrote:
>
>>> Hmm, then I don't understand which case your patch was a fix for: I
>>> understood that it addresses an issue when the affinity of an
>>> interrupt gets changed (requiring a re-write of the address/data
>>> pair). If the hypervisor can deal with it without masking, then why
>>> did you add it?
>>
>> Hmm, sorry, it seems I misunderstood your question. If the MSI
>> doesn't support the mask bit (clearing the MSI enable bit doesn't
>> help in this case), the issue may still exist. I just checked the
>> Linux side: it doesn't seem to perform a mask operation when
>> programming MSI, but I don't know why Linux doesn't have such
>> issues. Actually, we do see inconsistent interrupt messages from the
>> device without this patch, and after applying the patch the issue is
>> gone. It may need further investigation why Linux doesn't need the
>> mask operation.
>
> Linux is quite careful about when it will reprogram vector/affinity
> info, isn't it? Doesn't it mark such an update pending and only flush
> it through during the next interrupt delivery, or something like
> that? Do we need some of the upstream Linux patches for this?
>
> -- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
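
The window Dante describes between the MSI address write and the MSI
data write can be illustrated with a small, self-contained C model.
This is only a sketch under assumptions: fake_msi_regs, fake_lock,
unlocked_window() and the register values are hypothetical stand-ins
chosen for illustration, not Xen's pci_conf_write() or its MSI code.
It shows that two separately locked writes leave a point where the new
address is paired with the old data, while a single critical section
does not. The model covers only the CPU-side window. Disabling
interrupts on the writing CPU does not by itself stop the device from
sending a message between the two writes, which is presumably why the
thread also discusses masking and deferred migration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a device's MSI registers (not Xen's structures). */
struct fake_msi_regs {
    uint32_t addr;  /* message address: encodes the destination CPU */
    uint16_t data;  /* message data: encodes the vector */
};

/* Stand-in for the lock that spin_lock_irqsave() takes in pci_conf_write(). */
struct fake_lock {
    bool held;
};

static void fake_lock_irqsave(struct fake_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void fake_unlock_irqrestore(struct fake_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Records whether an unlocked window exposed a new address with old data. */
struct observer {
    uint32_t new_addr;
    uint16_t old_data;
    bool saw_torn;
};

/* Called wherever the lock has been dropped, i.e. where an interrupt could run. */
static void unlocked_window(const struct fake_msi_regs *r, struct observer *o)
{
    if (r->addr == o->new_addr && r->data == o->old_data)
        o->saw_torn = true;
}

/* Behaviour described in the thread: two independent locked writes. */
static void msi_write_split(struct fake_msi_regs *r, struct fake_lock *l,
                            uint32_t addr, uint16_t data, struct observer *o)
{
    fake_lock_irqsave(l);
    r->addr = addr;
    fake_unlock_irqrestore(l);
    unlocked_window(r, o);          /* gap between the two writes */

    fake_lock_irqsave(l);
    r->data = data;
    fake_unlock_irqrestore(l);
    unlocked_window(r, o);
}

/* Proposed behaviour: both writes inside one critical section. */
static void msi_write_pair(struct fake_msi_regs *r, struct fake_lock *l,
                           uint32_t addr, uint16_t data, struct observer *o)
{
    fake_lock_irqsave(l);
    r->addr = addr;
    r->data = data;
    fake_unlock_irqrestore(l);
    unlocked_window(r, o);          /* only window: both values are new */
}

/* Returns true if the chosen strategy exposed a torn address/data pair. */
static bool exposes_torn_pair(bool use_pair)
{
    /* Arbitrary illustrative values, not taken from real hardware. */
    struct fake_msi_regs regs = { .addr = 0xfee00000u, .data = 0x4031 };
    struct fake_lock lock = { .held = false };
    struct observer obs = {
        .new_addr = 0xfee01000u, .old_data = 0x4031, .saw_torn = false
    };

    if (use_pair)
        msi_write_pair(&regs, &lock, 0xfee01000u, 0x4041, &obs);
    else
        msi_write_split(&regs, &lock, 0xfee01000u, 0x4041, &obs);
    return obs.saw_torn;
}
```

Calling exposes_torn_pair(false) exercises the split path and
exposes_torn_pair(true) the single-critical-section path, so the two
strategies can be compared side by side.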