Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN
Thanks,
I’ll have to look at the patches regarding the per-domain pirq changes. That sounds like it probably makes sense, but I seem to remember there were big changes to the irq architecture and irq naming in the hypervisor in previous iterations of these patches, which I didn’t understand.
This IRQ storm issue still needs properly resolving. No one has yet explained how a message-based interrupt source can cause an IRQ storm. Storms are inherently a property of level-triggered sources, where ACK/EOI immediately causes re-sampling of the interrupt line and re-assertion of the interrupt at the CPU. How can anything similar happen with MSI? You (Intel) are probably uniquely placed to answer this question, since you manufacture the chipset and NIC which exhibit this problem.
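To make the distinction concrete, here is a toy Python model of the two kinds of source. It is only a sketch of the reasoning above, not of any real hardware or Xen code; the function name and the idea of counting EOIs issued before the driver has serviced the device are assumptions made for illustration.

```python
def interrupts_before_service(level_triggered: bool, eoi_count: int) -> int:
    """Count interrupts delivered while one device event stays unserviced.

    Hypothetical model: the device has raised a single event, its driver
    has not run yet, and the host issues `eoi_count` EOIs in that window.
    """
    delivered = 1  # the initial interrupt for the event
    for _ in range(eoi_count):
        if level_triggered:
            # EOI re-samples the line. The line is still asserted because
            # the device has not been serviced, so the interrupt fires again.
            delivered += 1
        # Message-based source: EOI re-samples nothing. Another interrupt
        # would need the device to send another message.
    return delivered


print(interrupts_before_service(True, 5))   # level-triggered source
print(interrupts_before_service(False, 5))  # message-based source
```

In this model the level-triggered source yields one extra interrupt per EOI, while the message-based source stays at one, which is why a storm from an MSI source needs some other explanation.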
-- Keir
On 27/3/08 06:55, "Shan, Haitao" <haitao.shan@xxxxxxxxx> wrote:
The basic idea includes:
1) Keep vectors as a global resource owned by Xen, while splitting pirq into per-domain information.
2) The domain0 kernel will operate the MSI resources for domain0/domU, while QEMU will operate the MSI resources for HVM domains.
3) Xen will do the EOI for MSI interrupts.
Signed-off-by: Yunhong Jiang <yunhong.jiang@xxxxxxxxx>
Not many changes have been made compared with the original patches. But there are some issues on which we need your comments.
1> The ACK-NEW method is necessary to avoid the IRQ storm, but it causes a deadlock.
During my tests, I did find that a deadlock can occur with the patches applied. With a NIC device assigned to an HVM domain, the scenario is: Dom0 is waiting for the IDE interrupt (vector 0x21); the HVM domain is waiting for qemu’s IDE emulation and is thus blocked; the NIC interrupt (MSI vector 0x31) is waiting to be injected into the HVM domain, which is currently blocked; and the IDE interrupt is waiting behind the NIC interrupt, since the NIC interrupt has higher priority but has not yet been ACKed by XEN. The deadlock is easy to observe when the IDE interrupt and the NIC interrupt are delivered to the same CPU and the guest OS is Vista.
2> Without ACK-NEW, some misbehaving NIC devices that we observed cause IRQ storms. Yunhong can comment more on this phenomenon. Basically, writing EOI without masking the source of the MSI causes an IRQ storm. Although the cause is still under investigation, XEN should handle such bogus devices anyway, right?
3> Using ACK-OLD and masking the MSI when writing EOI could be a solution. However, XEN does not own the PCI configuration spaces.
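The priority inversion in issue 1 can be sketched in a few lines of Python. This is a simplified model, not Xen code: it assumes that "not ACKed" means the NIC vector is still in service at the local APIC, it ignores the task priority register, and the helper names are hypothetical. The one hardware rule it relies on is that a local APIC dispatches a pending vector only if its priority class (the upper four bits of the vector) is higher than that of any vector still in service.

```python
from typing import Iterable


def priority_class(vector: int) -> int:
    """Local APIC priority class: the upper four bits of the vector."""
    return vector >> 4


def deliverable(vector: int, in_service: Iterable[int]) -> bool:
    """True if `vector` outranks every in-service vector (TPR ignored)."""
    highest = max((priority_class(v) for v in in_service), default=-1)
    return priority_class(vector) > highest


IDE_VECTOR = 0x21      # Dom0's IDE interrupt, priority class 2
NIC_MSI_VECTOR = 0x31  # NIC MSI destined for the HVM guest, class 3

# Assumed ACK-NEW behaviour: the NIC vector stays in service until the
# guest has handled it, but the guest is blocked on qemu's IDE emulation.
in_service = {NIC_MSI_VECTOR}
ide_blocked = not deliverable(IDE_VECTOR, in_service)

# Once the NIC vector is EOIed, the IDE interrupt can be dispatched.
in_service.discard(NIC_MSI_VECTOR)
ide_after_eoi = deliverable(IDE_VECTOR, in_service)

print(ide_blocked, ide_after_eoi)
```

The cycle closes because the EOI for 0x31 depends on the guest running, the guest depends on qemu's IDE emulation in Dom0, and Dom0 depends on vector 0x21, which cannot be dispatched while 0x31 is in service on the same CPU.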
We also tried some workarounds.
One workaround might be to use a timer to force an EOI within some time interval. This method is already implemented in VT-D’s code. However, with this approach, if the timer fires and the EOI is written, this is essentially the same approach as option 2.
Another approach is never to deliver these two IRQs to the same CPU. But this is really ugly and cannot be applied to UP systems.
We have also considered using the VT-D 2 interrupt remapping feature. According to the spec, there is no bit in the remapping table to mask the interrupt. Therefore, it cannot be combined with option 2 to solve the issue. Masking the interrupt still requires access to the PCI configuration spaces.
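For reference, here is a rough Python sketch of where the mask would live if the device implements the optional per-vector masking of the PCI MSI capability. It only does the offset and bit arithmetic; it does not touch any real device, and whether the NIC in question supports per-vector masking is not stated in this thread. The constant names follow the style of Linux's pci_regs.h, and the function names are hypothetical.

```python
from typing import Optional

PCI_MSI_FLAGS_64BIT = 0x0080    # Message Control bit 7: 64-bit address capable
PCI_MSI_FLAGS_MASKBIT = 0x0100  # Message Control bit 8: per-vector masking
PCI_MSI_MASK_32 = 0x0C          # Mask Bits offset with a 32-bit address
PCI_MSI_MASK_64 = 0x10          # Mask Bits offset with a 64-bit address


def msi_mask_offset(cap_pos: int, msg_control: int) -> Optional[int]:
    """Config-space offset of the MSI Mask Bits register, or None.

    `cap_pos` is the offset of the MSI capability and `msg_control` is
    its 16-bit Message Control word (read at cap_pos + 2).
    """
    if not msg_control & PCI_MSI_FLAGS_MASKBIT:
        return None  # device cannot mask individual MSI vectors
    if msg_control & PCI_MSI_FLAGS_64BIT:
        return cap_pos + PCI_MSI_MASK_64
    return cap_pos + PCI_MSI_MASK_32


def mask_vector(mask_bits: int, entry: int) -> int:
    """Return the Mask Bits value with message number `entry` masked."""
    return mask_bits | (1 << entry)


# Example: capability at 0x50, 64-bit capable with per-vector masking.
print(hex(msi_mask_offset(0x50, 0x0180)))
```

Whoever writes the EOI would need to perform this config-space write, which is why the ownership question matters for option 3.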
We think the cleanest method may be to move ownership from dom0 to the VMM. However, this is a big change. It should be discussed thoroughly in the community, and it needs your comments.
This patch series can serve as discussion material. What are your comments on the patches and the issues, Keir?
Thanks!