
Re: [Xen-devel] Enabling VT-d PI by default

On 15/05/17 11:27, George Dunlap wrote:
> On Fri, May 12, 2017 at 12:05 PM, Andrew Cooper
> <andrew.cooper3@xxxxxxxxxx> wrote:
>> Citrix NetScaler SDX boxes have more MSI-X interrupts than fit in the
>> cumulative IDTs of a top-end dual-socket Xeon server system.  Some of
>> the device drivers are purposefully modelled to use fewer interrupts
>> than they otherwise would want to.
>> Using PI is the proper solution longterm, because doing so would remove
>> any need to allocate IDT vectors for the interrupts; the IOMMU could be
>> programmed to dump device vectors straight into the PI block without
>> them ever going through Xen's IDT.
> I wouldn't necessarily call that a "proper" solution. With PI, instead
> of an interrupt telling you exactly which VM to wake up and/or which
> routine you need to run, you have to search through
> (potentially) thousands of entries to see which vcpu the interrupt you
> received wanted to wake up; and you need to do that on every single
> interrupt.  (Obviously it does have the advantage that if the vcpu
> happens to be running Xen doesn't get an interrupt at all.)

Having spoken to the PI architects, this is not how the technology was
designed to be used.

On systems with this number of in-flight interrupts, trying to track
"who got what interrupt" for priority boosting purposes is a waste of
time, as we spend ages taking vmexits to process interrupt notifications
for out-of-context vcpus.

The way the PI architects envisaged the technology being used is that
Suppress Notification (SN) is set at all times except while the vcpu in
question is executing in non-root mode (there is a small race window
around clearing SN on vmentry), and that the scheduler checks Outstanding
Notification (ON) in each of the PI blocks when it rebalances credit, to
see which vcpus have had interrupts in the last 30ms.
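
As a rough illustration of that usage model, here is a minimal sketch in
C.  It is not Xen code: the struct and the function names (pi_suppress,
pi_unsuppress, pi_test_and_clear_on) are hypothetical, and the places
they would be called from are assumptions.  The bit positions follow the
posted-interrupt descriptor layout as I understand it (ON is descriptor
bit 256, SN is bit 257).  How and when hardware sets ON is not modelled
here; the sketch only shows the software side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* ON and SN are bits 0 and 1 of the control word (descriptor bits 256, 257). */
#define PI_ON (1u << 0)   /* Outstanding Notification */
#define PI_SN (1u << 1)   /* Suppress Notification */

/* Hypothetical model of one 64-byte PI block. */
struct pi_desc {
    uint64_t pir[4];           /* descriptor bits 255:0, posted interrupt requests */
    _Atomic uint32_t control;  /* ON, SN; notification vector in bits 23:16 */
    uint32_t ndst;             /* notification destination */
    uint32_t reserved[6];
};

static_assert(sizeof(struct pi_desc) == 64, "PI block must be 64 bytes");

/* Assumed call site: whenever the vcpu is not executing in non-root mode. */
static void pi_suppress(struct pi_desc *pi)
{
    atomic_fetch_or(&pi->control, PI_SN);
}

/* Assumed call site: just before vmentry for this vcpu. */
static void pi_unsuppress(struct pi_desc *pi)
{
    atomic_fetch_and(&pi->control, ~PI_SN);
}

/*
 * Assumed call site: the scheduler's periodic credit rebalance.
 * Returns true if ON was set since the last check, and clears it
 * so the next accounting period starts fresh.  SN is left untouched.
 */
static bool pi_test_and_clear_on(struct pi_desc *pi)
{
    return (atomic_fetch_and(&pi->control, ~PI_ON) & PI_ON) != 0;
}
```

The point of the design is that the scheduler polls one bit per vcpu at
accounting time rather than taking a notification for every interrupt
aimed at an out-of-context vcpu.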

The current behaviour of leaving SN clear until an interrupt arrives is
devastating for performance, especially in combination with the 3-step
mechanism Xen uses to rewrite the interrupt source information, which
pretty much guarantees that interrupts arrive on the wrong pcpu (unless
strict pinning is in effect).

