Xen project Mailing List

Re: [Xen-devel] Enabling VT-d PI by default

To: George Dunlap <george.dunlap@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

From: Dario Faggioli <dario.faggioli@xxxxxxxxxx>

Date: Tue, 16 May 2017 13:52:38 +0200

Cc: Kevin Tian <kevin.tian@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, Chao Gao <chao.gao@xxxxxxxxx>

Delivery-date: Tue, 16 May 2017 11:53:00 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Mon, 2017-05-15 at 15:32 +0100, George Dunlap wrote:
> On Mon, May 15, 2017 at 2:35 PM, Andrew Cooper
> <andrew.cooper3@xxxxxxxxxx> wrote:
> > On systems with this number of in-flight interrupts, trying to track
> > "who got what interrupt" for priority boosting purposes is a waste of
> > time, as we spend ages taking vmexits to process interrupt notifications
> > for out-of-context vcpus.
> >
> > The way the PI architects envisaged the technology being used is that
> > Suppress Notification is set at all points other than executing in
> > non-root mode for the vcpu in question (there is a small race window
> > around clearing SN on vmentry), and that the scheduler uses Outstanding
> > Notification on each of the PI blocks when it rebalances credit to see
> > which vcpus have had interrupts in the last 30ms.
>
> It sounds like they may have made the mistake that the Credit1
> designers made, in analyzing only a system that was overloaded; and
> one where all workloads were identical, as opposed to analyzing a
> system that was at least sometimes partially loaded, and where
> workloads were very different.
>
Totally agree.

Also, I'm not sure I follow why PI architects would be basing hardware
design on specific characteristics of a particular Xen scheduler. E.g.,
in Linux --which I'd think they also had in mind when envisioning uses
of the technology-- there is no such thing as 30ms timeslice, nor
credits redistribution.

And AFAICU what you seem to suggest, not notifying an interrupt/not
waking up anyone, at the time at which it happens, means there must be
some kind of list_for_each_vcpu() anyway, for checking which vCPUs have
pending notifications. Hence the problem we're discussing here, would
just be moved between subsystems, rather than going away.

And, finally, I don't get what you mean when you say that we're trying
to use PI "for priority boosting purposes". I don't think we do that.
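To make the "there must be a scan somewhere" point concrete, here is a
minimal sketch in C of a handler that walks a per-pCPU list of blocked
vCPUs and wakes the ones whose Outstanding Notification bit is set. It is
not the Xen or KVM code: struct pi_desc, struct vcpu, struct pcpu,
pcpu_block() and pi_wakeup_handler() are all hypothetical names made up
for illustration, and locking, atomics and the real descriptor layout are
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified PI descriptor: only the two bits
 * discussed in this thread, not the real hardware layout. */
struct pi_desc {
    bool on; /* Outstanding Notification: an interrupt has been posted */
    bool sn; /* Suppress Notification: do not send a notification event */
};

/* Hypothetical vCPU with an intrusive link for the blocked list. */
struct vcpu {
    int id;
    bool blocked;
    struct pi_desc pi;
    struct vcpu *next_blocked;
};

/* Hypothetical per-pCPU state: vCPUs that blocked on this pCPU. */
struct pcpu {
    struct vcpu *blocked_list;
};

/* Put a vCPU on the pCPU's blocked list when it blocks. */
static void pcpu_block(struct pcpu *p, struct vcpu *v)
{
    assert(p != NULL && v != NULL);
    v->blocked = true;
    v->next_blocked = p->blocked_list;
    p->blocked_list = v;
}

/* Wakeup handler: scan the blocked list, unlink and wake every vCPU that
 * has a posted interrupt pending. Returns how many vCPUs were woken.
 * The cost is linear in the number of blocked vCPUs, whether this runs
 * from a notification handler or from a periodic scheduler pass. */
static int pi_wakeup_handler(struct pcpu *p)
{
    int woken = 0;
    struct vcpu **link;

    assert(p != NULL);
    link = &p->blocked_list;
    while (*link != NULL) {
        struct vcpu *v = *link;

        if (v->pi.on) {
            *link = v->next_blocked; /* unlink from the blocked list */
            v->next_blocked = NULL;
            v->blocked = false;      /* stand-in for a real wakeup */
            woken++;
        } else {
            link = &v->next_blocked;
        }
    }
    return woken;
}
```

Whether this loop is triggered immediately by a notification or only once
per scheduling period, the list walk itself is the same; deferring it
changes who pays for it and how late the wakeup is, not whether it happens.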
FTR, I've quickly checked how this is done in Linux, and the solution
pushed there looks really similar to the one that has been pushed to
Xen as well. E.g., there too, the handler scans the blocked vCPUs
list:

http://elixir.free-electrons.com/linux/latest/source/arch/x86/kvm/vmx.c#L6464

> In both cases, waiting 30ms to see if we should wake somebody up is
> far too long.
>
Absolutely!

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

