Re: [Xen-devel] schedulers and topology exposing questions
On 27/01/16 15:27, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 27, 2016 at 03:10:01PM +0000, George Dunlap wrote:
>> On 27/01/16 14:33, Konrad Rzeszutek Wilk wrote:
>>> On Xen - the schedule() would go HLT.. and then later be woken up by
>>> the VIRQ_TIMER. And since the two applications were on separate CPUs
>>> - the single packet would just stick in the queue until the
>>> VIRQ_TIMER arrived.
>>
>> I'm not sure I understand the situation right, but it sounds a bit
>> like what you're seeing is just a quirk of the fact that Linux doesn't
>> always send IPIs to wake other processes up (either by design or by
>> accident),
>
> It does and it does not :-)
>
>> but relies on scheduling timers to check for work to do. Presumably
>
> It .. I am not explaining it well. The Linux kernel scheduler, when
> called via 'schedule' (from the UDP sendmsg), would either pick the
> next application and do a context switch - or, if there were none, go
> to sleep. [Kind of - it also may do an IPI to the other CPU if
> requested, but that requires some hints from underlying layers.]
> Since there were only two apps on the runqueue - udp sender and udp
> receiver - it would run them back-to-back (this is on baremetal).

I think I understand at a high level from your description what's
happening (no IPIs -> the receiver happens to run if it's on the same
cpu, but waits until the next timer tick if it's on a different cpu);
but what I don't quite get is *why* Linux doesn't send an IPI.

It's been quite a while since I looked at the Linux scheduling code, so
I'm trying to understand it based largely on the Xen code. In Xen a
vcpu can be "runnable" (has something to do) or "blocked" (waiting for
something to do). Whenever a vcpu goes from "blocked" to "runnable",
the scheduler will call vcpu_wake(), which sends an IPI to the
appropriate pcpu to get it to run the vcpu. (A rough sketch of that
wake path is below.)

What you're describing is a situation where a process is blocked
(either in 'listen' or 'read'), and another process does something
which should cause it to become runnable (sends it a UDP message) --
something like the minimal ping-pong pair also sketched below. If
anyone happens to run the scheduler on its cpu, it will run; but no
proactive action is taken to wake it up (i.e., no IPI is sent).

The idea of not sending an IPI when a process goes from "waiting for
something to do" to "has something to do" seems strange to me; if it
wasn't a mistake, my only guess as to why they would choose to do that
is to reduce IPI traffic on large systems. But whether it's a mistake
or on purpose, it's a Linux thing, so...

>> they knew that low performance on ping-pong workloads might be a
>> possibility when they wrote the code that way; I don't see a reason
>> why we should try to work around that in Xen.
>
> Which is not what I am suggesting.

I'm glad we agree on this. :-)

> Our first idea was that since this is a Linux kernel scheduler
> characteristic - let us give the guest all the information it needs to
> do this. That is, make it look as baremetal as possible - and that is
> where the vCPU pinning and the exposing of SMT information came about.
> That (Elena, pls correct me if I am wrong) did indeed show that the
> guest was doing what we expected.
>
> But naturally that requires pinning and all that - and while it is a
> useful case for those that have the vCPUs to spare and can do it -
> that is not a general use-case.
>
> So Elena started looking at the CPU-bound case, seeing how Xen behaves
> then, and whether we can improve the floating situation, as she saw
> some abnormal behaviours.
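To make the wake-up semantics concrete, here is a much-simplified
sketch -- not the actual Xen code -- of the blocked -> runnable path
described above. The vcpu structure, the send_schedule_ipi() stub, and
the state names are all made up for illustration:

    #include <stdio.h>

    enum vcpu_state { VCPU_BLOCKED, VCPU_RUNNABLE, VCPU_RUNNING };

    struct vcpu {
        enum vcpu_state state;
        int pcpu;                    /* pcpu this vcpu runs on */
    };

    /* Stub for illustration only: in the hypervisor this would be an
     * event-check IPI that makes the target pcpu re-run its scheduler
     * immediately. */
    static void send_schedule_ipi(int pcpu)
    {
        printf("IPI -> pcpu %d: re-run scheduler\n", pcpu);
    }

    static void vcpu_wake(struct vcpu *v)
    {
        if (v->state != VCPU_BLOCKED)
            return;                  /* already runnable or running */
        v->state = VCPU_RUNNABLE;
        /* ... insert v on the target pcpu's runqueue ... */
        send_schedule_ipi(v->pcpu);  /* don't wait for VIRQ_TIMER */
    }

    int main(void)
    {
        struct vcpu v = { VCPU_BLOCKED, 1 };
        vcpu_wake(&v);               /* blocked -> runnable kicks pcpu 1 */
        return 0;
    }

The point of the design is the last line of vcpu_wake(): the waker
eagerly pokes the target pcpu, so the newly-runnable vcpu's wake-up
latency is one IPI rather than up to a full timer tick.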
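And for reference, a minimal sketch of the kind of single-packet UDP
ping-pong pair being discussed. The port numbers, iteration count, and
one-byte payload are arbitrary choices, not taken from Elena's tests:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #define PORT  9000
    #define ITERS 100000

    int main(int argc, char **argv)
    {
        int pinger = (argc > 1 && strcmp(argv[1], "ping") == 0);
        int s = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in me = { 0 }, peer = { 0 };
        char buf[1] = { 'x' };

        if (s < 0) { perror("socket"); return 1; }

        me.sin_family = AF_INET;
        me.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        me.sin_port = htons(pinger ? PORT + 1 : PORT);
        if (bind(s, (struct sockaddr *)&me, sizeof(me)) < 0) {
            perror("bind");
            return 1;
        }

        peer = me;
        peer.sin_port = htons(pinger ? PORT : PORT + 1);

        for (int i = 0; i < ITERS; i++) {
            socklen_t plen = sizeof(peer);

            if (pinger)
                sendto(s, buf, 1, 0,
                       (struct sockaddr *)&peer, sizeof(peer));
            /* Block until the other side's single packet arrives;
             * without a wake-up IPI this wait can stretch to the next
             * timer tick when the two sides sit on different cpus. */
            recvfrom(s, buf, 1, 0, (struct sockaddr *)&peer, &plen);
            if (!pinger)
                sendto(s, buf, 1, 0,
                       (struct sockaddr *)&peer, sizeof(peer));
        }
        close(s);
        return 0;
    }

Start the receiving side first ('./pingpong'), then the sender
('./pingpong ping'); pinning the two processes to different cpus (e.g.
with taskset) reproduces the cross-cpu case described at the top of
the thread.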
OK -- if the focus was on the two cases where the Xen credit1 scheduler
(apparently) co-located two cpu-burning vcpus on sibling threads, then
yeah, that's behavior we should probably try to get to the bottom of.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel