Re: [Xen-devel] [PATCH RFC] xen: if on Xen, "flatten" the scheduling domain hierarchy
On 09/23/2015 10:30 AM, Dario Faggioli wrote:
> On Wed, 2015-09-23 at 06:36 +0200, Juergen Gross wrote:
>> On 09/22/2015 06:22 PM, George Dunlap wrote:
>>> Juergen / Dario, could one of you summarize your two approaches,
>>> and the (alleged) advantages and disadvantages of each one?
>>
>> Okay, I'll have a try:
>
> Thanks for this! ;-)
>
>> The problem we want to solve:
>> -----------------------------
>> The Linux kernel gathers cpu topology data during boot, via the
>> CPUID instruction on each processor coming online. This data is
>> primarily used in the scheduler to decide which cpu a thread should
>> be migrated to when migration seems necessary. There are other
>> users of the topology information in the kernel (e.g. some drivers
>> try to do optimizations like core-specific queues/lists).
>>
>> When started in a virtualized environment the obtained data is next
>> to useless or even wrong, as it reflects only the state of the
>> system at boot time. The hypervisor's scheduling of the (v)cpus
>> changes the effective topology beneath the feet of the Linux kernel
>> without this being reflected in the gathered topology information.
>> So any decisions taken based on that data will be clueless and
>> possibly just wrong.
>
> Exactly.
>
>> The minimal solution is to change the topology data in the kernel
>> in such a way that all cpus are regarded as equal in their relation
>> to each other (e.g. when migrating a thread to another cpu, no cpu
>> is preferred as a target).
>>
>> The topology information of the CPUID instruction is, however,
>> accessible even from user mode and might be used for licensing
>> purposes by user programs (e.g. by limiting the software to run on
>> a specific number of cores or sockets). So just mangling the data
>> returned by CPUID in the hypervisor seems not to be a general
>> solution, while we might want to do it at least optionally in the
>> future.
>
> Yep. It turned out that, although it is what started all this, CPUID
> handling is a somewhat related but mostly independent problem. :-)
>
>> In the future we might want to support either dynamic topology
>> updates or be able to tell the kernel to use some of the topology
>> data, e.g. when pinning vcpus.
>
> Indeed. At least for the latter. Dynamic looks really difficult to
> me, but indeed it would be ideal. Let's see.
>
>> Solution 1 (Dario):
>> -------------------
>> Don't use the CPUID derived topology information in the Linux
>> scheduler, but let it use a simple "flat" topology by setting own
>> scheduler domain data under Xen.
>>
>> Advantages:
>> + very clean solution regarding the scheduler interface
>
> Yes, this is, I think, one of the main advantages of the patch. The
> scheduler offers an interface for architectures to define their
> topology requirements, and I'm using it to specify ours: the right
> tool for the job. :-D
>
>> + scheduler decisions are based on a minimal data set
>> + small patch
>>
>> Disadvantages:
>> - covers the scheduler only, drivers still use the "wrong" data
>
> This is a good point. It was the patch's purpose, TBH, but it's
> certainly true that, if we need something similar elsewhere, we need
> to do more.
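[Note: for concreteness, here is a minimal sketch of what "setting own
scheduler domain data" can look like via the kernel's
set_sched_topology() interface. The xen_flat_topology array and the
XENFLAT level name are made up for illustration; this is a sketch, not
the actual RFC patch.]

#include <linux/init.h>
#include <linux/sched.h>
#include <linux/topology.h>

/*
 * Illustrative only: register a single, flat topology level so the
 * scheduler regards all cpus as equal when picking migration targets.
 */
static struct sched_domain_topology_level xen_flat_topology[] = {
        /* one single level (per-node mask; all cpus without NUMA info) */
        { cpu_cpu_mask, SD_INIT_NAME(XENFLAT) },
        { NULL, },
};

static void __init xen_set_flat_sched_topology(void)
{
        set_sched_topology(xen_flat_topology);
}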
>> What would you do for keeping the topology information of one level,
>> e.g. hyperthreads, in case we'd have a gang-scheduler in Xen? Either
>> you would copy the line:
>>
>>     { cpu_smt_mask, cpu_smt_flags, SD_INIT_NAME(SMT) },
>>
>> from kernel/sched/core.c into your topology array, or you would add a
>> way in kernel/sched/core.c to remove all but this entry and add your
>> entry on top of it.
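[Note: a rough sketch of the first option, i.e. copying the SMT entry
from kernel/sched/core.c into a Xen-specific array; the
xen_smt_topology and XENFLAT names are invented for illustration.]

#include <linux/sched.h>
#include <linux/topology.h>

/*
 * Sketch: keep the SMT level by copying its entry from the default
 * topology in kernel/sched/core.c, and flatten everything above it.
 */
static struct sched_domain_topology_level xen_smt_topology[] = {
#ifdef CONFIG_SCHED_SMT
        { cpu_smt_mask, cpu_smt_flags, SD_INIT_NAME(SMT) },
#endif
        { cpu_cpu_mask, SD_INIT_NAME(XENFLAT) },
        { NULL, },
};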
> In the case mentioned above I just wouldn't zap the
> topology_sibling_cpumask in my patch.
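[Note: a hedged sketch of what "zapping" the sibling mask means here,
assuming x86's topology_sibling_cpumask() accessor; the helper name is
invented.]

#include <linux/cpumask.h>
#include <linux/topology.h>

/*
 * Sketch: "zapping" reduces each cpu's SMT sibling set to the cpu
 * itself, so no hyperthread relations survive; skipping this step
 * keeps the SMT level intact.
 */
static void zap_smt_siblings(unsigned int cpu)
{
        cpumask_clear(topology_sibling_cpumask(cpu));
        cpumask_set_cpu(cpu, topology_sibling_cpumask(cpu));
}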
Thanks. I wanted to look at this as soon as we've decided which way to
go. I had some discussion with a kvm guy last week and he seemed not
to be convinced they need something else than mangling CPUID (which
they already do).

> Thanks and Regards,
> Dario
>
> PS. BTW, Juergen, you're not on IRC, on #xendevel, are you?

I'd like to, but I'd need an invitation. My user name is juergen_gross.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel