
Re: [Xen-devel] Notes on stubdoms and latency on ARM

On Fri, 2017-05-26 at 13:09 -0700, Volodymyr Babchuk wrote:
> Hello Dario,

> > Feel free to ask anything. :-)
> I'm so unfamiliar with it that I don't even know what to ask :) But thank you.
> Surely I'll have questions.
Sure. As soon as you have one, go ahead with it.

> > The null scheduler is meant to be useful when you have a static
> > scenario, no (or very little) overbooking (i.e., total nr of vCPUs ~=
> > nr of pCPUs), and want to cut the scheduling overhead to _zero_.
> > 
> > That may include certain classes of real-time workloads, but it is not
> > limited to such use cases.
> Can't I achieve the same with any other scheduler by pinning one vcpu
> to one pcpu?
Of course you can, but not with the same (small!!) level of overhead as
the null scheduler. In fact, even if you do 1-to-1 pinning of all the
vcpus, general purpose schedulers (like Credit1 and Credit2) can't
rely on the assumption that such a configuration is indeed in effect, and
that it always will be.

For instance, suppose all vcpus except one are each pinned to one pCPU.
That one remaining vcpu, in turn, can run anywhere. The scheduler has to
always go and see which vcpu is the one that is free to run anywhere,
and whether it should (for instance) preempt any (and, if yes, which)
of the pinned ones.

Also, still in those schedulers, there may be multiple vcpus that are
pinned to the same pCPU. In that case, the scheduler, at each
scheduling decision, needs to figure out which ones (among all the
vcpus) they are, and which one has the right to run on the pCPU.

And, unfortunately, since pinning can change 100% asynchronously wrt
the scheduler, it's really not possible either to make assumptions, or
even to try to capture some (special case) situation in a data

Therefore, yes, if you configure 1-to-1 pinning in Credit1 or Credit2,
the actual schedule would be the same. But that will be achieved with
almost the same computational overhead as if the vcpus were free.

OTOH, the null scheduler is specifically designed for the (semi-)static 
1-to-1 pinning use case, so the overhead it introduces (for making
scheduling decisions) is close to zero.
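
To illustrate the difference in per-decision cost, here is a toy model in
Python. It is only a sketch, not Xen code: the names VCpu, pick_general and
pick_null, the credit field, and the data layout are all made up for this
example. The point is just that a general purpose scheduler must re-check
affinity on every decision, while a static 1-to-1 assignment reduces the
decision to a single lookup.

```python
from dataclasses import dataclass


@dataclass
class VCpu:
    name: str
    affinity: frozenset  # pCPUs this vcpu may run on (toy model of pinning)
    credit: int = 0      # hypothetical priority value
    runnable: bool = True


def pick_general(pcpu, vcpus):
    """General purpose style: pinning may have changed since the last
    decision, so every vcpu has to be examined again."""
    candidates = [v for v in vcpus if v.runnable and pcpu in v.affinity]
    return max(candidates, key=lambda v: v.credit, default=None)


def pick_null(pcpu, assignment):
    """null style: the vcpu-to-pCPU map is (semi-)static, so the decision
    is one dictionary lookup plus a runnable check."""
    v = assignment.get(pcpu)
    return v if v is not None and v.runnable else None


# Four vcpus, each pinned 1-to-1 to the pCPU with the same index.
vcpus = [VCpu(f"v{i}", frozenset({i}), credit=10) for i in range(4)]
assignment = {i: v for i, v in enumerate(vcpus)}
```

With 1-to-1 pinning both functions pick the same vcpu for a given pCPU, but
pick_general walks the whole list each time, while pick_null does not.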

> > > Do you have any tools to profile or trace XEN core? Also, I don't
> > > think that pure context switch time is the biggest issue. Even now,
> > > it allows 180 000 switches per second (if I'm not wrong). I think,
> > > scheduling latency is more important.
> > > 
> > 
> > What do you refer to when you say 'scheduling latency'? As in, the
> > latency between which events, happening on which component?
> I'm worried about the interval between task switching events.
> For example: vcpu1 is a vcpu of some domU and vcpu2 is a vcpu of the
> stubdom that runs the device emulator for that domU.
> vcpu1 issues an MMIO access that should be handled by vcpu2 and gets
> blocked by the hypervisor. Then there will be two context switches:
> vcpu1->vcpu2 to emulate that MMIO access and vcpu2->vcpu1 to continue
> work. AFAIK, credit2 does not guarantee that vcpu2 will be scheduled
> right after vcpu1 is blocked. It can schedule some vcpu3,
> then vcpu4 and only then come back to vcpu2. That time interval
> between the event "vcpu2 was made runnable" and the event "vcpu2 was
> scheduled on pcpu" is what I call 'scheduling latency'.
Yes, currently, that's true. Basically, from the scheduling point of
view, there's no particular relationship between a domain's vcpu and
the vcpu of the driver/stub-dom that services the domain itself.

But there's a plan to change that, as both I and Stefano said already,
and do something in all schedulers. We'll just start with null, because
it's the easiest. :-)
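
To make the 'scheduling latency' defined above concrete, here is a small
Python sketch that computes it from a list of events. The event format
(timestamp in microseconds, event kind, vcpu name) and the function name
scheduling_latencies are assumptions made up for this example; this is not
the output format of any real Xen tracing tool, and the numbers are invented.

```python
def scheduling_latencies(events):
    """events: iterable of (timestamp_us, kind, vcpu) tuples, where kind is
    "runnable" or "scheduled" (hypothetical format).
    Returns a list of (vcpu, latency_us): the time from a vcpu becoming
    runnable to that vcpu actually being scheduled on a pCPU."""
    pending = {}  # vcpu -> timestamp at which it became runnable
    latencies = []
    for ts, kind, vcpu in sorted(events, key=lambda e: e[0]):
        if kind == "runnable":
            # Keep the earliest wakeup if the vcpu is woken more than once.
            pending.setdefault(vcpu, ts)
        elif kind == "scheduled" and vcpu in pending:
            latencies.append((vcpu, ts - pending.pop(vcpu)))
    return latencies


# vcpu1 blocks on MMIO at t=0 and vcpu2 (the emulator) is woken, but
# vcpu3 and vcpu4 get to run before vcpu2 does. Timestamps are invented.
trace = [
    (0, "runnable", "vcpu2"),
    (5, "scheduled", "vcpu3"),
    (40, "scheduled", "vcpu4"),
    (90, "scheduled", "vcpu2"),
]
```

Here scheduling_latencies(trace) reports a single entry for vcpu2, with the
90 us that passed between its wakeup and it being picked.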

> This latency can be minimized by a mechanism similar to priority
> inheritance: if the scheduler knows that vcpu1 waits for vcpu2 and there
> is remaining time slice for vcpu1, it should select vcpu2 as the next
> scheduled vcpu. The problem is how to populate such dependencies.
I've spent my PhD studying and doing stuff around priority
inheritance... so something similar to that, is exactly what I had in
mind. :-D
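
Just to fix ideas, the rule described above could be sketched like this in
Python. This is a toy illustration and not how any Xen scheduler is (or
will be) implemented: pick_next, waits_for and remaining_slice_us are
hypothetical names, and the sketch simply assumes the dependency map is
already populated, which is exactly the open problem mentioned above.

```python
def pick_next(blocked, waits_for, remaining_slice_us, runqueue):
    """Choose the next vcpu after `blocked` has just blocked.

    waits_for: dict mapping a vcpu to the vcpu it is waiting on (assumed
               to be known; how to populate it is not addressed here).
    remaining_slice_us: dict mapping a vcpu to its unused time slice.
    runqueue: list of runnable vcpus, head first; modified in place.

    Returns (vcpu, slice_us). slice_us is the leftover slice inherited
    from `blocked`, or None when the normal runqueue order is used.
    """
    target = waits_for.get(blocked)
    leftover = remaining_slice_us.get(blocked, 0)
    if target in runqueue and leftover > 0:
        # Let the vcpu we depend on run on what is left of our slice,
        # instead of waiting for its turn behind vcpu3, vcpu4, ...
        runqueue.remove(target)
        return target, leftover
    if runqueue:
        return runqueue.pop(0), None
    return None, None


runqueue = ["vcpu3", "vcpu4", "vcpu2"]
choice = pick_next("vcpu1", {"vcpu1": "vcpu2"}, {"vcpu1": 300}, runqueue)
```

In this example vcpu2 jumps ahead of vcpu3 and vcpu4 and runs on the 300 us
that vcpu1 had left. Without a known dependency, the head of the runqueue
is picked as usual.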

<<This happens because I choose it to happen!>> (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
