
Re: [Xen-devel] [RFC] [Design] Better XL support of RTDS scheduler for Xen 4.6

On Mon, 2015-02-23 at 22:58 -0500, Meng Xu wrote:
> 2015-02-23 10:57 GMT-05:00 Wei Liu <wei.liu2@xxxxxxxxxx>:

> >> In order to implement the set function, which sets the parameters of
> >> one specific VCPU of a domain, we have two different approaches to
> >> implement it:
> >> Choice 1): When users set the parameters of the k^th VCPU of a domain,
> >> which has x VCPUs in total, the toolstack passes "only" the modified
> >> VCPU's information from userspace to the hypervisor via hypercall;  So
> >
> > A new hypercall or existing hypercall?
> We only need to change the current hypercall
> (XEN_DOMCTL_SCHEDOP_setinfo) implementation:
IMHO, it can be either way (add a new one or change the existing one). 

> >> Choice 2): When users set the parameters of the k^th VCPU of a domain,
> >> which has x VCPUs in total, the toolstack will create an array with
> >> length x and set the information (period, budget) of the k^th element
> >> in the array. The other elements in the array are marked as 0. After
> >> the toolstack bounces the array into the hypervisor, it will only set
> >> the parameters of the k^th VCPU.
> >>
> >
> > One thing to consider is when you have hundreds of cpus the overhead
> > might be too big? What's the amount of data you need to pass down to
> > hypervisor?
> Suppose a domain has k VCPUs, each VCPU has info: period (uint32_t),
> budget (uint32_t), and index (uint32_t); So each hypercall will pass
> k * (3 * 4 bytes) = 12k bytes to the hypervisor.
Maybe I'm missing something but, if we pass the array, why do we need
the index? Oh, and in case we need the index, it can well be smaller
than 32 bits.

For Wei, we're talking about _v_cpus here (in your reply you're just
saying 'cpus', so I'm not sure whether you were referring to _v_ or _p_
cpus). If we have a guest with k > 100 vcpus, it's either:
 - pass a k ( > 100) element array down to Xen
 - issue k ( > 100) hypercalls back to back

OTOH, if we have such a large guest, and we only want to change the
params for one vcpu and we go for batching, we'd always have to pass the
big array, with only one meaningful value.

Allow me to say that the more common use cases for this scheduler would
not include guests that big. Also, I expect most users to want to set
the parameters of the various vcpus all together and, if not once
and for all, at least quite rarely, rather than tweaking the values
every now and then.

Then, of course, users will do the weirdest things, so we really should
consider the overhead, but I thought it was worth at least mentioning
this.

> >> 2) Which design choice should we choose to implement the set function,
> >> Choice 1 or Choice 2? or do you have better suggestion?
> >>
> >
> > I have some general questions. Here in this email you only mention xl. I
> > think in fact you mean xl + libxl? We surely want to enable other
> > toolstacks to control specific parameters of schedulers as well.
> Yes. I need to change xl + libxl + libxc + rt_dom_cntl() in hypervisor. :-)
> I'm not sure if other schedulers need such functionality right now,
They don't, but they may in the future... or new schedulers we will be
introducing in 5 years' time may need it.

> because the credit2 scheduler accounts for credit on a per-domain
> basis instead of per-VCPU. 
Mmm... no, that's not the reason why they don't need this (but let's not
get into this, or we'll end up off topic :-D ).

> But if the scheduler will consider the
> vcpu-level credit/budget accounting, this could be helpful.
Exactly. And while introducing new APIs, we really should think about that.

> > And I presume in the future other schedulers might also want to have
> > some per-vcpu tuning interface. Is it possible to abstract out a common
> > interface in the hypervisor and libxl to do that? Even if we can't sensibly
> > come up with a hypervisor interface to do that, we might still want to
> > do that inside libxl. I.e. to provide some kind of common infrastructure
> > that individual schedulers can hook into.
> Good point! This should be possible. I think maybe we can do this step
> by step: First have a concrete implementation for the RTDS scheduler,
> and then we can extend it to a more general interface.
Well, we of course can do things step by step, but at least the API
needs to be consistent and generic enough from day 0... but that should
not be too difficult, IMO. The big question (at all levels!) is still
whether we want a potentially large array of structs, or single structs
leading to a potentially long stream of API calls and hypercalls.

