
Re: [Xen-devel] [PATCH v1 1/4] xen: add real time scheduler rt



On mer, 2014-09-03 at 17:06 +0100, George Dunlap wrote:
> On Wed, Sep 3, 2014 at 4:13 PM, Meng Xu <xumengpanda@xxxxxxxxx> wrote:
> > Hi George,
> >
> >
> > 2014-09-03 10:20 GMT-04:00 George Dunlap <George.Dunlap@xxxxxxxxxxxxx>:
> >
> >> On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@xxxxxxxxxxxxx> wrote:
> >> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> >> > index 5b11bbf..27d01c1 100644
> >> > --- a/xen/include/public/domctl.h
> >> > +++ b/xen/include/public/domctl.h
> >> > @@ -339,6 +339,19 @@ struct xen_domctl_max_vcpus {
> >> >  typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
> >> >  DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
> >> >
> >> > +/*
> >> > + * This structure is used to pass to rt scheduler from a
> >> > + * privileged domain to Xen
> >> > + */
> >> > +struct xen_domctl_sched_rt_params {
> >> > +    /* get vcpus' info */
> >> > +    uint64_t period; /* s_time_t type */
> >> > +    uint64_t budget;
> >> > +    uint16_t index;
> >> > +    uint16_t padding[3];
> >>
> >> Why the padding?
> >>
> >
> > I did this because of Jan's comment "Also, you need to pad the structure
> > to a multiple of 8 bytes, or its layout will differ between 32- and 64-bit
> > (tool stack) callers." I think what he said makes sense, so I added the
> > padding here. :-)
> >
> > Here is the link: http://marc.info/?l=xen-devel&m=140661680931179&w=2
> 
> Right. :-)  I personally prefer to handle that by re-arranging the
> elements rather than adding padding, unless absolutely necessary.  In
> this case that shouldn't be too hard, particularly once we pare the
> interface down so we only have one interface (either all one vcpu at a
> time, or all batched vcpus).
> 
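FWIW, just to illustrate (this is a sketch of mine, not the actual
patch): if the interface gets pared down so that the vcpu index travels
outside the per-vcpu struct (e.g. as a separate domctl field, or
implicitly as the position in the array, in a batched variant), what is
left is two 64-bit fields. That is naturally a multiple of 8 bytes, and
it lays out identically for 32- and 64-bit toolstack callers without
any padding:

    struct xen_domctl_sched_rt_params {
        uint64_t period;  /* s_time_t type on the Xen side */
        uint64_t budget;
    };
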
> > I think it's a better idea to
> >  pass in an array with information about vcpus to get/set vcpus'
> > information.
> >
> > I only need to change the code related to setting a vcpu's information.
> > I have a question:
> > When we set a vcpu's information by using an array, we have two choices:
> >
> > a) create an array with a single element, and specify the index of the
> > vcpu to modify. The concern with this method is that we only use one
> > element of the array, so is it really a good idea to use an array with
> > only one element?
> > b) create an array covering all vcpus of the domain, modify the
> > parameters of the vcpu the user wants to change, and then bounce the
> > whole array to the hypervisor to reset these vcpus' parameters. The
> > concern with this method is that we don't need the other vcpus'
> > information to set a specific vcpu's parameters, so bouncing the whole
> > array seems expensive and unnecessary.
> >
> > Do you have any suggestion/advice/preference on this?
> >
> > I don't really like the idea of reading the vcpus' information
> > one-by-one. :-) If a domain has many vcpus, say 12, we would issue 12
> > hypercalls to get all of its vcpus' information, while a single
> > hypercall could fetch everything we want, so the extra hypercalls just
> > add overhead. Reading one-by-one does simplify the implementation, but
> > at the cost of that extra overhead.
> 
> For users' convenience, I think it's definitely the case that libxl
> should provide an interface to get and set all the vcpu parameters at
> once.  Then it can either batch them all into a single hypercall (if
> that's what we decide), or it can make the individual calls for each
> vcpu.
> 
Indeed.
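
To make the "hide it in libxl" part concrete, here's a rough sketch
(all names and signatures below are made up for illustration, they are
not the actual libxl API): the library-level call always deals with the
full set of per-vcpu parameters, and whether that turns into one
batched hypercall or a loop of per-vcpu ones is purely an internal
detail:

    #include <stdint.h>

    /* Illustration only: not real libxl types or functions. */
    typedef struct {
        uint64_t period;   /* s_time_t-compatible value */
        uint64_t budget;
    } fake_vcpu_rt_params;

    /* Hypothetical wrapper around a per-vcpu "get" hypercall. */
    int fake_get_one_vcpu(uint32_t domid, unsigned int vcpuid,
                          fake_vcpu_rt_params *out);

    /*
     * Library-level "get them all": if the hypervisor interface stays
     * per-vcpu, just loop; if we go for a batched domctl instead, only
     * this function's body needs to change, not its callers.
     */
    int fake_get_all_vcpus(uint32_t domid, fake_vcpu_rt_params *params,
                           unsigned int num_vcpus)
    {
        unsigned int i;

        for ( i = 0; i < num_vcpus; i++ )
        {
            int rc = fake_get_one_vcpu(domid, i, &params[i]);

            if ( rc )
                return rc;
        }

        return 0;
    }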

> The main reason I would think to batch the hypercalls is for
> consistency: it seems like you may want to change the period / budget
> of vcpus atomically, rather than setting one, possibly having dom0
> de-scheduled for a few hundred milliseconds, and then setting another.
> Same thing goes for reading: I would think you would want a consistent
> "snapshot" of some existing state, rather than having the possibility
> of reading half the state, then having someone change it, and then
> reading the other half.
> 
That is actually the reason why I'd have both things. A "change this
one" variant is handy if one actually has to change only one vcpu, or a
few, and does not mind the non-atomicity.

The batched variant is there for both overhead and atomicity reasons.
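
Just to show what I mean by the batched variant (again, only a sketch
under the assumption that we keep a two-field per-vcpu struct like the
one above, not a proposal for the final interface), the domctl could
carry a guest handle to an array of per-vcpu parameters plus a count,
so that one hypercall covers the whole domain:

    /* Sketch only, reusing the two-field per-vcpu struct from above. */
    typedef struct xen_domctl_sched_rt_params xen_domctl_sched_rt_params_t;
    DEFINE_XEN_GUEST_HANDLE(xen_domctl_sched_rt_params_t);

    struct xen_domctl_sched_rt_batch {
        /* IN/OUT: array of nr_vcpus elements, indexed by vcpu id. */
        XEN_GUEST_HANDLE_64(xen_domctl_sched_rt_params_t) vcpus;
        /* IN: number of elements in the array above. */
        uint16_t nr_vcpus;
        uint16_t padding[3];   /* keep the size a multiple of 8 bytes */
    };

The "change just this one" variant would then basically be Meng's
option a): the same per-vcpu struct, plus an explicit vcpu index,
passed for a single vcpu.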

> Re the per-vcpu settings, though: Is it really that common for RT
> domains to want different parameters for different vcpus?  
>
Whether it's common is hard to say, but yes, it has to be possible.

For instance, I can put, in an SMP guest, two real-time applications
with different timing requirements, and pin each one to a different
(v)cpu (I mean pin *inside* the guest). At this point, I'd like for each
vcpu to have a set of RT scheduling parameters, at the Xen level, that
matches the timing requirements of what's running inside.

This may not look so typical in a server/cloud environment, but can
happen (at least in my experience) in a mobile/embedded env.
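
Just to make that concrete (numbers completely made up, and using the
two-field struct sketched earlier in the thread), the toolstack could
end up asking Xen for something like:

    /* Assuming microseconds here, purely for illustration. */
    struct xen_domctl_sched_rt_params params[2] = {
        { .period =  10000, .budget =  4000 },  /* vcpu0: tight loop  */
        { .period = 100000, .budget = 20000 },  /* vcpu1: softer task */
    };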

> Are these
> parameters exposed to the guest in any way, so that it can make more
> reasonable decisions as to where to run what kinds of workloads?
> 
Not right now, AFAICS, but forms of 'scheduling paravirtualization', or
this kind of interaction/communication in general, could be very useful
in real-time virtualization, so we may want to support that in the
future.

In any case, even without that in place right now, I think different
parameters for different vcpus are certainly something we want from an
RT scheduler.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
