
Re: [Xen-devel] [PATCH RFC v2 1/4] xen: add real time scheduler rt



On 02/08/14 16:31, Meng Xu wrote:

> +/*
> + * Init/Free related code
> + */
> +static int
> +rt_init(struct scheduler *ops)
> +{
> +    struct rt_private *prv = xzalloc(struct rt_private);
> +
> +    if ( prv == NULL )
> +        return -ENOMEM;

Newline in here.

Changed, thanks!

> +    ops->sched_data = prv;

Is it safe to set ops->sched_data with a half-constructed rt_private? I
suspect this wants to be the very last (non-debug) action in this function.

I think it should be fine. I double-checked the _init function in sched_credit2.c: it does something similar, first assigning prv to ops->sched_data and then setting the values in prv.
Of course, I can switch them, but I'm not sure whether that really matters. :-)

It means that anyone else who peeks at ops->sched_data will see a partially constructed rt_private object. This can be a source of subtle bugs, so it is better to code appropriately rather than relying on the assumption that no one will go peeking before this function returns.
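
Something along these lines would only publish the pointer once the object is complete (just a sketch; exactly which rt_private fields need initialising here, e.g. the run queue list, is an assumption based on the rest of the patch):

static int
rt_init(struct scheduler *ops)
{
    struct rt_private *prv = xzalloc(struct rt_private);

    if ( prv == NULL )
        return -ENOMEM;

    /* Fully construct the private data first... */
    spin_lock_init(&prv->lock);
    INIT_LIST_HEAD(&prv->sdom);
    INIT_LIST_HEAD(&prv->runq);

    /* ...and only then make it visible via ops->sched_data. */
    ops->sched_data = prv;

    return 0;
}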


> +static void *
> +rt_alloc_domdata(const struct scheduler *ops, struct domain *dom)
> +{
> +    unsigned long flags;
> +    struct rt_dom *sdom;
> +    struct rt_private * prv = RT_PRIV(ops);
> +
> +    printtime();
> +    printk("dom=%d\n", dom->domain_id);
> +
> +    sdom = xzalloc(struct rt_dom);
> +    if ( sdom == NULL )
> +    {
> +        printk("%s, xzalloc failed\n", __func__);
> +        return NULL;
> +    }
> +
> +    INIT_LIST_HEAD(&sdom->vcpu);
> +    INIT_LIST_HEAD(&sdom->sdom_elem);
> +    sdom->dom = dom;
> +
> +    /* spinlock here to insert the dom */
> +    spin_lock_irqsave(&prv->lock, flags);
> +    list_add_tail(&sdom->sdom_elem, &(prv->sdom));
> +    spin_unlock_irqrestore(&prv->lock, flags);
> +
> +    return (void *)sdom;

Bogus cast.

I think we have to cast it to void * because the definition of this function requires the return type to be void *. In addition, the credit2 scheduler does the same cast in its _alloc_domdata function, so I guess this should be fine?


In C, all pointers are implicitly castable to/from void*. This is not the case in C++. (which suggests to me that George learnt C++ before C, or is at least more familiar with C++?)

The act of using type punning means that you are trying to do something which the C type system doesn't like. This in turn requires more thought from people reading the code to work out whether it is actually safe or not. As a result, we strive for as few type overrides as possible.

Please don't fall into the trap of assuming the credit2 code, or indeed any of the other code in Xen, is perfect. None of it is, although most of it is good.
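
As a minimal illustration, the conversion is implicit, so the tail of rt_alloc_domdata() can simply be (a sketch; everything else unchanged):

    /*
     * No cast needed: in C an object pointer converts to and from
     * void * implicitly, so the explicit (void *) is redundant.
     */
    return sdom;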

> +/*
> + * set/get each vcpu info of each domain
> + */
> +static int
> +rt_dom_cntl(
> +    const struct scheduler *ops,
> +    struct domain *d,
> +    struct xen_domctl_scheduler_op *op)
> +{
> +    xen_domctl_sched_rt_params_t *local_sched;
> +    struct rt_dom * const sdom = RT_DOM(d);
> +    struct list_head *iter;
> +    int vcpu_index = 0;
> +    int rc = -EINVAL;
> +
> +    switch ( op->cmd )
> +    {
> +    case XEN_DOMCTL_SCHEDOP_getnumvcpus:
> +        op->u.rt.nr_vcpus = 0;
> +        list_for_each( iter, &sdom->vcpu )
> +            vcpu_index++;
> +        op->u.rt.nr_vcpus = vcpu_index;
> +        rc = 0;
> +        break;
> +    case XEN_DOMCTL_SCHEDOP_getinfo:
> +        /* for debug use, whenever adjust Dom0 parameter, do global dump */
> +        if ( d->domain_id == 0 )
> +            rt_dump(ops);
> +
> +        op->u.rt.nr_vcpus = 0;
> +        list_for_each( iter, &sdom->vcpu )
> +            vcpu_index++;
> +        op->u.rt.nr_vcpus = vcpu_index;
> +        local_sched = xzalloc_array(xen_domctl_sched_rt_params_t, vcpu_index);
> +        vcpu_index = 0;
> +        list_for_each( iter, &sdom->vcpu )
> +        {
> +            struct rt_vcpu * svc = list_entry(iter, struct rt_vcpu, sdom_elem);
> +
> +            local_sched[vcpu_index].budget = svc->budget;
> +            local_sched[vcpu_index].period = svc->period;
> +            local_sched[vcpu_index].index = vcpu_index;
> +            vcpu_index++;
> +        }
> +        copy_to_guest(op->u.rt.vcpu, local_sched, vcpu_index);

This will clobber guest heap if vcpu_index is greater than the allocated
space.

This is a good point! I will pass the size of the array to the hypervisor and check that the number of elements in the array is not smaller than the number of vcpus.

You also unconditionally leak local_sched, but there is no need
for an allocation in the first place.

I will add the xfree() after copy_to_guest().
I have a question: how can I avoid allocating local_sched in the first place?

unsigned int i = 0;

list_for_each( iter, &sdom->vcpu )
{
    xen_domctl_sched_rt_params_t local_sched;
    struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);

    /* Never write beyond the space the guest said it allocated. */
    if ( i >= op->u.rt.nr_vcpus )
        break;

    /* Fill local_sched on the stack - no heap allocation required. */
    local_sched.budget = svc->budget;
    local_sched.period = svc->period;
    local_sched.index  = i;

    /* Copy this element directly into the guest buffer. */
    if ( copy_to_guest_offset(op->u.rt.vcpu, i, &local_sched, 1) )
    {
        rc = -EFAULT;
        break;
    }

    i++;
}



> +        rc = 0;
> +        break;
> +    case XEN_DOMCTL_SCHEDOP_putinfo:
> +        list_for_each( iter, &sdom->vcpu )
> +        {
> +            struct rt_vcpu * svc = list_entry(iter, struct rt_vcpu, sdom_elem);
> +
> +            /* adjust per VCPU parameter */
> +            if ( op->u.rt.vcpu_index == svc->vcpu->vcpu_id )
> +            {
> +                vcpu_index = op->u.rt.vcpu_index;
> +
> +                if ( vcpu_index < 0 )
> +                    printk("XEN_DOMCTL_SCHEDOP_putinfo: vcpu_index=%d\n",
> +                            vcpu_index);
> +                else
> +                    printk("XEN_DOMCTL_SCHEDOP_putinfo: "
> +                            "vcpu_index=%d, period=%"PRId64", budget=%"PRId64"\n",
> +                            vcpu_index, op->u.rt.period, op->u.rt.budget);
> +
> +                svc->period = op->u.rt.period;
> +                svc->budget = op->u.rt.budget;
> +
> +                break;
> +            }
> +        }
> +        rc = 0;
> +        break;
> +    }
> +
> +    return rc;
> +}
> +
> +static struct rt_private _rt_priv;
> +
> +const struct scheduler sched_rt_def = {
> +    .name           = "SMP RT Scheduler",
> +    .opt_name       = "rt",

Should these names reflect RT_DS as opposed to simply RT?

DS (Deferrable Server) is just one kind of server mechanism for global Earliest Deadline First scheduling. We can add other server mechanisms in the same file, sched_rt.c, to extend this real-time scheduler, but we don't want to change or affect the user's interface when we add them.
The .opt_name affects the user's interface when the user chooses the rt scheduler. If we change it to rt_ds, we will have to change it back to rt once more server mechanisms are implemented, and users would then have to change their configuration (i.e., the command-line value sched=) to the new name. Because this could affect the users' interface, I think it's better to use rt here. What do you think?


Reusing the rt.c infrastructure is fine, but something other than DS will necessarily be a different scheduler as far as the tools are concerned. The user should expect to have to change their parameters if they want to use a new scheduler. The other point of view is that the user should expect their scheduler not to change under their feet if they continue using the same parameters as before.


> +    .sched_id       = XEN_SCHEDULER_RT_DS,
> +    .sched_data     = &_rt_priv,
> +
> +    .dump_cpu_state = rt_dump_pcpu,
> +    .dump_settings  = rt_dump,
> +    .init           = rt_init,
> +    .deinit         = rt_deinit,
> +    .alloc_pdata    = rt_alloc_pdata,
> +    .free_pdata     = rt_free_pdata,
> +    .alloc_domdata  = rt_alloc_domdata,
> +    .free_domdata   = rt_free_domdata,
> +    .init_domain    = rt_dom_init,
> +    .destroy_domain = rt_dom_destroy,
> +    .alloc_vdata    = rt_alloc_vdata,
> +    .free_vdata     = rt_free_vdata,
> +    .insert_vcpu    = rt_vcpu_insert,
> +    .remove_vcpu    = rt_vcpu_remove,
> +
> +    .adjust         = rt_dom_cntl,
> +
> +    .pick_cpu       = rt_cpu_pick,
> +    .do_schedule    = rt_schedule,
> +    .sleep          = rt_vcpu_sleep,
> +    .wake           = rt_vcpu_wake,
> +    .context_saved  = rt_context_saved,
> +};
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index e9eb0bc..f2df400 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -68,6 +68,7 @@ static const struct scheduler *schedulers[] = {
>      &sched_sedf_def,
>      &sched_credit_def,
>      &sched_credit2_def,
> +    &sched_rt_def,
>      &sched_arinc653_def,
>  };
>
> @@ -1092,7 +1093,8 @@ long sched_adjust(struct domain *d, struct xen_domctl_scheduler_op *op)
>
>      if ( (op->sched_id != DOM2OP(d)->sched_id) ||
>           ((op->cmd != XEN_DOMCTL_SCHEDOP_putinfo) &&
> -          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo)) )
> +          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo) &&
> +          (op->cmd != XEN_DOMCTL_SCHEDOP_getnumvcpus)) )
>          return -EINVAL;
>
>      /* NB: the pluggable scheduler code needs to take care
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 5b11bbf..8d4b973 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -339,6 +339,18 @@ struct xen_domctl_max_vcpus {
>  typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
>
> +/*
> + * This structure is used to pass to rt scheduler from a
> + * privileged domain to Xen
> + */
> +struct xen_domctl_sched_rt_params {
> +    /* get vcpus' info */
> +    int64_t period; /* s_time_t type */
> +    int64_t budget;
> +    int     index;

Index is clearly an unsigned quantity. For alignment and compatibility,
uint64_t would make the most sense. Alternatively, uint32_t and an
explicit uint32_t pad field.

Agree. I have changed it to uint16_t because the vcpu_index is uint16_t in the tool stack.

Thank you very much for your comments and suggestions! Looking forward to your reply! ;-)

Best,

Meng

If you make its size a multiple of 8, then Xen doesn't need a compat shim for converting hypercalls from 32bit guests.
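
For example, one layout that keeps the offsets identical and the size a multiple of 8 on both 32-bit and 64-bit guests (a sketch assuming the uint16_t index mentioned above; the pad field name is illustrative):

struct xen_domctl_sched_rt_params {
    /* get vcpus' info */
    int64_t  period;  /* s_time_t type */
    int64_t  budget;
    uint16_t index;
    uint16_t pad[3];  /* explicit padding: sizeof() == 24, a multiple of 8 */
};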

~Andrew
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.