
Re: [Xen-devel] [PATCH v2] xen:rtds:Fix bug in budget accounting



On Fri, 2016-10-21 at 22:11 -0400, Meng Xu wrote:
> Bug scenario:
> While a VCPU is running on a core, it may miss its deadline.
>
It may be useful to mention why (at least the most common reasons, like
overhead and/or the system being overloaded... are there others?).

> Then repl_timer_handler() will update the VCPU period and deadline.
> The VCPU may have high priority with the new deadline and
> repl_timer_handler()
> decides to keep the VCPU running on the core, but with new period and
> deadline.
>
repl_timer_handler() does not decide what to run. It's probably better
to say something like "If the VCPU is still the highest priority one,
even with the new deadline, it will continue to run"

> However, the budget enforcement timer for the previous period is
> still armed.
> When rt_schedule() is called in the new period to enforce the budget
> for the previous period, the current burn_budget() will deduct the
> time spent in the previous period from the budget of the current
> period, which is incorrect.
> 
Ok, so, at time t=10 the replenishment timer triggers. It finds the
vcpu in the following status:

 v->cur_deadline = 10
 v->cur_budget = 3
 v->last_start = 2

I.e., the vcpu still has some budget left but, e.g. because the system
is overloaded, it could not use it in its previous period.

Assuming v->period=10 and v->budget=5, rt_update_deadline(), called
from within repl_timer_handler(), will put the vcpu in the following
state:

 v->cur_deadline = 20
 v->cur_budget = 5
 v->last_start = 2

Then, at t=15, rt_schedule() is invoked; it in turn calls
burn_budget(), which does:

 delta = NOW() - v->last_start = 15 - 2 = 13
 v->cur_budget -= 13

Which is too much. Is this what we are talking about?
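
In code terms, that is roughly this (a sketch of the accounting only,
with the numbers from the scenario above plugged in; not the actual
body of burn_budget(), of course):

 /* burn_budget(), when rt_schedule() runs at now = 15 */
 delta = now - svc->last_start;   /* 15 - 2 = 13 */
 svc->cur_budget -= delta;        /* 5 - 13: the time that falls before
                                   * t=10, i.e., in the _previous_
                                   * period, is charged to the new
                                   * budget as well */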

> Fix:
> We keep last_start always within the current period for a VCPU, so
> that we only deduct the time spent in the current period from the
> VCPU budget.
> 
Well, this is sort of ok, I guess.

In general, I think that cur_deadline, cur_budget and last_start are
tightly related to each other. Especially cur_budget and last_start,
considering that the former is updated in a way that depends on the
value of the latter.

The problem here seems to me to be that, indeed, we don't update
last_start every time we update cur_budget. If, for instance, you look
at Credit2, start_time is updated inside burn_credits(), while in RTDS,
last_start is not updated in burn_budget().

So (1), one thing that I would do is to set svc->last_start=now; in
burn_budget().
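
Concretely, I mean something like this (just a sketch, assuming the
rest of burn_budget() stays as it is):

 /* burn_budget() */
 delta = now - svc->last_start;
 svc->last_start = now;      /* (1): move the accounting point forward,
                              * as Credit2 does in burn_credits() */
 svc->cur_budget -= delta;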

However, just doing that won't solve the problem, because
repl_timer_handler() does not call burn_budget(). I've wondered a bit
whether that is correct or not... I mean, especially in the situation
we are discussing (when repl_timer_handler() runs "before"
rt_schedule()), how do we account for the budget spent between the last
invocation of burn_budget() and now? Well, I'm not really sure whether
or not this is a problem.

Since we don't allow the budget to go negative, and since we are giving
the vcpu both a new deadline and a new budget anyway, it may not be a
big deal, actually. Or maybe it could be an issue, but only if the
overhead and/or the overload are so big (e.g., if the vcpu is
overrunning, and doing so for more than one full period) that the
system is out of the window already.

So, let's not consider this for the moment... And let's focus on the
fact that, in rt_update_deadline(), we are changing cur_deadline and
cur_budget, so we also need to update last_start (as we said above,
they're related!). I guess we either call burn_budget() from
rt_update_deadline(), or we open-code svc->last_start = now;

burn_budget() does things that we probably don't need and don't want to
do in rt_update_deadline(). In fact, if we don't allow the budget to go
negative, and ignore the (potential, and maybe non-existent) accounting
issue described above, there's no point in marking the vcpu as depleted
(in burn_budget()) and then (since we're calling it from
rt_update_deadline()) replenishing it right away! :-O

So (2), I guess the best thing to do is "just" to update last_start in
rt_update_deadline() too, which is sort of what the patch is doing.
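
I.e., something like the below (again, just a sketch, and I'm assuming
the replenishment part of rt_update_deadline() looks more or less like
this, with the full budget stored in svc->budget):

 /* rt_update_deadline(), after cur_deadline has been moved forward */
 svc->cur_budget = svc->budget;   /* full replenishment */
 svc->last_start = now;           /* (2): the new budget is consumed
                                   * starting from now */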

Yet, there's something I don't understand, i.e.:

> Signed-off-by: Meng Xu <mengxu@xxxxxxxxxxxxx>
> Reported-by: Dagaen Golomb <dgolomb@xxxxxxxxxxxxx>
> 
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -407,6 +407,14 @@ rt_update_deadline(s_time_t now, struct rt_vcpu
> *svc)
>          svc->cur_deadline += count * svc->period;
>      }
>  
> +    /*
> +     * svc may be scheduled to run immediately after it misses
> deadline
> +     * Then rt_update_deadline is called before rt_schedule, which 
> +     * should only deduct the time spent in current period from the
> budget
> +     */
> +    if ( svc->last_start < (svc->cur_deadline - svc->period) )
> +        svc->last_start = svc->cur_deadline - svc->period;
> +
 - Do we need the if()? Isn't it safe to just always update last_start?
   If it is, as it looks to me, I'd prefer it done like that.
 - Why cur_deadline - period, and not NOW() (well, actually, now)?
   Well, in the scenario above, and in all the others I can think of,
   they would be almost the same. And yet, I'd prefer using now, for
   clarity and consistency.

In summary, what I think we need is a patch that does both (1) and (2),
i.e., one that makes sure we always update last_start to NOW() every
time we assign a vcpu its new budget. That would make the code more
consistent, and it should also fix the buggy behavior you're seeing.

What do you think?

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



 

