Re: [Xen-devel] [PATCH 1/4] xen: credit2: implement utilization cap
On Fri, Jun 23, 2017 at 5:19 PM, Dario Faggioli <dario.faggioli@xxxxxxxxxx> wrote:
>> > +{
>> > +    struct csched2_dom *sdom = data;
>> > +    unsigned long flags;
>> > +    s_time_t now;
>> > +    LIST_HEAD(parked);
>> > +
>> > +    spin_lock_irqsave(&sdom->budget_lock, flags);
>> > +
>> > +    /*
>> > +     * It is possible that the domain overrun, and that the budget hence went
>> > +     * below 0 (reasons may be system overbooking, issues in or too coarse
>> > +     * runtime accounting, etc.). In particular, if we overrun by more than
>> > +     * tot_budget, then budget+tot_budget would still be < 0, which in turn
>> > +     * means that, despite replenishment, there's still no budget for unparking
>> > +     * and running vCPUs.
>> > +     *
>> > +     * It is also possible that we are handling the replenishment much later
>> > +     * than expected (reasons may again be overbooking, or issues with timers).
>> > +     * If we are more than CSCHED2_BDGT_REPL_PERIOD late, this means we have
>> > +     * basically skipped (at least) one replenishment.
>> > +     *
>> > +     * We deal with both the issues here, by, basically, doing more than just
>> > +     * one replenishment. Note, however, that every time we add tot_budget
>> > +     * to the budget, we also move next_repl away by CSCHED2_BDGT_REPL_PERIOD.
>> > +     * This guarantees we always respect the cap.
>> > +     */
>> > +    now = NOW();
>> > +    do
>> > +    {
>> > +        sdom->next_repl += CSCHED2_BDGT_REPL_PERIOD;
>> > +        sdom->budget += sdom->tot_budget;
>> > +    }
>> > +    while ( sdom->next_repl <= now || sdom->budget <= 0 );
>>
>> The first clause ("oops, accidentally missed a replenishment period") I
>> agree with;
>>
> Ok.
>
>> but I'm going back and forth a bit on the second one. It
>> means essentially that the scheduler made a mistake and allowed the VM
>> to run for one full budget *more* than its allocated time (perhaps
>> accumulated over several periods).
>>
> No, the budget does not accumulate.
> Or at least, it does, but only up to the original tot_budget.
>
> So, basically, the reason why budget may still be <0, after a
> replenishment of tot_budget, is that something went wrong, and we let
> the vcpu overrun for more than tot_budget.
>
> It really should never happen (I may actually add a WARN()), unless the
> accounting is very coarse, or the budget is really small (i.e., the
> budget is small compared to the resolution we can achieve for the
> accounting).
>
>> On the one hand, it's the scheduler that made a mistake, so we
>> shouldn't "punish" a domain by giving it a full period with no budget.
>>
> Yes, I think I understand what you mean. However, I would not
> necessarily call this "punishing" the domain. We're just making sure
> that cap is enforced, even during (hopefully sporadic and only
> transient) tricky circumstances where the scheduler got something
> wrong, and for those domains that have (perhaps not maliciously, but
> still) already taken advantage of such mistake.
>
> In fact, assume you have a domain that wants to execute W amount of
> work every T time, but has a cap that results in it having a budget of
> C<<W every T. Under normal circumstances, it executes for C between 0
> and T, for C between T and 2T, for C between 2T and 3T, etc., until it
> reaches W. So, after 3T, it will have executed for 3C.
>
> In presence of an accounting/enforcing error, it may happen that it
> executes for C between 0 and T, for 2C between T and 2T, for 0 between
> 2T and 3T, etc. So, after 3T, it will also have executed for 3C, as
> above.

Right, but is that what the loop actually does? It looks like now if
it executes for 2C between T and 2T, then when the replenishment
happens at 2T, the budget will be -C. So the first round of the loop
will bring the budget to 0; since budget <= 0, then it will add
*another* C to the budget.
So then it will be allowed to execute for C again between 2T and 3T,
giving it a total of 4C executed over 3T, in violation of the cap.

Am I reading that wrong?

I thought you had intentionally decided to allow that to happen, to
avoid making the capped domain have to sit out for an entire budget
period.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
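
The arithmetic in the exchange above can be checked with a small
stand-alone model of the replenishment loop. This is only a sketch, not
Xen code: struct toy_dom, struct toy_result, TOY_REPL_PERIOD,
toy_replenish() and toy_scenario() are hypothetical names, time and
budget are plain integers, and locking, timers and vCPU parking are left
out. The loop_on_empty_budget flag, which drops the "budget <= 0"
clause, is an assumption added purely for comparison and is not
something proposed in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_REPL_PERIOD 100  /* T in the thread, arbitrary time units */

/* Toy stand-in for the budget fields used by the quoted hunk. */
struct toy_dom {
    int64_t budget;      /* remaining budget; negative after an overrun */
    int64_t tot_budget;  /* budget granted per period (C in the thread) */
    int64_t next_repl;   /* time of the next replenishment */
};

struct toy_result {
    int64_t executed;    /* total run time over [0, 3T) */
    int64_t next_repl;   /* next replenishment time after the one at 2T */
};

/*
 * Same shape as the do/while loop in the quoted hunk. With
 * loop_on_empty_budget == false the "budget <= 0" clause is skipped
 * (hypothetical variant, for comparison only).
 */
static void toy_replenish(struct toy_dom *d, int64_t now,
                          bool loop_on_empty_budget)
{
    do
    {
        d->next_repl += TOY_REPL_PERIOD;
        d->budget += d->tot_budget;
    }
    while ( d->next_repl <= now ||
            (loop_on_empty_budget && d->budget <= 0) );
}

/*
 * Scenario from the thread: the domain runs for C in [0, T), overruns
 * to 2C in [T, 2T), then consumes whatever budget it holds in [2T, 3T).
 */
static struct toy_result toy_scenario(int64_t c, bool loop_on_empty_budget)
{
    struct toy_dom d = { .budget = c, .tot_budget = c,
                         .next_repl = TOY_REPL_PERIOD };
    struct toy_result r = { 0, 0 };

    /* [0, T): runs for exactly C, then the replenishment at T fires. */
    r.executed += c;
    d.budget -= c;
    toy_replenish(&d, TOY_REPL_PERIOD, loop_on_empty_budget);

    /* [T, 2T): accounting error lets it run for 2C; budget becomes -C. */
    r.executed += 2 * c;
    d.budget -= 2 * c;
    toy_replenish(&d, 2 * TOY_REPL_PERIOD, loop_on_empty_budget);

    /* [2T, 3T): runs only for the (non-negative) budget it now holds. */
    if ( d.budget > 0 )
    {
        r.executed += d.budget;
        d.budget = 0;
    }

    r.next_repl = d.next_repl;
    return r;
}
```

Under these assumptions, toy_scenario(10, true) reports 40 units
executed over 3T (4C, as George computes) with next_repl pushed out to
4T, so in this model no new budget would arrive at 3T. With the second
clause dropped, toy_scenario(10, false) reports 30 units (3C, matching
Dario's description) and next_repl stays at 3T.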