[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] credit2 question



On 24/01/13 10:42, Jan Beulich wrote:
On 24.01.13 at 11:07, George Dunlap <george.dunlap@xxxxxxxxxxxxx> wrote:
On 24/01/13 10:06, Jan Beulich wrote:
On 24.01.13 at 10:49, George Dunlap <george.dunlap@xxxxxxxxxxxxx> wrote:
On 24/01/13 07:40, Jan Beulich wrote:
George,

I'm getting puzzled by the second c2t() invocation in
csched_runtime(): Why is the difference of credits being passed
here? Doesn't that (unless svc->credit is non-positive, i.e. in all
but unusual cases) guarantee time > ntime, and particularly
allow for negative ntime?
Ah, right -- yes, if the other guys' credit is positive, "ntime" is
guaranteed to be lower.  Since c2t() involves integer division, it would
definiteyl be good to get rid of the extra call if we can.

My general principle is to make the code clear and easily readable
first, and then do optimization afterwards -- in this case I just never
came back and did the optimization step.
Oh, I wasn't thinking of just the optimization. It seemed wrong to
me to do the subtraction there in the first place: "time" is being
calculated from a plain value, so why would "ntime" be calculated
from a delta?
Ah, right -- so the idea here was to run until snext->credit was equal
to svc->credit.  That's why the delta.
Which then means that under normal circumstances you would
always only run each vCPU for CSCHED_MIN_TIMER, which
seems quite odd.

Only when both domains are burning cpu -- which in my tests, even for very busy domains, was very rarely the case. This is particularly in the case of HVM domains, where there are regular MMIO accesses that throw everything into a bit of a kilter. :-)

Wouldn't it be more fair to do e.g.

             if ( time > ntime )
                 time = (time + ntime) / 2;

since otherwise at the expiry of the time the two vCPU-s have
equal credit, whereas you would generally expect a vCPU that
just finished running to have lower credit than the next one to
run?

The divide by two thing would either get just variations of runtimes, or always MAX_TIMER. Suppose we have "burner" vcpus v1 and v2 (and we didn't have the 'max'): Just after the "reset", everyone's credit is around 10ms; so this would cause v1 to run for (10+(10-10))/2 == 5ms, then v2 to run for (10+(10-5))/2=7.5ms (!), then v1 to run for (5+(5-2.5))/2 = 3.75ms, &c.

We could do the average after applying MAX, but then you just get a "steady state" for burners of (MIN+MAX)/2.

In any case, scheduling is like economics: apparently simple minor changes can have really big effects, so you need to be very careful about changing things without doing experiments first; and I don't really have time at the moment. The current one is known to be at least "not awful", so I think we should just stick with it until someone can take a proper look at potential alternatives. :-)

But as you validly said earlier, avoiding the c2t() in cases where we
can tell up front that "time" would end up below CSCHED_MIN_TIMER
(particularly zero or negative) would be desirable. I'd prefer to
leave doing that to you though.

I've already started. :-)

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.