
Re: [Xen-devel] long tail latency caused by rate-limit in Xen credit2



On Tue, 2017-06-13 at 14:59 -0500, T S wrote:
> Hi all,
> 
Hi,

Nice to hear from you again... You guys always have interesting things
to say/report/etc., about scheduling... I truly appreciate what you
do! :-)

> When I was executing the latency-sensitive applications in VMs on the
> latest Xen,
> I found the rate limit will cause the long tail latency for VMs
> sharing CPU with other VMs.
> 
Yeah, I totally can see how this can be the case.

Personally, I'm not a fan of context switch rate limiting. Not at all.
But it has proven to be useful in some workloads, so it's good for it
to be there.

I think the scenario you describe is one of those cases where rate
limiting is better off disabled. :-)

> (1) Problem description
> 
> [snip]
>
> (2) Problem analysis
> 
> ------------Analysis----------------
> I read the source code in the Xen credit2 scheduler. The vCPU
> priorities used in credit1, such as OVER, UNDER, and BOOST, are all
> removed, and the vCPUs are just ordered by their credit. I traced
> vCPU credit and the I/O-VM vCPU credit is always larger than the
> CPU-VM credit, so the I/O-VM vCPU is always ordered ahead of the
> CPU-VM vCPU.
> 
> Next, I traced the time gap between vCPU wake and the vCPU scheduler
> function. I found that if the I/O-VM runs alone, the time gap is about
> 3,000ns; however, if the I/O-VM co-runs with a CPU-VM on the same
> core, the time gap grows to 1,000,000ns, and that happens on every
> vCPU scheduling. That reminded me of the ratelimit in the Xen credit
> scheduler. The default ratelimit in Xen is 1000us.
> 
> As I modified the ratelimit to 100us in the terminal:
> $ sudo /usr/local/sbin/xl  sched-credit2 -s -r 100
> 
> The average latency is reduced from 300+us to 200+us and the tail
> latency is also reduced.
> 
Ok, good to hear that things are behaving as expected. :-D
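Just to make the mechanics concrete, here is a simplified model of why
the ratelimit shows up as wakeup latency (illustrative only, NOT Xen's
actual code; the function name and numbers are mine):

```python
# Simplified model (not Xen's actual code) of how context-switch
# rate limiting delays the wakeup of an I/O-bound vCPU sharing a
# pCPU with a CPU-bound vCPU: if the currently running vCPU has run
# for less than the ratelimit, preemption is postponed until the
# ratelimit expires.

RATELIMIT_US = 1000  # Xen's default sched_ratelimit_us


def wakeup_delay_us(running_since_us, ratelimit_us=RATELIMIT_US):
    """Worst-case extra delay before a waking vCPU can preempt,
    given how long the current vCPU has already been running."""
    return max(0, ratelimit_us - running_since_us)


# CPU-bound vCPU just started a burst: the I/O vCPU waits ~1000us
print(wakeup_delay_us(0))       # -> 1000
# With the ratelimit lowered to 100us, the worst case shrinks
print(wakeup_delay_us(0, 100))  # -> 100
# With ratelimiting disabled (0), preemption is immediate
print(wakeup_delay_us(0, 0))    # -> 0
```

This matches the numbers you measured: with the default 1000us
ratelimit the wake-to-schedule gap approaches 1,000,000ns, and it
shrinks roughly in proportion as the ratelimit is lowered.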

> [another snip]
> 
> However, the minimum value of the ratelimit is 100us, which means
> there is still a gap between the mixed-running-VMs case and the
> running-alone case. (P.S. the valid range of the ratelimit is from
> 100 to 500000us.) To mitigate the latency, users have to run the I/O
> VMs on a dedicated core, but that, on the other hand, wastes a lot
> of CPU resources.
> 
> As an experimental test, I modified the Xen source code to allow the
> ratelimit to be set to 0. Below is the result with the ratelimit set
> to 0: both average latency and tail latency when co-running with
> CPU-VMs are in the same magnitude and range as when the I/O-VM runs
> alone.
> 
Wait... it is already possible to disable ratelimiting. I mean, you're
right, you can't set it to 50us, because, if it's not 0, then it has
to be >= 100us.

But you can do:

$ sudo /usr/local/sbin/xl  sched-credit2 -s -r 0

and it will be disabled.

That was possible last time I tried. If it's not right now, then you've
found a bug (I'll double check this tomorrow morning).

> sockperf: ====> avg-lat= 71.766 (std-dev=1.618)
> sockperf: # dropped messages = 0; # duplicated messages = 0; #
> out-of-order messages = 0
> sockperf: Summary: Latency is 71.766 usec
> sockperf: Total 1999 observations; each percentile contains 19.99
> observations
> sockperf: ---> <MAX> observation =  99.257
> sockperf: ---> percentile 99.999 =  99.257
> sockperf: ---> percentile 99.990 =  99.257
> sockperf: ---> percentile 99.900 =   84.155
> sockperf: ---> percentile 99.000 =   78.873
> sockperf: ---> percentile 90.000 =   73.920
> sockperf: ---> percentile 75.000 =   72.546
> sockperf: ---> percentile 50.000 =   71.458
> sockperf: ---> percentile 25.000 =   70.518
> sockperf: ---> <MIN> observation =   63.150
> 
Well, not too bad, considering it's running concurrently with another
VM. It means the scheduler is doing a good job "prioritizing" the
I/O-bound workload.

> Similar problem could also be found in credit1 scheduler.
> 
And again, it should be possible to disable ratelimiting on Credit1
as well, in a similar manner (I'll check this too).
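If memory serves, the Credit1 knob is exposed through `xl sched-credit`
in the same way (worth double-checking against your xl version, as I
haven't re-verified the exact option recently):

```shell
# Disable context-switch rate limiting in the credit (Credit1) scheduler
$ sudo /usr/local/sbin/xl sched-credit -s -r 0
```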

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
