
Re: [Xen-devel] xen: credit2: credit2 can’t reach the throughput as expected



On Tue, 2019-02-12 at 13:36 +0000, 郑 川 wrote:
> Hi, George,
> 
Hi (although I'm not George :-D),

> I found that Credit2 can’t reach the expected throughput under my
> test workload, compared to Credit and CFS. It is easy to reproduce,
> and I think the problem really exists.
> It took me a long time to find out why, due to my lack of knowledge,
> and I cannot find a good way to solve it.
> Please help take a look at it. Thanks.
> 
Ok, thanks for your testing, and for reporting this to us.

A few questions.

> ***************
> [How to reproduce]
> ***************
> I use openSUSE Tumbleweed with Xen 4.11.
> The test workload is as follows:
> I have guest_1 with 4 vCPUs and guest_2 with 8 vCPUs running on 4
> pCPUs, that is, a pCPU:vCPU ratio of 1:3.
> Then I put a 20% CPU load on each vCPU, which results in 240% total
> pCPU usage.
> The 20% load model is this: I start one process on each vCPU, which
> repeatedly runs for 20ms and then sleeps for 80ms, within a period
> of 100ms.
> I use xentop to observe guest CPU usage in dom0; as I expect, the
> guest CPU usage is 80% and 160% for guest_1 and guest_2,
> respectively.
> 
Do you have the sources for this somewhere, so that we can try to
reproduce it ourselves? I'm thinking of the source code for the
periodic apps (if you used a custom-made one), or the repository (if
you took it from one), or the name of the benchmarking suite and the
parameters used to create this scenario.
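
In the meantime, here is a minimal sketch of what I'd guess such a
periodic app looks like (threads instead of processes, for brevity;
all names and constants here are mine, not necessarily yours):

/* A minimal sketch, assuming a busy-loop/sleep design: one thread
 * per vCPU, each running for 20ms and sleeping for 80ms, in a 100ms
 * period.  This is only my guess at the actual test program.
 * Build with: gcc -O2 -pthread load.c -o load */
#include <pthread.h>
#include <time.h>
#include <unistd.h>

static long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

static void *load(void *arg)
{
    struct timespec sleep_80ms = { 0, 80 * 1000000L };
    (void)arg;
    for ( ; ; )
    {
        long long start = now_ns();
        while ( now_ns() - start < 20 * 1000000LL )
            ;                          /* burn CPU for 20ms */
        nanosleep(&sleep_80ms, NULL);  /* sleep the remaining 80ms */
    }
    return NULL;
}

int main(void)
{
    long i, ncpus = sysconf(_SC_NPROCESSORS_ONLN); /* one per vCPU */
    for ( i = 0; i < ncpus; i++ )
    {
        pthread_t t;
        pthread_create(&t, NULL, load, NULL);
    }
    pause();                           /* let the workers run forever */
    return 0;
}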

> **************
> [Why it happens]
> **************
> Over the long term, the test workload looks like polling.
> As shown in the figures below, - - - - means the vCPU is running
> (consuming cputime) and ——— means it is idle.
> As we can see from Fig.1, if vcpu_1 and vcpu_2 run staggered, the
> throughput looks fine; however, if they run at the same time
> (Fig.2), they compete for the pCPU, which results in poor
> throughput.
> 
> vcpu_1      - - - - - ——————————            - - - - - ——————————
> vcpu_2                - - - - - ——————————            - - - - -
> cpu usage   - - - - - - - - - - ————        - - - - - - - - - -
>             |  vcpu1  |  vcpu2  |   idle    |  vcpu1  |  vcpu2
>                                  Fig.1
> 
> vcpu_1      - - - - - ——————————            - - - - - ——————————
> vcpu_2      - - - - - ——————————            - - - - - ——————————
> cpu usage   - - - - - - - - - - ——————————  - - - - - - - - - -
>             | compete running  | both sleep | compete running
>                                  Fig.2
> 
Ok, I'm not entirely sure I follow all this, but let's put it aside for
a second. The question I have is: is this analysis coming from looking
at actual traces? If yes, can you upload the trace files somewhere, or
share them?
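
If you don't have any traces yet, this is roughly what I'd do in dom0
(0x0002f000 is the scheduling event class; xenalyze lives in the Xen
source tree, under tools/xentrace/):

  # capture scheduler events; stop with Ctrl-C after a few seconds
  xentrace -D -e 0x0002f000 /tmp/sched.trace
  # then summarize it
  xenalyze --summary /tmp/sched.trace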

> Since we do reset_credit() when snext->credit goes negative, the
> credit values of all vCPUs end up very close to each other.
> As a result, observed over the long term, the time-slice of each
> vCPU becomes smaller, and they come to compete for the pCPU at the
> same time, just as shown in Fig.2 above.
> Thus, I think the reason it can't reach the expected throughput is
> that reset_credit() for all vCPUs makes the time-slice smaller,
> which is different from Credit and CFS.
> 
Ok, so you're saying this drop in "throughput" can be caused by
scheduling happening too frequently in Credit2.

Well, I guess that is a possibility, although, as I said above, I'd
need to think a bit more about this, as well as trying to reproduce it,
and look at actual traces.
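
Just to make sure we are talking about the same mechanism, here is a
much simplified sketch of the credit-reset idea (NOT the actual
sched_credit2.c code; the names and values below are made up):

/* Simplified sketch of the Credit2 credit-reset idea.  When the vCPU
 * chosen to run next has gone credit-negative, *all* vCPUs in the
 * runqueue get topped back up.  Since everyone is replenished
 * together, credit values stay clustered, so the running vCPU gets
 * overtaken (and preempted) more often, i.e., the effective
 * time-slice shrinks. */
#define CREDIT_INIT (10 * 1000 * 1000L)  /* hypothetical: 10ms, in ns */

struct vcpu {
    long credit;
    struct vcpu *next;               /* runqueue link */
};

static void reset_credit(struct vcpu *runq)
{
    struct vcpu *v;

    for ( v = runq; v != NULL; v = v->next )
        v->credit = CREDIT_INIT;     /* everyone back to the same value */
}

static void burn_credit(struct vcpu *snext, struct vcpu *runq, long ran_ns)
{
    snext->credit -= ran_ns;         /* pay for the time just consumed */
    if ( snext->credit < 0 )
        reset_credit(runq);          /* the step the analysis points at */
}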

Perhaps one thing that can be done to try to confirm this analysis
would be to make scheduling less frequent in Credit2 and, on the other
hand, more frequent in Credit1. In theory, if the analysis is correct,
you would observe the behavior of this specific workload improving on
Credit2 and degrading on Credit1 when doing so.

If you fancy trying that, for Credit1, you can play with the
sched_credit_tslice_ms Xen boot time parameter (e.g., try pushing it
down to 1ms).
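
For instance (the exact file and update command depend on your
distro's GRUB setup):

  # e.g., in GRUB_CMDLINE_XEN in /etc/default/grub
  sched=credit sched_credit_tslice_ms=1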

For Credit2, it's a little trickier, as the scheduler does not have a
real timeslice. So, either you alter CSCHED2_MIN_TIMER in the code, or
you "mimic" a timeslice increase by setting sched_ratelimit_us to a
higher value (e.g., 10ms).
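
That is, something like this (note the parameter is in microseconds,
so 10ms is 10000):

  # Xen command line fragment: Credit2 with a 10ms ratelimit
  sched=credit2 sched_ratelimit_us=10000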

It's not a conclusive test, but I think it is a good enough one for
gaining some more understanding of the issue.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/
