
Re: [Xen-devel] Fwd: xen: credit2: credit2 can’t reach the throughput as expected



Hi, Dario
[Sorry for the HTML email format; resending as plain text.]

> On Fri, 2019-02-15 at 06:15 +0000, zheng chuan wrote:
> > Hi, Dario,
> >
> Hi,
>
> > Here is the xentrace output for credit2 with ratelimiting of 1ms and 30ms,
> > observing for 1 second in each case.
> >
> Ok, thanks a lot for doing this! I'm doing my own experiments, but slower than
> I wanted to, as I am also a little busy with other things... so I really 
> appreciate
> your efforts. :-)
>
> > Roughly, we can see the frequency of the context switches.
> > The number of context switches decreases significantly when the ratelimiting
> > changes from 1ms to 30ms:
> >
> > linux-EBkjWt:/home # cat credit2_r_1000.log | grep __enter_scheduler | wc -l
> > 2407
> > linux-EBkjWt:/home # cat credit2_r_30000.log | grep __enter_scheduler | wc -l
> > 714
> >
> Well, sure, that's expected. It is, indeed, the intended effect of having
> ratelimiting in the first place.
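Understood. For my own reading, the ratelimit seems to be enforced when the next
vcpu is picked, with a check roughly of this shape (my paraphrase of what I see
in sched_credit2.c, not copied verbatim, so names may differ slightly):

    /*
     * Rate-limiting in runq_candidate(), roughly: if the currently running
     * vcpu is still runnable and has run for less than ratelimit_us
     * microseconds, keep running it instead of switching.
     */
    if ( !yield && prv->ratelimit_us && vcpu_runnable(scurr->vcpu) &&
         (now - scurr->vcpu->runstate.state_entry_time) <
          MICROSECS(prv->ratelimit_us) )
        return scurr;

So with a 30ms ratelimit the current vcpu is kept for up to 30ms even when a
higher-credit vcpu is waiting, which matches the drop in __enter_scheduler
events above.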
>
> Now, can I ask you a favour? Can you rerun with:
>
> sched_credit2_migrate_resist=0
>
> added to Xen's boot command line?
>
> Not that I expect "miracles" (things might even get worse!), but looking at the
> traces, I got curious about what kind of effect that could have.
>
Unfortunately, sched_credit2_migrate_resist=0 does not seem to work :(
CPU usage is still around 60% and 120% for guest_1 and guest_2, respectively,
with a ratelimit of 1ms.

linux-sodv:~ # xl dmesg | grep credit2
(XEN) Command line: sched=credit2 sched_credit2_migrate_resist=0
(XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)

xentop - 11:19:02 Xen 4.11.0
4 domains: 1 running, 3 blocked, 0 paused, 0 crashed, 0 dying, 0 shutdown
Mem: 67079796k total, 67078844k used, 952k free CPUs: 32 @ 2600MHz
      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS  VBD_OO  VBD_RD  VBD_WR  VBD_RSECT  VBD_WSECT SSID
  Domain-0 -----r        111    8.6   64051764   95.5   no limit       n/a    32    0        0        0    0       0       0       0          0          0    0
   guest_1 --b---         38   61.8    1048832    1.6    1049600       1.6     4    1      374        4    1       0    4116     144     191722      10420    0
   guest_2 --b---         74  122.1    1048832    1.6    1049600       1.6     8    1      387        2    1       0    4289     144     191835      10506    0
  Xenstore --b---          0    0.0      32760    0.0     670720       1.0     1    0        0        0    0       0       0       0          0          0    0
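
For reference, my understanding is that the resistance shows up (among other
places) as a check in runq_candidate() which only lets a vcpu sitting on another
pcpu win over the local candidate if its credit is higher by more than the
resistance. This is my paraphrase of the idea, not an exact quote of the 4.11
code:

    /*
     * Migration resistance, roughly: skip a candidate on another pcpu
     * unless its credit beats the local pick by the resistance margin.
     */
    if ( svc->vcpu->processor != cpu
         && snext->credit + CSCHED2_MIGRATE_RESIST > svc->credit )
        continue;

With the parameter set to 0 that margin disappears, so the fact that nothing
changes here at least suggests migration resistance is not what is capping the
throughput.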

> Also, for both the Credit1 and Credit2 cases, are you touching power
> management (like with `xenpm`)?
>

No, power management is left at its defaults.

> > Since we also complement credit for sleeping vcpus, to guarantee fairness
> > (and their scheduling latency) once reset_credit() is triggered, this does
> > not look suitable for some workloads, such as the case in this issue. Is it
> > possible to penalize the sleepers, or complement their credit under a
> > different policy, to avoid too much preemption?
> >
> You keep mentioning "sleepers" or "sleeping vcpus", but I don't understand
> this part. A sleeping vcpu, even if it has the highest credits due to a reset,
> won't preempt any running vcpu.
>
> It will (likely) preempt one when it wakes up, but that also happens on
> Credit1 due to boosting (well, in theory... unless everyone is always boosted,
> at which point things are hard to predict).
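Right, so the preemption happens at wake-up time. My (possibly wrong) reading of
the wake-up path is that runq_tickle() first looks for an idle pcpu and, failing
that, picks the pcpu whose running vcpu has the lowest credit, preempting it only
if the waking vcpu wins by more than the migration resistance; conceptually
something like the following, where new is the waking vcpu and lowest_credit /
ipid are the best victim found by the scan (illustrative names, not the exact
code):

    /* Conceptual sketch of the wake-up preemption decision in runq_tickle(). */
    if ( new->credit > lowest_credit + CSCHED2_MIGRATE_RESIST )
        cpu_raise_softirq(ipid, SCHEDULE_SOFTIRQ);  /* make that pcpu reschedule */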
>
> > In theory we sacrifice throughput for sched_latency. However, what's
> > interesting is that, as I said before, if I don't complement credit for
> > sleepers, or if I enlarge the ratelimiting, the sched_latency may not get
> > worse, because with the stable running pattern in my demo the vcpus run
> > staggered, spread across pCPUs that are idle most of the time.
> >
> But can we actually try to measure latency as well? Because it looks to me 
> that
> we're discussing while having only half of the picture available.
>
Sure, but due to my lack of knowledge: does Xen have a scheduling-latency
measurement tool, something like `perf sched latency` for CFS? I will give it a
try if there is one.

> Also, since you said you tried, can you show me (in code, I mean) what you
> mean by "if I don't complement credit for sleepers", so that I can better
> understand what you mean by that?
>
> Thanks again for your work, and regards,
> Dario
> --

What I tried is crude and empirical, as shown below:

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 9a3e71f..b781ebe 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1642,13 +1642,13 @@ static void reset_credit(const struct scheduler *ops, int cpu, s_time_t now,
     if ( snext->credit < -CSCHED2_CREDIT_INIT )
         m += (-snext->credit) / CSCHED2_CREDIT_INIT;
 
-    list_for_each( iter, &rqd->svc )
+    list_for_each( iter, &rqd->runq )
     {
         unsigned int svc_cpu;
         struct csched2_vcpu * svc;
         int start_credit;
 
-        svc = list_entry(iter, struct csched2_vcpu, rqd_elem);
+        svc = list_entry(iter, struct csched2_vcpu, runq_elem);
         svc_cpu = svc->vcpu->processor;
 
         ASSERT(!is_idle_vcpu(svc->vcpu));

It works for this workload, which now reaches the throughput I expect.
However, I can't find any theory to support it for now, since it hurts fairness
from my point of view. But maybe it is evidence that complementing credit for
sleeping vcpus in a more suitable way could benefit the throughput of this kind
of periodic workload.
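
To spell out what the hack changes: the original loop walks the list of all
vcpus assigned to the runqueue, while the modified one walks only the runnable
ones, so sleeping vcpus are skipped at reset time. Schematically (field names as
in sched_credit2.c, other members omitted):

    /* Simplified view of the two lists the hack switches between. */
    struct csched2_runqueue_data {
        struct list_head runq;      /* runnable (queued) vcpus only          */
        struct list_head svc;       /* every vcpu assigned to this runqueue,
                                     * including sleeping/blocked ones       */
        /* ... */
    };

    struct csched2_vcpu {
        struct list_head rqd_elem;  /* links the vcpu on rqd->svc            */
        struct list_head runq_elem; /* links the vcpu on rqd->runq           */
        /* ... credit, weight, etc. */
    };

With the original list a sleeping vcpu also gets its credit topped back up at
reset_credit() time, which is the "complement credit for sleepers" behaviour
discussed above; with the runq list a sleeper keeps whatever credit it had,
which, I guess, is why there is less preemption at wake-up in my test.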

Best regards.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

