Re: [Xen-devel] [BUG] mistakenly wake in Xen's credit scheduler
On Tue, Oct 27, 2015 at 3:44 AM, George Dunlap <dunlapg@xxxxxxxxx> wrote:
> On Tue, Oct 27, 2015 at 5:59 AM, suokun <suokunstar@xxxxxxxxx> wrote:
>> Hi all,
>>
>> The BOOST mechanism in Xen credit scheduler is designed to prioritize
>> VM which has I/O-intensive application to handle the I/O request in
>> time. However, this does not always work as expected.
>
> Thanks for the exploration, and the analysis.
>
> The BOOST mechanism is part of the reason I began to write the credit2
> scheduler, which we are hoping (any day now) to make the default
> scheduler. It was designed specifically with the workload you mention
> in mind. Would you care to try your test again and see how it fares?
>
Hi George,
Thank you for your reply. I tested credit2 this morning. The I/O
performance is as expected; however, the CPU accounting does not look
right. Here is my experiment on credit2:
VM-IO: 1 vCPU pinned to a pCPU, running netperf
VM-CPU: 1 vCPU pinned to the same pCPU, running a while(1) loop
The throughput of netperf is the same (941 Mbps) as when VM-IO runs
alone. However, when I use xl top to show the VM CPU utilization, VM-IO
takes 73% of the CPU time and VM-CPU takes 99%; their sum is more than
100%. I suspect this is due to the CPU utilization accounting in the
credit2 scheduler.
> Also, do you have a patch to fix it in credit1? :-)
>
As for a patch for this problem in credit1, I have two ideas:
1) if the vCPU cannot migrate (e.g. it is pinned, constrained by CPU
affinity, or there is only one physical CPU), do not set the
_VPF_migrating flag;
2) allow BOOST-priority vCPUs to preempt each other.
I have tested both separately and they both work, but personally I
prefer the first option because it solves the problem at its source.
A rough, untested sketch of option 1) follows.
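The sketch below modifies sched_credit.c's __runq_tickle(); the field
names follow Xen 4.6, and the cpumask_weight() test on the hard affinity
is just one possible way to detect a vCPU that has nowhere else to run:

static inline void __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
{
    ...
    if ( new_idlers_empty && new->pri > cur->pri )
    {
        SCHED_STAT_CRANK(tickle_idlers_none);

        /*
         * Only kick 'cur' away if it can actually run somewhere else.
         * A vCPU whose hard affinity contains only this pCPU would just
         * be woken up again right here and mistakenly BOOSTed.
         */
        if ( cpumask_weight(cur->vcpu->cpu_hard_affinity) > 1 )
        {
            SCHED_VCPU_STAT_CRANK(cur, kicked_away);
            SCHED_VCPU_STAT_CRANK(cur, migrate_r);
            SCHED_STAT_CRANK(migrate_kicked_away);
            set_bit(_VPF_migrating, &cur->vcpu->pause_flags);
        }

        /* Still tickle this pCPU so that 'new' can preempt 'cur'. */
        __cpumask_set_cpu(cpu, &mask);
    }
}

With this change the vCPU of VM-CPU is simply preempted instead of being
marked for a migration that cannot happen, so it is never spuriously
woken and BOOSTed.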
Best
Tony
> -George
>
>>
>>
>> (1) Problem description
>> --------------------------------
>> Suppose two VMs (named VM-I/O and VM-CPU) each have one virtual CPU
>> and both vCPUs are pinned to the same physical CPU. An I/O-intensive
>> application (e.g. netperf) runs in VM-I/O and a CPU-intensive
>> application (e.g. a busy loop) runs in VM-CPU. When a client sends
>> I/O requests to VM-I/O, its vCPU gets no benefit from the BOOST state
>> and obtains very few CPU cycles (less than 1% in Xen 4.6). Both
>> throughput and latency are terrible.
>>
>>
>>
>> (2) Problem analysis
>> --------------------------------
>> This problem is due to the wake-up mechanism in Xen: the CPU-intensive
>> vCPU can be woken up and BOOSTed by mistake.
>>
>> Suppose the vCPU of VM-CPU is running when an I/O request arrives; the
>> currently running vCPU (the vCPU of VM-CPU) will be marked with
>> _VPF_migrating in __runq_tickle():
>>
>> static inline void __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
>> {
>>     ...
>>     if ( new_idlers_empty && new->pri > cur->pri )
>>     {
>>         SCHED_STAT_CRANK(tickle_idlers_none);
>>         SCHED_VCPU_STAT_CRANK(cur, kicked_away);
>>         SCHED_VCPU_STAT_CRANK(cur, migrate_r);
>>         SCHED_STAT_CRANK(migrate_kicked_away);
>>         set_bit(_VPF_migrating, &cur->vcpu->pause_flags);
>>         __cpumask_set_cpu(cpu, &mask);
>>     }
>> }
>>
>>
>> The next time a context switch happens with the vCPU of VM-CPU as
>> prev, context_saved(prev) will be executed. Because that vCPU has been
>> marked with _VPF_migrating, it will then be woken up:
>>
>> void context_saved(struct vcpu *prev)
>> {
>>     ...
>>
>>     if ( unlikely(test_bit(_VPF_migrating, &prev->pause_flags)) )
>>         vcpu_migrate(prev);
>> }
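>>
>> vcpu_migrate() then asks the scheduler for a destination pCPU; because
>> the vCPU of VM-CPU is pinned, the only possible destination is the
>> pCPU it is already on, and the "migration" simply ends with a wake-up
>> there. A simplified sketch of that path (not verbatim Xen code):
>>
>> static void vcpu_migrate(struct vcpu *v)
>> {
>>     ...
>>     /* Ask the scheduler (pick_cpu hook, csched_cpu_pick() for credit)
>>      * for a destination; a pinned vCPU can only get the pCPU it is
>>      * already running on. */
>>     new_cpu = SCHED_OP(VCPU2OP(v), pick_cpu, v);
>>     ...
>>     /* The "migration" finishes by waking the vCPU on that pCPU. */
>>     vcpu_wake(v);
>> }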
>>
>> Since the vCPU of VM-CPU is in the UNDER state when it is woken, it
>> will be promoted to the BOOST state, which was originally designed for
>> I/O-intensive vCPUs. Once this happens, even though the vCPU of
>> VM-I/O also becomes BOOST, it cannot get the physical CPU immediately;
>> it has to wait until the vCPU of VM-CPU is scheduled out. That harms
>> I/O performance significantly.
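>>
>> The promotion happens in the credit scheduler's wake handler, where a
>> waking vCPU whose priority is UNDER is bumped to BOOST. Abridged from
>> sched_credit.c (exact conditions may differ slightly between versions):
>>
>> static void csched_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
>> {
>>     struct csched_vcpu * const svc = CSCHED_VCPU(vc);
>>     const unsigned int cpu = vc->processor;
>>     ...
>>     /* Waking vCPUs are temporarily boosted so that I/O-bound vCPUs can
>>      * run promptly -- but here the "waking" vCPU is the CPU hog that
>>      * __runq_tickle() kicked away. */
>>     if ( svc->pri == CSCHED_PRI_TS_UNDER &&
>>          !test_bit(CSCHED_FLAG_VCPU_PARKED, &svc->flags) )
>>         svc->pri = CSCHED_PRI_TS_BOOST;
>>
>>     /* Put the vCPU on the runq and tickle pCPUs. */
>>     __runq_insert(cpu, svc);
>>     __runq_tickle(cpu, svc);
>> }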
>>
>>
>>
>> (3) Our Test results
>> --------------------------------
>> Hypervisor: Xen 4.6
>> Dom 0 & Dom U: Linux 3.18
>> Client: Linux 3.18
>> Network: 1 Gigabit Ethernet
>>
>> Throughput:
>> Only VM-I/O: 941 Mbps
>> co-Run VM-I/O and VM-CPU: 32 Mbps
>>
>> Latency:
>> Only VM-I/O: 78 usec
>> co-Run VM-I/O and VM-CPU: 109093 usec
>>
>>
>>
>> This bug has been present since Xen 4.2 and still exists in the latest Xen 4.6.
>> Thanks.
>> Reported by Tony Suo and Yong Zhao from UCCS
>>
>> --
>>
>> **********************************
>> Tony Suo
>> Email: suokunstar@xxxxxxxxx
>> University of Colorado at Colorado Springs
>> **********************************
>>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel