
Re: [Xen-devel] [BUG] mistaken wakeup in Xen's credit scheduler



On Tue, Oct 27, 2015 at 5:59 AM, suokun <suokunstar@xxxxxxxxx> wrote:
> Hi all,
>
> The BOOST mechanism in Xen's credit scheduler is designed to
> prioritize VMs that run I/O-intensive applications, so that they can
> handle I/O requests in time. However, it does not always work as
> expected.

Thanks for the exploration, and the analysis.

The BOOST mechanism is part of the reason I began to write the credit2
scheduler, which we are hoping (any day now) to make the default
scheduler.  It was designed specifically with the workload you mention
in mind.  Would you care to try your test again and see how it fares?

Also, do you have a patch to fix it in credit1? :-)

 -George

>
>
> (1) Problem description
> --------------------------------
> Suppose two VMs (call them VM-I/O and VM-CPU) each have one virtual
> CPU, and both are pinned to the same physical CPU. An I/O-intensive
> application (e.g. Netperf) runs in VM-I/O and a CPU-intensive
> application (e.g. an infinite loop) runs in VM-CPU. When a client
> sends I/O requests to VM-I/O, its vCPU does not reach the BOOST state
> and obtains very few CPU cycles (less than 1% in Xen 4.6). Both
> throughput and latency are terrible.
>
>
>
> (2) Problem analysis
> --------------------------------
> This problem is due to the wakeup mechanism in Xen: a CPU-intensive
> vCPU can be woken up and boosted by mistake.
>
> Suppose the vCPU of VM-CPU is running when an I/O request arrives.
> The currently running vCPU (i.e. the vCPU of VM-CPU) will be marked
> with _VPF_migrating in __runq_tickle():
>
> static inline void __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
> {
> ...
>            /*
>             * The newly woken vCPU (new) has a higher priority than
>             * the currently running vCPU (cur) and no pCPU is idle:
>             * kick cur away so that new can run here.
>             */
>            if ( new_idlers_empty && new->pri > cur->pri )
>            {
>                SCHED_STAT_CRANK(tickle_idlers_none);
>                SCHED_VCPU_STAT_CRANK(cur, kicked_away);
>                SCHED_VCPU_STAT_CRANK(cur, migrate_r);
>                SCHED_STAT_CRANK(migrate_kicked_away);
>                set_bit(_VPF_migrating, &cur->vcpu->pause_flags);
>                __cpumask_set_cpu(cpu, &mask);
>            }
> }
>
>
> The next time a context switch happens with the vCPU of VM-CPU as
> prev, context_saved(prev) is executed. Because that vCPU has been
> marked with _VPF_migrating, vcpu_migrate() is called on it:
>
> void context_saved(struct vcpu *prev)
> {
>     ...
>
>     /* A vCPU kicked away by __runq_tickle() takes this path. */
>     if ( unlikely(test_bit(_VPF_migrating, &prev->pause_flags)) )
>         vcpu_migrate(prev);
> }
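>
> For reference, here is a simplified sketch of the tail of
> vcpu_migrate() (paraphrased from xen/common/schedule.c; the exact
> code varies a little across versions). Note that it clears
> _VPF_migrating and then calls vcpu_wake(), so by the time the credit
> scheduler sees the wakeup it cannot tell it apart from a genuine I/O
> wakeup:
>
> static void vcpu_migrate(struct vcpu *v)
> {
>     ...
>     /* Nothing to do if the vCPU is still running or was never
>      * actually marked for migration; otherwise clear the flag. */
>     if ( v->is_running ||
>          !test_and_clear_bit(_VPF_migrating, &v->pause_flags) )
>     {
>         ...
>         return;
>     }
>
>     ...
>     /* Wake on the (possibly unchanged) destination pCPU. By now the
>      * _VPF_migrating flag is gone, so this wakeup looks exactly like
>      * an I/O wakeup to the scheduler. */
>     vcpu_wake(v);
> }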
>
> When vcpu_wake() runs, the credit scheduler promotes any waking vCPU
> whose state is UNDER to the BOOST state, which was designed
> originally for I/O-intensive vCPUs. So the vCPU of VM-CPU gets
> boosted by mistake. Once this happens, even though the vCPU of VM-I/O
> also becomes BOOST, it cannot get the physical CPU immediately; it
> has to wait until the vCPU of VM-CPU is scheduled out. That harms I/O
> performance significantly.
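>
> The promotion happens in the credit scheduler's wake handler. Below
> is a simplified sketch of the relevant logic in csched_vcpu_wake()
> (paraphrased from xen/common/sched_credit.c; the exact code differs
> slightly between versions). Nothing in it distinguishes a
> migration-induced wakeup from an I/O-induced one:
>
> static void csched_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
> {
>     struct csched_vcpu * const svc = CSCHED_VCPU(vc);
>     const unsigned int cpu = vc->processor;
>     ...
>     /*
>      * Temporarily boost the priority of awaking vCPUs. A vCPU woken
>      * by vcpu_migrate() takes this branch just like one woken by an
>      * I/O event.
>      */
>     if ( svc->pri == CSCHED_PRI_TS_UNDER &&
>          !test_bit(CSCHED_FLAG_VCPU_PARKED, &svc->flags) )
>     {
>         svc->pri = CSCHED_PRI_TS_BOOST;
>     }
>
>     /* Put the vCPU on the runqueue and tickle pCPUs. */
>     __runq_insert(cpu, svc);
>     __runq_tickle(cpu, svc);
> }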
>
>
>
> (3) Our Test results
> --------------------------------
> Hypervisor: Xen 4.6
> Dom 0 & Dom U: Linux 3.18
> Client: Linux 3.18
> Network: 1 Gigabit Ethernet
>
> Throughput:
> Only VM-I/O: 941 Mbps
> co-Run VM-I/O and VM-CPU: 32 Mbps
>
> Latency:
> Only VM-I/O: 78 usec
> co-Run VM-I/O and VM-CPU: 109093 usec
>
>
>
> This bug has existed since Xen 4.2 and is still present in the latest Xen 4.6.
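>
> One possible direction for a fix (an untested sketch on our side, not
> an actual patch): let the migration path tell the scheduler that a
> wakeup is migration-related, so that csched_vcpu_wake() can skip the
> BOOST promotion. The flag CSCHED_FLAG_VCPU_MIGRATING below is
> invented for illustration; it would have to be set before the
> vcpu_wake() call in vcpu_migrate() and cleared afterwards:
>
>     /* In csched_vcpu_wake(): do not boost a vCPU that is only being
>      * woken because it was kicked away for migration. */
>     if ( svc->pri == CSCHED_PRI_TS_UNDER &&
>          !test_bit(CSCHED_FLAG_VCPU_PARKED, &svc->flags) &&
>          !test_bit(CSCHED_FLAG_VCPU_MIGRATING, &svc->flags) )
>     {
>         svc->pri = CSCHED_PRI_TS_BOOST;
>     }
>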
> Thanks.
> Reported by Tony Suo and Yong Zhao from UCCS
>
> --
>
> **********************************
> Tony Suo
> Email: suokunstar@xxxxxxxxx
> University of Colorado at Colorado Springs
> **********************************

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

