Xen project Mailing List

Re: [Xen-devel] [RFC] Add static priority into credit scheduler

To: "Su, Disheng" <disheng.su@xxxxxxxxx>

From: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>

Date: Fri, 20 Mar 2009 12:42:47 +0000

Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, NISHIGUCHI Naoki <nisiguti@xxxxxxxxxxxxxx>

Delivery-date: Fri, 20 Mar 2009 05:43:16 -0700

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

So, just to be clear: are you proposing that this mechanism *might* be useful for a VM with real-time scheduling requirements? Or are you actually developing real-time operating systems, and suggesting this in order to support real-time VMs?

I'm not an expert in real-time scheduling, but it doesn't seem to me that this is really what a real-time system would want. (Feel free to contradict me if you know better.) It might work OK if there were only a single real-time PV guest, but in the face of competition, you'd have trouble. It seems like an actual real-time Xen scheduler would want the PV guests to submit deadlines to Xen, and then Xen could decide which deadlines to drop if it needs to (based on some mechanism).

The only test you've measured is networking; but networking isn't a "real-time" workload, it's a latency-sensitive workload. And you haven't measured:
* The effect on network traffic if you have several high-priority VMs competing
* The effect on network traffic of non-prioritized VMs if a high-priority VM is receiving traffic, or is misbehaving

You also haven't compared how raising a VM's priority within the current credit framework, such as giving it a very high weight, affects the numbers. Can you get similar results if you give the "latency-sensitive" VMs a weight of, say, 10000, and leave the other ones at 256?

Overall, I don't think fixed priorities like this are a good solution: I think they will create more problems than they solve, and I think it's actually harder to predict how a complex system will behave (and thus harder to configure properly).
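
To put the weight comparison in rough numbers, here is a minimal sketch of what a weight of 10000 against 256 would mean under pure proportional sharing. It assumes one latency-sensitive VM and six others (matching the 7-VM test setup in the quoted message), that every VM is CPU-bound, and that there are no caps; the function and VM names are made up for illustration, and this is not how the credit scheduler's accounting code is actually written.

```python
def proportional_shares(weights):
    """Return each VM's fraction of CPU under idealized weight-proportional sharing.

    Assumption: every VM always wants CPU and no caps are set, so a VM's
    share is simply its weight divided by the sum of all weights.
    """
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical setup: one "latency-sensitive" VM at weight 10000,
# six other VMs left at the default weight of 256.
weights = {"latency_vm": 10000}
weights.update({f"vm{i}": 256 for i in range(1, 7)})

shares = proportional_shares(weights)

if __name__ == "__main__":
    for name, share in shares.items():
        print(f"{name}: {share:.1%}")
```

Under those assumptions the high-weight VM is entitled to roughly 87% of the contended CPU time and each of the others to a little over 2%. Whether that entitlement translates into latency comparable to a fixed priority is exactly the measurement being asked for.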
I think the proper solution (and I'm working on a "credit2" scheduler that has these properties) is:
* Fix the credit assignment, so that VMs don't spend very much time in "over"
* Give VMs that wake up and are under their credits a fixed "boost" period (e.g., 1ms)
* Allow users to specify a cpu "reservation", so that no matter how much work there is on the system, a VM can be guaranteed to get a minimum fixed amount of the cpu if it wants it; e.g., dom0 always gets 50% of one core if it wants it, no matter how many other VMs are on the system.

#1 and #2 have resulted in significant improvements in TCP throughput in the face of competition. I hope to publish a draft here on the list sometime soon, but I'm still working out some of the details.

-George Dunlap

2009/3/20 Su, Disheng <disheng.su@xxxxxxxxx>:
> Hi all,
> Attached patches add static priority into the credit scheduler.
> Currently, the credit scheduler has 4 kinds of priority: BOOST, UNDER,
> OVER and IDLE. The priority of a VM changes dynamically according to the
> credit of the VM or I/O events, and the highest-priority VM is chosen to
> be scheduled in for each scheduling period. Because the priority is not
> fixed, which VM will be scheduled in is not known in advance. The I/O
> latency caused by the scheduler is well analyzed in [1] and [2]. They
> provide ways to reduce I/O latency while retaining CPU and I/O fairness
> between VMs to some extent.
> There are some cases where reducing latency is much preferable to CPU
> or I/O fairness, such as an RTOS guest or a VM with an assigned (audio)
> device. The straightforward way is to set a static (fixed) highest
> priority for this VM, to make sure it is scheduled each time. The
> attached patches implement this kind of mechanism, like
> SCHED_RR/SCHED_FIFO in Linux.
>
> How it works?
> --Users can set an RT priority (between 1~100) for domains. The larger
> the number, the higher the priority. Users can also change an RT domain
> into a non-RT domain by setting its priority to a value outside 1~100.
> --The scheduler always chooses the highest-priority domain to run among
> RT domains; nothing changes for non-RT domains. If RT domains have the
> same priority, the scheduler round-robins between these domains every
> 30ms. 30ms is the default scheduling period; it can be changed to 2ms or
> another value if needed.
> --There is still accounting for the currently running non-RT vcpu every
> 10ms, and accounting for all non-RT domains every 30ms, as the credit
> scheduler did before.
>
> Implementation details:
> -- In order to minimize the modification to the credit scheduler, one
> additional rt runqueue per pcpu is added, and one rt active domain list
> is added in csched_private. RT vcpus are added into the rt runqueue of
> the running pcpu, and rt domains are added into the rt active domain
> list.
> -- The scheduler always chooses the highest priority in the rt runqueue
> first if it's not empty, and otherwise chooses from the normal runqueue.
> --__runq_insert/__runq_remove are changed to be based on the priority of
> the vcpu.
> -- Vcpu accounting only takes effect on the non-RT vcpus, as before.
> Non-RT vcpus proportionally share the rest of the cpu based on their
> weight. The total weight changes when RT domains are added or removed;
> e.g., when promoting a non-RT domain to an RT domain, the weight of that
> domain is subtracted from the total weight.
>
> How to use it:
> set priority(y) of a VM(x) by: "xm sched-credit -d x -p y"
>
> Test results:
> I did some tests with these patches using the following configuration:
> CPU: Intel Core 2 Duo E6850, Xen(1881), 7 VMs created on one physical
> machine A; the VMs ping each other in pairs, and the remaining VM has RT
> priority. Another physical machine B connects to it directly through a
> 1G network card. The tests are conducted from B to A, e.g. ping A from B.
> Some test results are uploaded to
> http://wiki.xensource.com/xenwiki/DishengSu, FYI.
>
> Summary:
> These patches minimize the scheduling latency, at the cost of CPU or I/O
> fairness.
> It can be used as a scheduler for RT guests in some cases (such as when
> RT guests and non-RT guests co-exist). However, there are still many
> areas to improve for real-time response, such as interrupt latency and
> the Xen I/O model [3].
> Any comments are appreciated. Thanks!
>
> ---------------------
> [1] Scheduling I/O in Virtual Machine Monitors
> [2] Evaluation and Consideration of the Credit Scheduler for Client
> Virtualization
> [3] A step to support real-time in virtual machine
>
> Best Regards,
> Disheng, Su
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
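
The selection rule described in the quoted proposal (serve the rt runqueue first, highest priority first, round-robin among equal priorities, otherwise fall back to the normal runqueue) can be sketched roughly as below. This is an illustrative model only, not the code from the patches: the `Vcpu` class, `runq_insert`, and `pick_next` are hypothetical names, and the normal runqueue is reduced to a plain FIFO list rather than the BOOST/UNDER/OVER ordering the credit scheduler uses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Vcpu:
    name: str
    rt_priority: int = 0  # 1..100 means RT; any other value means non-RT


def is_rt(vcpu: Vcpu) -> bool:
    """A vcpu is RT only if its priority is within 1..100."""
    return 1 <= vcpu.rt_priority <= 100


def runq_insert(runq: List[Vcpu], vcpu: Vcpu) -> None:
    """Insert by descending priority, behind any vcpus of equal priority.

    Queuing behind equals is what yields round-robin among RT vcpus
    that share the same priority.
    """
    i = 0
    while i < len(runq) and runq[i].rt_priority >= vcpu.rt_priority:
        i += 1
    runq.insert(i, vcpu)


def pick_next(rt_runq: List[Vcpu], normal_runq: List[Vcpu]) -> Optional[Vcpu]:
    """Take the head of the rt runqueue if it is non-empty, else the normal one."""
    if rt_runq:
        return rt_runq.pop(0)
    if normal_runq:
        return normal_runq.pop(0)
    return None  # nothing runnable; a real scheduler would run the idle vcpu
```

A caller would re-insert the vcpu with `runq_insert` when its time slice ends. Note that in this model any runnable RT vcpu starves every non-RT vcpu, which is the fairness trade-off the proposal's summary acknowledges.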
