
[Xen-devel] Introduce rt real-time scheduler for Xen

This series of patches adds the rt real-time scheduler to Xen.

In summary, it supports:
1) The Preemptive Global Earliest Deadline First (GEDF) scheduling policy, using 
a global RunQ for the scheduler;
2) Assigning and displaying the parameters of each VCPU of each domain;
3) CPU pools.

The design of this rt scheduler is as follows:
This rt scheduler follows the Preemptive Global Earliest Deadline First (GEDF) 
theory from the real-time field.
Each VCPU can have a dedicated period and budget. While scheduled, a VCPU burns 
its budget. Each VCPU has its budget replenished at the beginning of each of 
its periods and discards any unused budget at the end of each of its periods. 
If a VCPU runs out of budget in a period, it has to wait until its next period.
How a VCPU's budget is burned depends on the server mechanism implemented for 
each VCPU.
The priority of VCPUs at each scheduling point is decided by the Preemptive 
Global Earliest Deadline First scheduling scheme.

Server mechanism: a VCPU is implemented as a deferrable server.
When a VCPU has a task running on it, its budget is continuously burned;
when a VCPU has no task to run but still has budget left, its budget is 
preserved.
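
The budget rules above can be sketched roughly as follows. This is only a 
minimal illustration in Python, not the code from the patches: the VCPU class, 
its field names, and the replenish()/burn() helpers are all hypothetical, and 
times are assumed to be in microseconds.

```python
from dataclasses import dataclass


@dataclass
class VCPU:
    """Hypothetical model of one VCPU's server state (times in microseconds)."""
    period: int
    budget: int
    cur_budget: int = 0
    deadline: int = 0  # end of the current period

    def replenish(self, now: int) -> None:
        """At a period boundary, drop leftover budget and grant a full one."""
        if now >= self.deadline:
            # Move the deadline to the end of the period that contains `now`.
            periods_passed = (now - self.deadline) // self.period + 1
            self.deadline += periods_passed * self.period
            self.cur_budget = self.budget

    def burn(self, ran_for: int, had_task: bool) -> None:
        """Deferrable server: budget is consumed only while a task runs."""
        if had_task:
            self.cur_budget = max(0, self.cur_budget - ran_for)
```

With period=10000 and budget=4000, a VCPU that idles keeps its 4000 us until 
the period ends, while one that runs for 4000 us has to wait for the next 
replenishment.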

Priority scheme: Global Earliest Deadline First (GEDF).
At any scheduling point, the VCPU with the earliest deadline has the highest 
priority.
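
As a rough illustration of the priority rule (a sketch only, not the patch 
code), the function below picks which VCPUs would run on a pool with a given 
number of CPUs. The pick_gedf() name and the (name, deadline, remaining_budget) 
tuple layout are assumptions made for this example.

```python
import heapq


def pick_gedf(vcpus, ncpus):
    """Return up to `ncpus` entries with the earliest deadlines.

    `vcpus` is a list of hypothetical (name, deadline, remaining_budget)
    tuples; entries with no remaining budget are not eligible to run.
    """
    eligible = [v for v in vcpus if v[2] > 0]
    # Earliest deadline first; the result is ordered by increasing deadline.
    return heapq.nsmallest(ncpus, eligible, key=lambda v: v[1])
```

Because the choice is global, a VCPU is not tied to one CPU: whichever VCPUs 
hold the earliest deadlines in the pool get the available CPUs.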

Queue scheme: A global runqueue for each CPU pool.
The runqueue holds all runnable VCPUs.
VCPUs in the runqueue are divided into two parts: with and without remaining 
budget.
Within each part, VCPUs are sorted based on the GEDF priority scheme.
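
One way to picture this ordering (a sketch under assumed names, not the actual 
runqueue implementation) is a single sort with a two-part key: first whether 
the budget is depleted, then the deadline. The dict keys "cur_budget" and 
"deadline" are hypothetical.

```python
def sort_runq(vcpus):
    """Order VCPUs as described: those with budget first, then depleted ones.

    Each element is a hypothetical dict with "cur_budget" and "deadline" keys.
    Within each part, the earliest deadline comes first.
    """
    return sorted(vcpus, key=lambda v: (v["cur_budget"] <= 0, v["deadline"]))
```

The head of such a queue is then always the highest-priority VCPU that can 
still run, and depleted VCPUs sit behind it until they are replenished.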

Scheduling quantum: 1 ms; budget accounting, however, is done in microseconds.

One scenario to show the functionality of this rt scheduler is as follows:
//list each vcpu's parameters of each domain in cpu pools using rt scheduler
#xl sched-rt
Cpupool Pool-0: sched=EDF
Name                                ID VCPU Period Budget
Domain-0                             0    0     10     10
Domain-0                             0    1     20     20
Domain-0                             0    2     30     30
Domain-0                             0    3     10     10
litmus1                              1    0     10      4
litmus1                              1    1     10      4

//set the parameters of the vcpu 1 of domain litmus1:
# xl sched-rt -d litmus1 -v 1 -p 20 -b 10

//the parameters of vcpu 1 of domain litmus1 are changed; display the
//parameters of each VCPU of the domain separately:
# xl sched-rt -d litmus1
Name                                ID VCPU Period Budget
litmus1                              1    0     10      4
litmus1                              1    1     20     10

// list cpupool information
# xl cpupool-list
Name               CPUs   Sched     Active   Domain count
Pool-0              12        rt       y          2

//create a cpupool test
#xl cpupool-cpu-remove Pool-0 11
#xl cpupool-cpu-remove Pool-0 10
#xl cpupool-create name=\"test\" sched=\"credit\"
#xl cpupool-cpu-add test 11
#xl cpupool-cpu-add test 10
#xl cpupool-list
Name               CPUs   Sched     Active   Domain count
Pool-0              10        rt       y          2
test                 2    credit       y          0

//migrate litmus1 from cpupool Pool-0 to cpupool test.
#xl cpupool-migrate litmus1 test

//now litmus1 is in cpupool test
# xl sched-credit
Cpupool test: tslice=30ms ratelimit=1000us
Name                                ID Weight  Cap
litmus1                              1    256    0

The differences between this new rt real-time scheduler and the sedf scheduler 
are as follows:
1) rt scheduler supports global EDF scheduling, while sedf only supports 
partitioned scheduling. With the support of the vcpu mask, rt scheduler can 
also be used for partitioned scheduling by setting each VCPU's cpumask to a 
specific cpu.
2) rt scheduler supports setting and getting the parameters of each VCPU of a 
domain. A domain can have multiple vcpus with different parameters, and rt 
scheduler lets the user get/set the parameters of each VCPU of a specific 
domain (the sedf scheduler does not support this now).
3) rt scheduler supports cpupool.
4) rt scheduler uses a deferrable server to burn/replenish the budget of a 
VCPU, while sedf uses a constant bandwidth server (CBS) to do so. These are 
just two options for implementing a global EDF real-time scheduler, and the 
real-time performance of both options has already been demonstrated in 
academic work.

(Briefly speaking, the functionality that the *SEDF* scheduler plans to 
implement and improve in future releases is already supported by this rt 
scheduler.)
(Although it's unnecessary to implement two server mechanisms, we can simply 
modify the two functions that burn and replenish a vcpu's budget to 
incorporate the CBS server mechanism or other server mechanisms into this rt 
scheduler.)

TODO:
1) Improve the code for getting/setting each VCPU's parameters. [easy]
    Right now, it creates an array with LIBXL_XEN_LEGACY_MAX_VCPUS (i.e., 32) 
elements to bounce the parameters of all VCPUs of a domain between the xen 
tool and xen. It is unnecessary for this array to have 
LIBXL_XEN_LEGACY_MAX_VCPUS elements.
    The current work is to first get the exact number of VCPUs of a domain and 
then create an array with exactly that number of elements to bounce between 
the xen tool and xen.
2) Provide microsecond time precision in the xl interface instead of 
millisecond time precision. [easy]
    Right now, rt scheduler lets the user specify each VCPU's parameters 
(period, budget) in milliseconds (i.e., ms). In some real-time applications, 
the user may want to specify VCPUs' parameters in microseconds (i.e., us). The 
next work is to let the user specify VCPUs' parameters in microseconds and to 
count time in microseconds (or nanoseconds) in the xen rt scheduler as well.
3) Add Xen tracing to the rt scheduler. [easy]
    We will add a few xentrace tracepoints to the rt scheduler, like 
TRC_CSCHED2_RUNQ_POS in the credit2 scheduler, to allow debugging via tracing.
4) Method of improving the performance of rt scheduler [future work]
    VCPUs of the same domain may preempt each other under the preemptive 
global EDF scheduling policy. This self-switch issue does not bring any 
benefit to the domain but introduces more overhead. When this situation 
happens, we can simply promote the priority of the currently running 
lower-priority VCPU and let it borrow budget from higher-priority VCPUs to 
avoid such a self-switch.

Timeline of implementing the TODOs:
We plan to finish TODOs 1), 2) and 3) within 3-4 weeks (or earlier).
Because TODO 4) will make the scheduling policy no longer pure GEDF (people who 
want real GEDF may not be happy with this), we look forward to hearing 
people's opinions.

Special huge thanks to Dario Faggioli for his helpful and detailed comments on 
the preview version of this rt scheduler. :-)

Any comments, questions, and concerns are more than welcome! :-)

Thank you very much!


[PATCH RFC v1 1/4] rt: Add rt scheduler to hypervisor
[PATCH RFC v1 2/4] xl for rt scheduler
[PATCH RFC v1 3/4] libxl for rt scheduler
[PATCH RFC v1 4/4] libxc for rt scheduler

Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
