[Xen-devel] Scheduling anomaly with 4.0.0 (rc6)
I've been running some heavy testing on a recent Xen 4.0 snapshot and seeing a strange scheduling anomaly that I thought I should report. I don't know if this is a regression... I suspect not.

System is a Core 2 Duo (Conroe). Load is four 2-VCPU EL5u4 guests, two of which are 64-bit and two of which are 32-bit. Otherwise they are identical. All four are running a sequence of three Linux compiles with (make -j8 clean; make -j8). All are started approximately concurrently: I synchronize the start of the test after all domains are launched with an external NFS semaphore file that is checked every 30 seconds.

What I am seeing is a rather large discrepancy in the amount of CPU time consumed "underway" by the four domains as reported by xentop and xm list. I have seen this repeatedly, but the numbers in front of me right now are:

    1191s  dom0
    3182s  64-bit #1
    2577s  64-bit #2  <-- 20% less!
    4316s  32-bit #1
    2667s  32-bit #2  <-- 40% less!

Again these are identical workloads and the pairs are identical released kernels running from identical "file"-based virtual block devices containing released distros. Much of my testing had been with tmem and self-ballooning so I had blamed them for a while, but I have reproduced it multiple times with both of those turned off.

At start and after each kernel compile, I record a timestamp, so I know the same work is being done. Eventually the workload finishes on each domain and intentionally crashes the kernel so measurement is stopped. At the conclusion, the 64-bit pair have very similar total CPU sec and the 32-bit pair have very similar total CPU sec so eventually (presumably when the #1's are done hogging CPU), the "slower" domains do finish the same amount of work. As a result, it is hard to tell from just the final results that the four domains are getting scheduled at very different rates.

Does this seem like a scheduler problem, or are there other explanations? Anybody care to try to reproduce it?
Unfortunately, I have to use the machine now for other work.

P.S. According to xentop, there is almost no network activity, so it is all CPU and VBD. And the ratio of VBD activity looks to be approximately the same ratio as CPU(sec).

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
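
For anyone trying to reproduce this, the mid-run gap quoted in the message can be checked with a few lines of Python. This is only a sketch of the arithmetic: the CPU-second figures are copied from the message, while the dictionary labels and the helper names `percent_less` and `pair_gaps` are made up for illustration and are not part of any Xen tool.

```python
# CPU seconds copied from the message; the labels are shorthand for this sketch.
cpu_seconds = {
    "dom0": 1191,
    "64-bit #1": 3182,
    "64-bit #2": 2577,
    "32-bit #1": 4316,
    "32-bit #2": 2667,
}


def percent_less(reference, other):
    """Return how much smaller `other` is than `reference`, in percent."""
    return 100.0 * (reference - other) / reference


def pair_gaps(times):
    """Compare the #2 guest against the #1 guest for each guest width."""
    gaps = {}
    for bits in ("64-bit", "32-bit"):
        gaps[bits] = percent_less(times[f"{bits} #1"], times[f"{bits} #2"])
    return gaps


if __name__ == "__main__":
    for bits, gap in pair_gaps(cpu_seconds).items():
        print(f"{bits}: #2 has used {gap:.1f}% less CPU time than #1")
```

With the figures above this gives roughly 19% for the 64-bit pair and 38% for the 32-bit pair, in line with the rounded "20%" and "40%" annotations in the message.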