[Xen-devel] Time skew on HP DL785 (and possibly other boxes)
(Raising a yellow flag because this could turn into a serious issue for Xen and it may take quite a bit of work to come up with a solution.)

We recently measured Xen system time skew on an HP DL785 and found it to be horrible: nearly a quarter of a millisecond in the worst case (and that is from only about 10,000 samples, so the true worst case may be larger).

This box uses 8 quad-core AMD chips connected via HyperTransport, BUT each chip is on a separate motherboard. On this system HyperTransport is fast, and cross-node memory accesses are fast enough that these NUMA systems need not behave like NUMA systems from a memory access perspective. So Xen just views the system as a 32-CPU box (other than some code in the memory allocator that tries to allocate near-memory where possible, but silently falls back to far-memory if necessary) and guest vcpus migrate freely between the nodes. (Correct?)

However, I'm told that it's not possible to route a clocksource over HyperTransport, so TSCs on processors on different motherboards may be VERY different, and apparently the mechanisms for synchronizing Xen system time across motherboards may not be up to the challenge.

As a result, OSes and apps sensitive to time that are running on PV domains may be in for a rough ride on systems like this. (HVM domains may run into other problems because time will apparently stop for a "long time".)

Since systems like this are targeted for consolidation and virtualization, I see this as a potentially big problem, as it may appear to real Xen customers as bizarre non-reproducible problems, such as "make" failing, leading to questions about the stability and viability of using Xen.

Comments?

Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
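To illustrate the failure mode described above, here is a minimal sketch (not Xen's timekeeping code) of what a guest might observe when a vcpu migrates between CPUs whose clocks disagree. The function name `backward_steps`, the migration schedule, and the 250 µs offset are all assumptions chosen for illustration; the offset merely echoes the "nearly a quarter millisecond" figure from the message.

```python
def backward_steps(schedule, offsets_us, tick_us=1.0):
    """Count how often observed time goes backwards.

    schedule   -- hypothetical list of CPU indices, one per time read,
                  describing where the vcpu runs at each read
    offsets_us -- assumed per-CPU clock offset from "true" time, in microseconds
    tick_us    -- true time elapsed between consecutive reads, in microseconds

    Returns (number_of_backward_steps, largest_backward_step_us).
    """
    true_time = 0.0
    last_seen = None
    count = 0
    worst = 0.0
    for cpu in schedule:
        true_time += tick_us
        # The guest sees true time plus the skew of whichever CPU it is on.
        seen = true_time + offsets_us[cpu]
        if last_seen is not None and seen < last_seen:
            count += 1
            worst = max(worst, last_seen - seen)
        last_seen = seen
    return count, worst


# Two CPUs on different boards, the second one 250 us ahead (assumed value).
skewed = backward_steps([0, 0, 1, 1, 0, 1, 0], [0.0, 250.0])
# Same migration pattern with perfectly synchronized clocks.
synced = backward_steps([0, 0, 1, 1, 0, 1, 0], [0.0, 0.0])
print(skewed, synced)
```

In this toy model, every migration from the fast CPU back to the slow one makes time appear to jump backwards by almost the full offset. That is the kind of event that could confuse timestamp-sensitive tools such as "make", and because it depends on when migrations happen, it would look non-reproducible.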