Xen project Mailing List

RE: [Xen-devel] [RFC] [PATCH] use "reliable" tsc properly when available, but verify

To: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>

From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>

Date: Mon, 28 Sep 2009 15:05:02 -0700 (PDT)

Cc:

Delivery-date: Mon, 28 Sep 2009 15:05:46 -0700

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
>
> Surely it should be sufficient to check TSCs for consistency across
> all CPUs periodically, and against the chosen platform timer, and
> ensure none are drifting? An operation which would not require us to
> loop for 2ms and would provide rather more useful information than an
> ad-hoc multi-CPU
> race-to-update-a-shared-variable-an-arbitrary-and-large-number-of-times.
>
> I wouldn't take anything like this algorithm.

The algorithm ensures that the skew between any two processors is
sufficiently small so that it is unobservable by any app (e.g. smaller
than "a cache bounce"). I'm not sure it is possible to "check for
consistency across all CPUs" and get that guarantee any other way...
unless there is some easy way to measure the minimum cost of a cache
bounce.

I'm not sure why Linux chooses to run the test for 20ms but I think it
is because it is only running it once at boottime so it has to eat up
some time to give the tsc's a chance to skew sufficiently. If we are
running it more than once (and Xen hasn't written to the tsc's
recently), it's probably sufficient to run it for far fewer iterations,
but given all the possible CPU race conditions due to caches and
pipelining and such, I'm not sure how many iterations is enough.

Note that upstream Linux NEVER writes to TSC anymore. If the
check_tsc_warp test fails, tsc is simply marked as an unreliable
mechanism other than for interpolating within a jiffie.

If OS's had some intrinsic to describe this "reliable vs unreliable
TSC" to apps, lots of troubles could have been avoided. But that's
roughly what I am trying to do with pvrdtscp so I'm trying to be very
sure that when Xen says it is, TSC is both reliable and continues to be
reliable. (Though maybe once at boottime is sufficient.)
Which points out another alternative: check_tsc_warp need only be run
if one or more domains have tsc_native enabled AND have some mechanism
(such as pvrdtscp or a userland hypercall) to ask Xen if the TSC is
reliable or not. But since this might be minutes/hours/days after Xen
boots, I'd still like to avoid Xen mucking around using write_tsc in
the meantime as it may be "fixing" something that ain't broke.

> I should add, not only is the algorithm stupid and slow, but it
> doesn't even check for exactly what RELIABLE_TSC guarantees --
> constant-rate TSCs. This would be useless on a single-CPU system, for
> example, or perhaps more practically a single-socket system where all
> TSCs skewed together due to package-wide power management. In the
> latter case TSCs would not skew relative to each other, even though
> they could 'skew' relative to wallclock (represented in Xen by the
> platform timer).

It's only checking for TSC skew relative to other processors in an SMP
system. What's important to an app is that time (as measured by
sampling the TSC on random processors) never goes backwards. That IS
what RELIABLE_TSC is supposed to guarantee. I agree that check_tsc_warp
doesn't test for skew relative to a platform timer (though I suspect
they are driven from the same crystal) and need not be run on a
single-CPU system.

Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
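
For readers unfamiliar with the warp test debated in this message, the
core idea (every sample taken on any CPU is compared against the last
value stored by any other CPU, and time must never go backwards) can be
sketched roughly as below. This is a simplified, single-threaded
simulation written for illustration only. It is not the Linux
check_tsc_warp code and not the patch under review. The names
warp_state, warp_observe and simulate_warp are hypothetical, and the
per-CPU counters are modelled as a shared "true time" plus a fixed
offset instead of real rdtsc reads taken under a spinlock. The `step`
parameter stands in, as an assumption, for the cost of handing the
shared variable from one CPU to the next (the "cache bounce").

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shared state for the warp check (hypothetical names). In a real
 * kernel or hypervisor this would be protected by a spinlock. */
struct warp_state {
    uint64_t last;          /* most recent timestamp stored by any CPU */
    uint64_t max_warp;      /* largest backwards step observed so far  */
    unsigned long nr_warps; /* number of backwards steps observed      */
};

/* One step of the check: some CPU has just sampled `now` from its own
 * counter. If it is behind the previously stored sample, time appeared
 * to go backwards across CPUs, so record the size of the warp. */
static void warp_observe(struct warp_state *s, uint64_t now)
{
    if (now < s->last) {
        uint64_t warp = s->last - now;
        if (warp > s->max_warp)
            s->max_warp = warp;
        s->nr_warps++;
    }
    s->last = now;
}

/* Simulate `ncpus` counters ticking at the same rate with fixed
 * offsets (in ticks). CPUs take turns sampling; true time advances by
 * `step` ticks between consecutive samples. Returns the largest warp
 * seen. */
static uint64_t simulate_warp(const int64_t *offset, size_t ncpus,
                              unsigned iterations, uint64_t step)
{
    struct warp_state s = { 0, 0, 0 };
    /* Start high enough that small negative offsets cannot underflow. */
    uint64_t true_time = 1000000;

    assert(ncpus > 0);
    for (unsigned i = 0; i < iterations; i++) {
        size_t cpu = i % ncpus;
        true_time += step;
        warp_observe(&s, (uint64_t)((int64_t)true_time + offset[cpu]));
    }
    return s.max_warp;
}
```

In this model a skew larger than `step` shows up as a warp, while a
skew smaller than `step` is never observed, which is the sense in which
a sufficiently small skew is "unobservable by any app". As noted in the
thread, a test of this shape says nothing about drift relative to a
platform timer.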
