Xen project Mailing List

Re: [Xen-devel] rdtscP and xen (and maybe the app-tsc answer I've been looking for)

To: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>

From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>

Date: Mon, 21 Sep 2009 16:55:53 -0700

Cc: kurt.hackel@xxxxxxxxxx, "Xen-Devel \(E-mail\)" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxxxx>

Delivery-date: Mon, 21 Sep 2009 16:56:16 -0700

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On 09/21/09 16:29, Dan Magenheimer wrote:
>>> I think a race occurs if the vcpu switches pcpu TWICE
>>> from pcpu-A to pcpu-B and back to pcpu-A and does rdtscp
>>> each time on pcpu-A but reads one or more pvclock parameters
>>> (that are too big to be encoded in TSC_AUX) on pcpu-B.
>>>
>> That shouldn't matter. Once the process has (tsc,cpu,version) it can
>> use its own local copy of cpu's pvclock parameters to compute the
>> tsc->ns conversion. Once it has that triple, it doesn't matter if it
>> gets context-switched; the time computation doesn't depend on what CPU
>> is currently running.
>>
>> It only needs to iterate if it gets a version mismatch. You can
>> potentially get a livelock if the version is constantly changing
>> between the rdtscp and the get-pvclock-params, and exacerbated if the
>> process keeps bouncing between cpus between the two. But given that
>> the rdtsc+get-params should take no more than a couple of
>> microseconds, it seems very unlikely the process is sustaining a
>> megahertz CPU migration rate.
>>
> Yes, I neglected an important pre-condition. ASSUME the first
> rdtscp on pcpu-A gets a version mismatch so that it must fetch
> the parameters again. Then: the vcpu switches pcpu TWICE
> from pcpu-A to pcpu-B and back to pcpu-A and does rdtscp
> each time on pcpu-A but reads one or more pvclock parameters
> (that are too big to be encoded in TSC_AUX) on pcpu-B.
>
> I agree that this is vanishingly low probability but on
> a pcpu-oversubscribed machine I think it only takes one
> vcpu-to-pcpu reschedule and then a poorly timed interrupt that
> causes the vcpu to be unscheduled, and then later rescheduled
> on the original processor.
>

Sure. It just has to keep iterating until it gets consistency. If it
iterates too long (10 times? 100? 1000?) it should give up and assume
something is inherently broken.

>> And even if it fails, the process always has to be prepared to go to
>> some other time source.
>>
> And the issue is that there's no way to recognize
> failure.

Yeah, that's a basic problem with using naked tsc as a timebase. Any
app using it needs to be prepared to test the tsc sanity against some
other time reference regularly.

On the other hand, using the tsc as part of a larger ABI works
reliably. This rdtscp proposal is basically the latter, as a variant
of the pvclock algorithm. I'm mostly interested in it as an
implementation for vsyscall etc, rather than something that apps would
use directly.

> Unless... wait... are you assuming that
> every unscheduled period results in an adjustment
> of the pvclock offset parameter? That results in
> "nanoseconds since guest boot during which any
> vcpu is running" rather than "nanoseconds since
> guest boot even when all vcpus are idle", right?
> That's different than what I had in mind, but I
> suppose it works.
>

Not following you here.

    J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
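The retry scheme discussed in this message can be sketched roughly as follows. This is only an illustrative simulation in Python, not code from Xen or Linux. The names `PvclockParams`, `tsc_to_ns`, `read_time_ns` and `now_ns` are hypothetical, and so are the `rdtscp` and `get_params` callables, which stand in for the real instruction and for whatever mechanism fetches the per-CPU parameters. The scaling step follows the usual pvclock shift-and-multiply form as I understand it, and `get_params` is assumed to report which CPU it happened to run on, so a fetch that lands on the "wrong" pcpu simply causes another iteration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class PvclockParams:
    """Per-CPU parameters assumed too large to encode in TSC_AUX."""
    version: int
    tsc_timestamp: int    # TSC value at which system_time_ns was sampled
    system_time_ns: int   # nanoseconds corresponding to tsc_timestamp
    tsc_to_ns_mul: int    # 32.32 fixed-point multiplier
    tsc_shift: int        # shift applied to the TSC delta before scaling


def tsc_to_ns(tsc: int, p: PvclockParams) -> int:
    """Convert a TSC reading to nanoseconds using one CPU's parameters."""
    delta = tsc - p.tsc_timestamp
    delta = delta << p.tsc_shift if p.tsc_shift >= 0 else delta >> -p.tsc_shift
    return p.system_time_ns + ((delta * p.tsc_to_ns_mul) >> 32)


def read_time_ns(
    rdtscp: Callable[[], Tuple[int, int, int]],           # -> (tsc, cpu, version)
    get_params: Callable[[], Tuple[int, PvclockParams]],  # -> (cpu, params)
    cache: Dict[int, PvclockParams],
    max_tries: int = 100,
) -> Optional[int]:
    """Iterate until the (tsc, cpu, version) triple matches a cached entry.

    Returns None after max_tries so the caller can use another time source.
    """
    for _ in range(max_tries):
        tsc, cpu, version = rdtscp()
        params = cache.get(cpu)
        if params is not None and params.version == version:
            # Consistent triple: later migration no longer matters.
            return tsc_to_ns(tsc, params)
        # Mismatch: refresh the local copy for whichever CPU the fetch
        # ran on, then take a fresh rdtscp reading on the next pass.
        fetched_cpu, fetched = get_params()
        cache[fetched_cpu] = fetched
    return None


def now_ns(rdtscp, get_params, cache, max_tries: int = 100) -> int:
    """Fall back to an ordinary clock when no consistent reading is found."""
    ns = read_time_ns(rdtscp, get_params, cache, max_tries)
    return ns if ns is not None else time.monotonic_ns()
```

The iteration cap mirrors the "10 times? 100? 1000?" question above; the value 100 is an arbitrary choice. Note that the fallback clock in `now_ns` has a different epoch from the simulated pvclock values, so a real implementation would need a fallback that shares the same timebase.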
