[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] rdtscP and xen (and maybe the app-tsc answer I've been looking for)
On 09/21/09 17:11, Dan Magenheimer wrote: >>> Yes, I neglected an important pre-condition. ASSUME the first >>> rdtscp on pcpu-A gets a version mismatch so that it must fetch >>> the parameters again. Then: the vcpu switches pcpu TWICE >>> from pcpu-A to pcpu-B and back to pcpu-A and does rdtscp >>> each time on pcpu-A but reads one or more pvclock parameters >>> (that are too big to be encoded in TSC_AUX) on pcpu-B. >>> >>> I agree that this is vanishingly low probability but on >>> a pcpu-oversubscribed machine I think it only takes one >>> vcpu-to-pcpu reschedule and then a poorly timed interrupt that >>> causes the vcpu to be unscheduled, and then later rescheduled >>> on the original processor. >>> >>> >> Sure. It just has to keep iterating until it gets consistency. If it >> iterates too long (10 times? 100? 1000?) it should give up and assume >> something is inherently broken. >> > No, I'm not talking about iteration. In the scenario I'm > trying to describe, the version number hasn't changed on > pcpu-A so the algorithm doesn't iterate. > Well, not "change" so much as "not updated". If the program keeps doing a rdtsc which shows that its local copy of the parameters is out of date, but its attempts to get up-to-date parameters keeps failing (because it keeps migrating cpus), then it will keep iterating without converging. Specifically, the algorithm would be: u64 tsc, time_ns; u32 aux; unsigned int version, cpu; again: rdtscp(&tsc, &aux); cpu = aux >> 24; /* physical cpu */ version = aux & ((1 << 24) - 1); /* At this point tsc and cpu+version are all fetched atomically and consistent, so context switch doesn't matter here; apply_fixup is not dependent on currently executing cpu. */ /* note that this prob. needs some local synchronization if the usermode program is multithreaded... */ if (unlikely(version != pvclockinfo[cpu].version)) { struct pvclock info; int curcpu; /* again, physical cpu */ /* Always fetches current cpu parameters, and tells us which cpu it is for. If we switched cpus since the rdtscp we won't end up updating the out-of-date info we detected but that doesn't matter because... */ curcpu = get_new_pvclock_info(&info); pvclockinfo[curcpu] = info; /* ...we repeat assuming that we're almost certainly still on the same cpu when we do rdtscp again */ goto again; } time_ns = apply_fixup(tsc, &pvclockinfo[cpu]); get_new_pvclock_info() can either be a syscall, hypercall or some other mechanism which can get a good atomic snapshot of the params along with cpu number from a shared memory region. > I realized after I sent this that I'm not really sure > I understand the pvclock implementation, particularly > under what circumstances the version number changes > or doesn't. And if this is different in any way > than the versions you are proposing that the app > would see. So I'm not positive we are considering > the same cases. > The pvclock algorithm only changes the version if the either the tsc offset or scale have changed. In the standard pvclock algorithm, a vcpu sees its own pvclock version change if either the pcpu undergoes some change which affects the tsc, *or* if the vcpu gets scheduled on a new pcpu (which could have different offset/scale). In the case we're talking about above, the code isn't pinned to a particular pcpu or vcpu (as it is usermode code with no real control over the kernel or xen schedulers), so it has to cope with preempt at any point. That's simplified by having the tsc and metadata fetch atomic, so it can revalidate its parameters every time it fetches the tsc. In that case, Xen need only update its internal version numbers when there's an actual change to the tsc's offset/scale without regard to vcpu scheduling. (And of course if the offset/scale end up being constant, then it will never need to update the offset, and usermode will only ever end up fetching it once per cpu.) J _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |