
Re: [Xen-devel] xen PIT timer




On 26 Sep 2005, at 16:22, Ryan Harper wrote:

Keir,

Thanks for the explanation.  I'm still trying to reason out why we are
seeing the 'Time went backwards' every now and then, as well as being able
to forcibly create the issue via serial interrupt floods (holding 'r'
with serial input sent to Xen).

Is either assumption, PIT_CH0 ~= 10ms (Linux) or PIT_CH2 = 1.193180MHz
source (Xen) more valid?  It sounds like either is valid.  Though Linux
can be adjusted via ntpd, is there any correcting factor for Xen?  I
know we can run ntpd in dom0 and it can update the wall clock timer, but
AFAICT, wall clock doesn't affect system_timestamp (which is where we
detect the Time went backwards in
linux-2.6-xen-sparse/arch/x86/xen/i386/kernel/time.c).

NTP isn't plumbed through into Xen yet, so for now ntpd only affects the domain it is run in.

Via some trace observation, I've noticed that the per-cpu time in Xen
(specifically stime_local_stamp) can vary widely between cpus.  Is this
the best source to be using for updating Linux system_timestamp since it
can vary significantly ( >1000000 ) between processors?

It is supposed to be a trustworthy source. Given we resync every CPU every second, being out by 1ms would indicate either a bug in the resync code or local oscillators jittering by 1000ppm, which is hard to believe!
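The arithmetic behind that figure can be checked with a short sketch. This is not Xen code: the function name is hypothetical, and it assumes the ">1000000" quoted above is in nanoseconds and that the resync interval is exactly one second.

```python
NS_PER_SEC = 1_000_000_000

def drift_ppm(offset_ns: int, resync_interval_ns: int) -> float:
    """Frequency error, in parts per million, that an oscillator would
    need in order to accumulate offset_ns of skew over one resync interval."""
    return offset_ns / resync_interval_ns * 1_000_000

# 1 ms of skew built up over a 1 s resync interval (assumed values from the thread).
print(f"{drift_ppm(1_000_000, NS_PER_SEC):.0f} ppm")
```

For comparison, crystal oscillators are normally specified at tens of ppm, so a skew of this size points more plausibly at the resync code than at the hardware.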

I haven't gotten around to doing it yet, but I was going to instrument
irq disable/enable to see how long we run with irqs disabled, with the
thought that we might be missing some events from which Xen derives time
calculations.  Is this a worthwhile investigation?

It would be interesting. Unless you are sync'ed to the PIT you should be able to go reasonably long periods with no timer interrupts with no ill effects (except the CPU time may get to wander off track a little more than it would otherwise have done). If you are sync'ed to the PIT (you have no cyclone, hpet or other chipset timer) then CPU0 needs to take a timer interrupt at least every 50ms or it will miss the 16-bit PIT counter wrapping.
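To make the wrap constraint concrete, here is a minimal sketch. It assumes the nominal PIT input clock of 1,193,182 Hz and models the channel as a free-running 16-bit down-counter; the class and helper names are hypothetical, and how Xen actually programs and reads the PIT is not shown here.

```python
PIT_HZ = 1_193_182        # nominal PIT input clock in Hz (assumed)
COUNTER_MASK = 0xFFFF     # 16-bit counter

def pit_wrap_period_ms(hz: int = PIT_HZ) -> float:
    """Time for the 16-bit counter to run through all 65536 values."""
    return (COUNTER_MASK + 1) / hz * 1000.0

class PitExtender:
    """Accumulates elapsed PIT ticks from periodic reads of a 16-bit
    down-counter. Correct only if reads are less than one wrap apart."""

    def __init__(self, first_read: int) -> None:
        self.last = first_read & COUNTER_MASK
        self.total = 0

    def update(self, raw: int) -> int:
        # Down-counter: elapsed ticks = previous value - current value, mod 2^16.
        delta = (self.last - raw) & COUNTER_MASK
        self.last = raw & COUNTER_MASK
        self.total += delta
        return self.total

def raw_at(true_ticks: int) -> int:
    """Simulated hardware value after true_ticks ticks, starting from 0."""
    return (-true_ticks) & COUNTER_MASK

print(f"wrap period: {pit_wrap_period_ms():.1f} ms")

on_time = PitExtender(raw_at(0))
on_time.update(raw_at(40_000))
print(on_time.update(raw_at(80_000)))   # reads 40000 ticks apart: nothing lost

late = PitExtender(raw_at(0))
print(late.update(raw_at(70_000)))      # one read after a full wrap: 65536 ticks lost
```

The wrap period comes out just under 55 ms, so the 50ms figure leaves a small margin. A single late read silently drops a whole wrap, which would make the time derived from the PIT fall behind rather than fail loudly.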

Do you have any other suggestions on where I should investigate? I know
this isn't a problem for most folks, but we are still concerned that it
shows up every now and then on our platform under Xen, though we don't
see any of the Linux lost_tick stuff when running Linux.

Getting this stuff right seems a lot harder than it ought to be. There are clearly problems in the existing code -- if you've been able to ascertain that there are real sync issues between CPUs (out by >1ms relative to each other) then that is a start. I would investigate how they manage to get so out of sync, when all oscillators in the system ought to be driven by crystals with stability much better than that.
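One way to confirm the skew from trace output is sketched below. It assumes each CPU's timestamp has already been extrapolated to the same instant, treats CPU0 as the reference, and uses made-up sample values purely for illustration; the function names are hypothetical and none of this is Xen code.

```python
from typing import Dict, List

SKEW_THRESHOLD_NS = 1_000_000  # 1 ms, the figure discussed in this thread

def max_skew_ns(stamps_by_cpu: Dict[int, int]) -> int:
    """Largest difference between any two per-CPU timestamps (ns)."""
    values = stamps_by_cpu.values()
    return max(values) - min(values)

def cpus_out_of_sync(stamps_by_cpu: Dict[int, int],
                     threshold_ns: int = SKEW_THRESHOLD_NS) -> List[int]:
    """CPUs whose timestamp differs from CPU0's by more than threshold_ns."""
    reference = stamps_by_cpu[0]
    return sorted(cpu for cpu, stamp in stamps_by_cpu.items()
                  if abs(stamp - reference) > threshold_ns)

# Illustrative values only, not taken from a real trace.
sample = {0: 5_000_000_000, 1: 5_000_000_400, 2: 5_001_200_000, 3: 4_999_999_900}
print(max_skew_ns(sample))
print(cpus_out_of_sync(sample))
```

Logging which CPUs exceed the threshold, and whether it is always the same ones, would help separate a resync bug from a genuinely misbehaving oscillator.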

 -- Keir




 

