
Re: [Xen-devel] Xen domU Timekeeping (a.k.a TSC/HPET issues)


  • To: "Paul Durrant" <Paul.Durrant@xxxxxxxxxx>
  • From: "Andres Lagar-Cavilla" <andres@xxxxxxxxxxxxxxxx>
  • Date: Fri, 17 Feb 2012 11:06:19 -0800
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>, Qrux <qrux.qed@xxxxxxxxx>
  • Delivery-date: Fri, 17 Feb 2012 19:06:59 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

> IIRC Windows (7) calibrates tsc against RTC at start of day and you can
> get some pretty odd results if a vcpu is descheduled during that
> calculation. I was not aware of the exact scenario you described. You
> *can* get the equivalent of the /usepmtimer boot.ini option on more recent
> OS by doing:
>
> bcdedit /set useplatformclock true
Hey Paul,
we've tried that, no joy.

>
> Enabling viridian timers in Xen may be the way to go though.
I'm looking at the Xen code (and various XenServer versions as well). I don't
see any viridian timer implementations, at least not anything from the MSDN
references (e.g. HV_X64_MSR_REFERENCE_TSC, HV_X64_MSR_STIMER0_CONFIG, etc.).

Am I looking in the wrong place?
Thanks
Andres

>
>   Paul
>
>> -----Original Message-----
>> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-
>> bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Andres Lagar-Cavilla
>> Sent: 17 February 2012 16:28
>> To: xen-devel@xxxxxxxxxxxxxxxxxxx
>> Cc: Ian Campbell; Qrux
>> Subject: Re: [Xen-devel] Xen domU Timekeeping (a.k.a TSC/HPET issues)
>>
>> > Date: Fri, 17 Feb 2012 12:06:05 +0000
>> > From: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
>> > To: Qrux <qrux.qed@xxxxxxxxx>
>> > Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
>> > Subject: Re: [Xen-devel] Xen domU Timekeeping (a.k.a TSC/HPET issues)
>> > Message-ID: <1329480365.3131.50.camel@xxxxxxxxxxxxxxxxxxxxxx>
>> > Content-Type: text/plain; charset="UTF-8"
>> >
>> > I'm afraid I don't know the answer to most of your questions (hence
>> > I'm afraid I've trimmed the quotes rather aggressively) but here's
>> > some of what I do know.
>>
>> I'm gonna add another data point.
>>
>> We're seeing the Windows 7 Query Performance Counter get mightily
>> confused. People have reported this in Amazon EC2 as well
>> https://forums.aws.amazon.com/thread.jspa?threadID=41426
>>
>> We've tracked it down to the hpet. Xen schedules an interrupt delivery
>> for an hpet tick, but the vcpu is asleep. Could be an admin pause, a
>> sleep on a wait queue, paused while qemu does its thing, paused while a
>> mem event is processed...
>>
>> When the vcpu wakes up it receives the "late" hpet tick. We believe
>> Windows 7 QPC also reads the TSC at that point. The TSC kept on ticking
>> while the vcpu was paused. Windows does not know what to do about the
>> discrepancy and reports a large time leap, usually consistent with a
>> full round trip of the 32 bit hpet counter at the Xen-emulated 1/16GHz
>> frequency.
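
For a sense of scale, here is a rough Python sketch of how long a "full round trip" of that counter takes. It takes the 1/16 GHz (62.5 MHz) frequency and the 32-bit width from the paragraph above; the function names are made up for illustration, and the "just-missed comparator" reading in the second helper is only one possible explanation, not something confirmed from the Xen or Windows code.

```python
# Assumed values, taken from the description above (not from Xen source).
HPET_FREQ_HZ = 1_000_000_000 // 16   # 1/16 GHz = 62.5 MHz
COUNTER_BITS = 32


def hpet_wrap_seconds(freq_hz=HPET_FREQ_HZ, bits=COUNTER_BITS):
    """Seconds for a free-running counter of `bits` width to wrap once."""
    return (1 << bits) / freq_hz


def ticks_until_match(counter, comparator, bits=COUNTER_BITS):
    """Ticks a wrapping counter must advance before it equals `comparator`."""
    return (comparator - counter) % (1 << bits)


wrap = hpet_wrap_seconds()
# A comparator that the counter has already passed by one tick is not
# matched again until the counter has gone almost all the way around.
late = ticks_until_match(counter=101, comparator=100)
print(f"wrap period: {wrap:.3f} s, ticks after a just-missed match: {late}")
```

So a leap "consistent with a full round trip" would be on the order of 68.7 seconds at that frequency.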
>>
>> MSDN forums blame "bad hardware" for this. Funny.
>>
>> So, we could solve our particular problem if the tsc were not to tick
>> during vcpu sleep. And I get an inkling that would help with this post
>> as well. But I don't think any of the advertised timer or tsc modes do
>> that.
>>
>> Thanks,
>> Andres
>>
>>
>>
>> I'm not sure this will help with the original post, but there's gotta be
>> somebody who
>> >
>> >> But, practically, is there a safe CPU configuration?
>> >
>> > I think that part of the problem here is that it is very hard to
>> > determine this at the hardware level. There are at least 3 (if not
>> > more) CPUID feature bits which say "no really, the TSC is good and
>> > safe to use this time, you can rely on that" because they keep
>> > inventing new ways to get it wrong.
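
As a rough illustration of checking such bits from a Linux dom0 or guest, the sketch below looks for three TSC-related flags in /proc/cpuinfo-style text. The flag names (constant_tsc, nonstop_tsc, tsc_reliable) are the ones the Linux kernel reports; they are derived by the kernel rather than raw CPUID bits, and whether they match the bits meant above is an assumption. The function name and sample text are hypothetical.

```python
# Linux /proc/cpuinfo flag names related to TSC quality (assumed relevant).
TSC_FLAGS = ("constant_tsc", "nonstop_tsc", "tsc_reliable")


def tsc_flags(cpuinfo_text):
    """Return {flag: present?} using the first 'flags' line in the text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            return {flag: flag in present for flag in TSC_FLAGS}
    # No 'flags' line found: report every flag as absent.
    return {flag: False for flag in TSC_FLAGS}


# Made-up sample input; on a real Linux system one would pass the
# contents of /proc/cpuinfo instead.
SAMPLE = "processor\t: 0\nflags\t\t: fpu tsc constant_tsc nonstop_tsc rdtscp\n"
print(tsc_flags(SAMPLE))
```

Even with all three flags set, this only says what the kernel believes about the hardware TSC; it says nothing about what a hypervisor does to the TSC a guest sees.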
>> >
>> > [...]
>> >>
>> >> Since September, I can't find any further information about this
>> >> issue. What is the state of this issue?  The inconsistency I see
>> >> right now is this: in the July 2010 TSC discussion, a "Stefano
>> >> Stabellini" posted this:
>> >>
>> >> ====
>> >> > /me wonders if timer_mode=1 is the default for xl?
>> >> > Or only for xm?
>> >>
>> >> no, it is not.
>> >> Xl defaults to 0 [zero], I am going to change it right now.
>> >> ====
>> >>
>> >> So, it seems like (at least as of July 2010), xl is defaulting to
>> >> "timer_mode=1".  That is, assuming that the then-current timer_mode
>> >> is the same as present-day tsc_mode.
>> >
>> > No, I believe they are different things.
>> >
>> > tsc_mode is to do with the TSC, emulation vs direct exposure etc. Per
>> > xen/include/asm-x86/time.h and (in recent xen-unstable) xl.cfg(5)
>> >
>> > timer_mode is to do with the way that timer interrupts are
>> > injected into the guest. This is described in
>> > xen/include/public/hvm/params.h.
>> > This isn't documented in xl.cfg(5) because I couldn't make head nor
>> > tail of the meaning of that header :-(
>> >
>> >>   In addition, I'm assuming he was changing it from 0 (zero) to 1
>> >> (one)--and not some other mode.  But,
>> >>
>> >>         xen-4.1.2/docs/misc/tscmode.txt
>> >
>> > Remember that he was referring to timer_mode not tsc_mode...
>> >
>> >> says:
>> >>
>> >>         "The default mode (tsc_mode==0) checks TSC-safeness of the
>> >>         underlying hardware on which the virtual machine is launched.
>> >>         If it is TSC-safe, rdtsc will execute at hardware speed; if it
>> >>         is not, rdtsc will be emulated."
>> >>
>> >> Which implies the default is always 0 (zero).  Which is it?
>> >
>> > It seems that xl, in xen-unstable, defaults to:
>> >    timer_mode = 1
>> >    tsc_mode = 0
>> > as does 4.1 as far as I can tell via code inspection.
>> >
>> >> More importantly, is the solution to force tsc_mode=2?
>> >
>> > IMHO this is safe in most situations unless you are running some sort
>> > of workload (e.g. a well known database) which has stringent
>> > requirements regarding the TSC for transactional consistency (hence
>> > the conservative default).
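
To make the tsc_mode discussion above concrete, here is a simplified Python model of the decision. Mode 0 follows the tscmode.txt text quoted above; the meanings of modes 1 and 2 (always emulate, never emulate) are my recollection of the same document and should be treated as assumptions. The names TscMode and rdtsc_is_emulated are hypothetical, and real Xen weighs more inputs than a single "TSC-safe" boolean.

```python
from enum import IntEnum


class TscMode(IntEnum):
    DEFAULT = 0         # emulate only if the host is not TSC-safe (quoted doc)
    ALWAYS_EMULATE = 1  # assumption: rdtsc is always trapped and emulated
    NEVER_EMULATE = 2   # assumption: rdtsc always runs at hardware speed


def rdtsc_is_emulated(mode, host_tsc_safe):
    """Simplified model: does the guest's rdtsc get emulated?"""
    mode = TscMode(mode)  # raises ValueError for modes not modelled here
    if mode is TscMode.DEFAULT:
        return not host_tsc_safe
    return mode is TscMode.ALWAYS_EMULATE


for mode in TscMode:
    for safe in (True, False):
        print(mode.name, "host_tsc_safe =", safe, "->",
              "emulated" if rdtsc_is_emulated(mode, safe) else "native")
```

In this model, forcing tsc_mode=2 only changes behaviour on hosts that mode 0 would judge not TSC-safe, which is where the caveat about TSC-sensitive workloads applies.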
>> >
>> >>   If so, under what BIOS/xen-boot-params/dom0-boot-params
>> >> conditions?
>> >> And--please excuse my exasperation--but WTH does this have to do with
>> >> ext3 versus ext4?  Is ext4 exquisitely sensitive to TSC/HPET
>> >> "jumpiness" (if that's even what's happening)?
>> >
>> > Sorry, I have no idea how/why the filesystem would be related to the
>> > TSC.
>> >
>> > It is possible you are actually seeing two bugs I suppose -- there
>> > have been issues relating to ext4 and barriers in some kernel versions
>> > (I'm afraid I don't recall the details, the list archives ought to
>> > contain something).
>> >
>> > Ian.
>> >
>> >
>>
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>> http://lists.xensource.com/xen-devel
>


