Re: [Xen-devel] [PATCH v11] tolerate jitter in cpu_khz calculation to avoid TSC emulation
>>> On 13.12.18 at 09:18, <olaf@xxxxxxxxx> wrote:
> Am Wed, 12 Dec 2018 09:39:25 -0700
> schrieb "Jan Beulich" <JBeulich@xxxxxxxx>:
>
>> >>> On 12.12.18 at 16:20, <olaf@xxxxxxxxx> wrote:
>> > If a domU uses TSC as clocksource it also must run NTP in some way to
>> > avoid the potential drift that will most likely happen, independent of
>> > any migration.
>> Which drift? While anyone's well advised to run NTP, a completely
>> isolated set of systems may have no need to, if their interactions don't
>> depend on exactly matching time.
>
> If these hosts do not sync time to some reference host, the advancing of time
> is undefined before and after my change.
I'm lost. I simply don't understand what you're trying to tell me,
or how your answer relates to my question.
>> > The calculation of the drift is based on the time
>> > returned by remote servers versus how fast the local clock advances. NTP
>> > can handle a drift of up to 500 PPM. This means the local clocksource can
>> > run up to 500 µs per second slower or faster. This calculation is based
>> > on the TSC frequency of the host where the domU was started. Once a domU
>> > is migrated to a host of a different class, like from "2.3GHz" to
>> > "2.4GHz", the TSC frequency changes, but the domU kernel may not
>> > recalibrate itself.
>>
>> That's why we switch to emulated (or hardware scaling) mode in that
>> case. It's anyway not really clear to me what this entire ...
>>
>> > As a result, the drift will be larger and might be outside of
>> > the 500 PPM range. In addition, the kernel may notice the change in the
>> > speed at which the TSC advances and could change the clocksource. All
>> > this depends of course on the type of OS that is running in the domU.
>>
>> ... (up to here) paragraph is supposed to tell the reader.
>>
>> > @@ -1885,6 +1888,16 @@ void __init early_time_init(void)
>> > printk("Detected %lu.%03lu MHz processor.\n",
>> > cpu_khz / 1000, cpu_khz % 1000);
>> >
>> > + tmp = 1000 * 1000;
>> > + tmp += VTSC_NTP_PPM_TOLERANCE;
>> > + tmp *= cpu_khz;
>> > + tmp /= 1000 * 1000;
>> > + tmp -= cpu_khz;
>> > + if (tmp >= VTSC_JITTER_RANGE_KHZ)
>> > + tmp -= VTSC_JITTER_RANGE_KHZ;
>>
>> Besides the style issue in the if() - how can this be correct? This
>> clearly introduces a discontinuity (just consider the case where
>> tmp is exactly VTSC_JITTER_RANGE_KHZ before the if()). And
>> I also can't see how it guarantees the resulting value to be
>> below VTSC_JITTER_RANGE_KHZ. Did you perhaps mean to
>> cap the value (i.e. = instead of -= )?
>
> This is supposed to make sure the frequency delta (in kHz) corresponding
> to 500 PPM is always larger than the assumed jitter. So for a 2 GHz host,
> with a theoretical tolerance of 1000 kHz, the actual tolerance is set to
> 800 kHz. tmp will be larger than the assumed jitter for cpu_khz > 400000
> (i.e. hosts faster than 400 MHz).
Again you don't appear to answer my question regarding the
discontinuity you introduce.
>> > + vtsc_tolerance_khz = (unsigned int)tmp;
>> Stray cast.
>
> Copy&paste from the cpu_khz assignment, perhaps a leftover from i386
> support?
Copy-and-paste is an explanation but not an excuse. Style
violations should not be cloned and thus furthered.
>> > + printk("Tolerating vtsc jitter for domUs: %u kHz.\n",
>> > +        vtsc_tolerance_khz);
>> Please omit the full stop; the printk() in context above shouldn't
>> have one either.
>
> You mean the trailing dot, or what does "full stop" mean in this context?
Yes, "full stop" means the final period in a sentence.
>> > + disable_vtsc = khz_diff <= vtsc_tolerance_khz;
>> > +
>> > + printk(XENLOG_G_INFO "d%d: host has %lu kHz,"
>> > + " domU expects %u kHz,"
>> > + " difference of %ld is %s tolerance of %u\n",
>> > + d->domain_id, cpu_khz, gtsc_khz, khz_diff,
>> > + disable_vtsc ? "within" : "outside",
>> > + vtsc_tolerance_khz);
>> > + }
>> > +
>> > if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() &&
>> > - (d->arch.tsc_khz == cpu_khz ||
>> > + (disable_vtsc ||
>> > (is_hvm_domain(d) &&
>> > hvm_get_tsc_scaling_ratio(d->arch.tsc_khz))) )
>> > {
>>
>> In any event I don't follow why all of a sudden this becomes
>> an always-on mode, with not even a boot command line override.
>
> Perhaps I failed to explain why there is no need to make this a knob.
>
> If a domU uses TSC as its timesource, and if it also uses NTP to make
> sure the time advances correctly, then this change will make sure the
> advancing of time stays within the bounds that NTP can correct.
> If it does use TSC but does not use NTP, then the advancing will be
> undefined before and after my change.
As per above - I'm lost. I simply don't understand. All I notice is that
you talk about one specific use case of the TSC in a guest, without
(apparently) considering uses for other than what Linux calls its
clocksource. I in particular don't understand how anything can be
"undefined" here.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel