[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 1/1] x86/arch: VM resume: avoid RDTSC emulation due to host clock drift
On 02.09.2019 20:27, Edwin Török wrote: > On a Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz the host frequency drifts: > ``` > (XEN) [ 6.607693] Detected 2600.004 MHz processor. > (XEN) [ 2674.213081] dom1(hvm): > mode=0,ofs=0xfffee6f70b7faa48,khz=2600018,inc=3 > (XEN) [ 2674.213087] dom2(hvm): > mode=0,ofs=0xfffee6fd499835c0,khz=2600018,inc=2 > ``` > > The 2 domains were suspended prior to rebooting the host and applying a > xen/microcode patch. After the reboot the frequency of the host was deemed to > be slightly different, and therefore switching on RDTSC emulation for the > Linux > HVM guest, even though the difference was only 5 ppm. This CPU doesn't support > TSC scaling. > > Therefore we should either measure the standard deviation of our calibration > and have a range of acceptable frequencies as "same", or have a static > tolerance value. > > The platform timer's clock frequency accuracy is: > * IA-PC HPET Specification 1.0a sections 2.2 and 2.4.1: 500 ppm or better > * ACPI PM timer, and PIT timer do not have defined accuracies > * Intel 300 Series datasheet section 25.6: 24 MHz crystal 100 ppm or better > * NTP FAQ section 3.3 Clock Quality: 11 ppm drift due to temperature > * section 2.2.2 claims that PIT/ACPI PM timer share the same crystal as HPET > and > thus 500 ppm as an upper bound, "the real drift is usually smaller than 30ppm" > > For simplicity and determinism opted for a static tolerance value of 100 ppm > here, such that the any error would be well within the error you would get > with > HPET/Linux's calibration. NTP can cope with a drift < 500 ppm. > Most importantly this should stop Xen from claiming that the clock frequency > on > the same host is different across reboots. Specifications do not currently > mandate an accuracy higher than 100 ppm, therefore OSes should already be able > to cope with such drift on real hardware. Any improvements in accuracy from > future specifications/motherboards wouldn't be applicable, because they would > also come with newer CPUs that support TSC scaling. > > If the CPU does support TSC scaling Xen will of course still attempt to match > the exact frequency value it thinks the guest had when it was suspended. > See below for `if ( hvm_tsc_scaling_supported && !d->arch.vtsc )` (not visible > in patch context). > > llabs() doesn't appear to be available when building xen, hence the 2 > comparisons. > > After this patch when suspending a VM, and rebooting the host I get this > output: > ``` > (XEN) [ 6.614703] Detected 2600.010 MHz processor. > (XEN) [ 138.924342] TSC marked as reliable, warp = 0 (count=2) > (XEN) [ 138.924346] dom1(hvm): > mode=0,ofs=0xfffed01901016d18,khz=2600012,inc=2 > ``` > > Signed-off-by: Edwin Török <edvin.torok@xxxxxxxxxx> First of all - are you aware that there had been multiple iterations of a patch (by Olaf Hering) making this a command line and/or guest config controllable setting? If so, it would have been nice if at least the cover letter identified the correlation. If not, please take a look. After all it hasn't gone in so far because of objections by Andrew. Using a hardcoded tolerance value in any event raises the question of how you know whether a particular guest OS can actually cope. > --- a/xen/arch/x86/time.c > +++ b/xen/arch/x86/time.c > @@ -2171,6 +2171,12 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode, > *elapsed_nsec = 0; > } > > +static inline int frequency_same_with_tolerance(int64_t khz1, int64_t khz2) The return type wants to be bool. And there wants to be an explaining comment ahead of the function, (re-)stating some of what you have in the description. > +{ > + int64_t ppm = (khz2 - khz1) * 1000000 / khz2; > + return -100 < ppm && ppm < 100; While we have no llabs(), you should imo use either ABS() or __builtin_labs() / __builtin_llabs(). Furthermore, could we limit this behavior to the case of there not being TSC scaling available (due to running on old hardware, or due to it being a PV guest)? Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |