Xen project Mailing List

Re: [PATCH] x86/time: further improve TSC / CPU freq calibration accuracy

From: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Wed, 19 Jan 2022 09:46:26 +0100

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=JeLSh4Yg6b3fmZr3jFmOSIw8LngkfrgTdBlog1LrqeM=; b=JrbkuqnmMX/BtHBdHF0T5QMmVCdkuAdEB585WeOCK97wfzojo50qEBLi2hj8u/xUfv7k3I7svYrzpyYbrK9JVB3Cc34CE+SfnYEMdXwneuWdAJb13gHBqg86F4ViUy+UIHwUPrdqWLoWElescXB2yxsxfFhRqoZXZaxy5/i0qnBPCW9F9/IszWIqBQ/RNs6Muf+XOg1+nbHZNxJVGOE8lv1LGhoebhGnNi6XbM1E20NPj9pbJ/nrY8E/h+owHnsdHxZZbvO5iolH/owkLLsUpn493/Bam8w5EtQRK2LdJ4sXc0ANBcubvhKpea4xnxeP7UWrdfbdz7Qx6BzCrYa8kA==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=IWdBsWd8fmZov+220hplsvyJSLqQrkvQVuYj5zWnCcQsvOHgiJUou4Rv2/WE6bQvJzvcOQh6FGVbg33D9oWrp0drnDoqH14gB1ySwSsExjU/Ce/thfHfkRnJ6jy/ZLZhA/mQelA9vzFC/rrEVqhnQbKjZQvTmIJLPJkSUR/wVwt90tJVEFQBTRp4CAYyE8WLPIIauFg/b0Ja1tdqqMjIEDCd67lNPVv/UTgUBBaJUZLBH7X+bBHZL6weRsS9Z3ltv04k8Yne8RYhAPNGFWMQa9zMkKnZl0jpqw4qtU4bTdtDMivVCkUZ4bOqrWhL9wvNPdtCgGKDljSf1qirbfp+nw==

Authentication-results: esa2.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>

Delivery-date: Wed, 19 Jan 2022 08:46:42 +0000

Ironport-data: A9a23:1BF9zKrXnqQrtQDKUwBK+tO4AXBeBmL5YhIvgKrLsJaIsI4StFCzt garIBmFM/iOMDDyedAiPIq0oUICsMCAx4M1TlA/qy83EX8S8JuZCYyVIHmrMnLJJKUvbq7GA +byyDXkBJppJpMJjk71atANlZT4vE2xbuKU5NTsY0idfic5Dndx4f5fs7Rh2NQw2IHoW1rlV e7a+KUzBnf0g1aYDUpMg06zgEsHUCPa4W5wUvQWPJinjXeG/5UnJMt3yZKZdhMUdrJ8DO+iL 9sv+Znilo/vE7XBPfv++lrzWhVirrc/pmFigFIOM0SpqkAqSiDfTs/XnRfTAKtao2zhojx/9 DlCnZWreyMRDK3eousiYT9hFQNUOIYb6rCSdBBTseTLp6HHW37lwvEoB0AqJ4wIvO1wBAmi9 9RBdmpLNErawbvrnvTrEYGAhex6RCXvFJkYtXx6iynQEN4tQIzZQrWM7thdtNs1rp4UR6eCO ZtGAdZpRBjAPidFJVtOM4AZgNWMm2e8LzAEsWvA8MLb5ECMlVcsgdABKuH9ZdiiVchT2EGCq Qru72n/Rx0XKtGb4T6E6W63wP/CmzvhX4AfH6H+8eRl6HWRzGEODBwdVXOgvOK0zEW5Xrpix 1c8o3R06/JorQryE4e7D0bQTGO4UgA0UsFiN9UI6BO3xZXmxT+fJEIBZWNTZ4lz3CMpfgAC2 liMltLvIDVgtryJVH6QnoupQSOO1Ts9djFbO3JdJecRy5y6+dxo0EqTJjp2OPft1oWdJN3m/ 9ydQMHSbZ03hNVD6ai09Euvb9mE9smQFV5dCuk6swuYAuJFiGyNO93ABbvzt68owGOlor+p5 ihsdy+2trFmMH11vHbRKNjh5Znwjxp/DBXSgER0A74q/Cm39niocOh4uW8ifhoya5peJmGzO Sc/XD+9ArcJbBNGioctMuqM5zkCl/C8RbwJqNiJBjaxXnSBXFDep3w/DaJh92vsjFItgckC1 WSzKq6R4YIhIf0/llKeHr5FuZdyn3xW7T6NGfjTkkr2uZLDNC/9YepUazOmM7FmhJ5oVS2Iq b6zwePQlUUGOAA/CwGKmbMuwacidilkVcuo+p0OJoZu4GNOQQkcNhMY+pt4E6RNlKVJjObYu Ha7X05T0l3kgnPbbw6NbxhehHnHB/6TdFo3Yn4hO0iGwX8mbdr95asTbcJvL7Im6PZi3bh/S PxcI5eMBfFGSzLm/TUBbMai8Nw+JUrz3Q/ebTC4ZDUffoJ7Q1Cb8NHTYQaypjIFCTC6tJVir uT4hB/bW5cKWy9rEN3SNKC011q0sHVEwLByUkLEL8N9YkLp9IQ2eSX9guVue5MHKAnZxyvc3 AGTWE9Kqe7Iqo4z0d/ImaHb8Nv5T7ogRhJXRjCJ46y3OC/W+nuY7bVBCOvYLyrAUG7U+bm5Y bkHxf/LL/Bazk1BtJBxEug3wPtmtcfvvbJT0i9tAG7PMwawErplL3SLgZtPu6lKyuMLsAe6Q BvSqNxTOLHPM8L5Cl8BYgEia73bh/0TnzDT69UzIVn7u3ArrObWDx0KMknekjFZIZt0LJghk LUot8Mh4gCijgYnb4SdhSdO+mXQdnENXs3LbH3B7FMHXubz9mx/XA==

Ironport-hdrordr: A9a23:4bzBvKuTQm1v2MY0/aAMq1G07skC7oMji2hC6mlwRA09TyXGra +TdaUguSMc1gx9ZJhBo7G90KnpewK6yXdQ2/hqAV7EZniahILIFvAY0WKG+VPd8kLFh4xgPM tbAs1D4ZjLfCRHZKXBkXiF+rQbsaC6GcmT7I+0pRcdLj2CKZsQlzuRYjzrbHGeLzM2Y6bReq Dsgvau8FGbCAsqh4mAdzI4dtmGg+eOuIPtYBYACRJiwA6SjQmw4Lq/NxSDxB8RXx5G3L9nqA H+4kHEz5Tml8v+5g7X1mfV4ZgTsNz9yuFbDMjJrsQOMD3jhiuheYwkcbyfuzIepv2p9T8R4Z PxiiZlG/42x2Laf2mzrxeo8w780Aw243un8lOciWuLm72OeBsKT+56wa5JeBrQ7EQt+Ptm1r hQ4m6fv51LSTvdgSXU/bHzJl9Xv3vxhUBnvf8YjnRZX4dbQqRWt5Yj8ERcF4pFND7m6bogDP JlAKjnlblrmGuhHjDkV1RUsZ+RtixZJGbFfqFCgL3Y79FupgE586NCr/Zv20vp9/oGOu55Dq r/Q+BVfYp1P7wrhJRGdZM8qPuMexzwqC33QRCvyHTcZeg60iH22tbKCItc3pDeRHVP9up0pK j8

Ironport-sdr: 1zgF++htwVn4p8NMbXFpxH/6FLFHfETlW3zL4sbV9ffkDY8WddiBR7UqYdjWR7aoRqzYX5gCV/ zEn0ZedqjE7bSFN6dxbVLFwEQ1jpj7usOAJfBhnX2pSVYpZ10juRKsA15S3IoLYZtoy+7LqKAG jvWnZgjPCeqvGfzVpDxn+rNkaLagQ4ReKzr+8i0fmUe9hjEhZcv2nVwhlF9nt3ncgWIQcTj1i+ CDMquik7meD7KaGsxpWu0ptn7jvAtGZdYOegbgpYw8i8kNejmP7YKZctK7usj0C/ST5rvbV4ow PORbKSUsIc79CiI4Z+XzCWQZ

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Jan 18, 2022 at 03:26:36PM +0100, Jan Beulich wrote: > On 18.01.2022 15:05, Roger Pau Monné wrote: > > On Tue, Jan 18, 2022 at 02:39:03PM +0100, Jan Beulich wrote: > >> On 18.01.2022 13:45, Roger Pau Monné wrote: > >>> On Thu, Jan 13, 2022 at 02:41:31PM +0100, Jan Beulich wrote: > >>>> Calibration logic assumes that the platform timer (HPET or ACPI PM > >>>> timer) and the TSC are read at about the same time. This assumption may > >>>> not hold when a long latency event (e.g. SMI or NMI) occurs between the > >>>> two reads. Reduce the risk of reading uncorrelated values by doing at > >>>> least four pairs of reads, using the tuple where the delta between the > >>>> enclosing TSC reads was smallest. From the fourth iteration onwards bail > >>>> if the new TSC delta isn't better (smaller) than the best earlier one. > >>> > >>> Do you have some measurements as to whether this makes the frequency > >>> detection any more accurate? > >> > >> In the normal case (no SMI or alike) I haven't observed any further > >> improvement on top of the earlier patch. > >> > >>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> > >>>> --- > >>>> Obviously (I think) instead of having both read_{hpet,pmtmr}_and_tsc() > >>>> the calibration logic could be folded between HPET and PMTMR, at the > >>> > >>> As an intermediate solution you could have a read_counter_and_tsc I > >>> would think? > >> > >> Not sure in which way you view this as "intermediate". > > > > As in not unifying both calibration logic into a single function, but > > rather just use a generic function to read the counter and TSC. > > > > With your original remark I assumed that you wanted to unify all the > > calibration code in init_hpet and init_pmtimer into a generic > > function, but maybe I've misunderstood. > > Oh, I see. I have to admit that I see little value (at this point) to > break out just the pair-read logic. While I did say we have similar > issues elsewhere, my initial analysis has left me with the impression > that those other cases are sufficiently different for such a helper to > be of no use there. > > >>>> expense of a couple more indirect calls (which may not be that much of a > >>>> problem as this is all boot-time only). Really such folding would have > >>>> been possible already before, just that the amount of duplicate code > >>>> hasn't been this large so far. IOW if so desired, then perhaps the > >>>> folding should be a separate prereq patch. > >>> > >>> You could even pass a platform_timesource as a parameter to that > >>> generic read counter and TSC helper. > >> > >> This was the plan, yes, in case I was asked to follow that route (or > >> if I decided myself that I'd prefer that significantly over the > >> redundancy). > > > > I think having a generic read_counter_and_tsc would be better than > > having read_{hpet,pmtmr}_and_tsc. > > > >> > >>>> --- a/xen/arch/x86/time.c > >>>> +++ b/xen/arch/x86/time.c > >>>> @@ -392,9 +392,36 @@ static u64 read_hpet_count(void) > >>>> return hpet_read32(HPET_COUNTER); > >>>> } > >>>> > >>>> +static uint32_t __init read_hpet_and_tsc(uint64_t *tsc) > >>>> +{ > >>>> + uint64_t tsc_prev = *tsc = rdtsc_ordered(), tsc_min = ~0; > >>>> + uint32_t best = best; > >>>> + unsigned int i; > >>>> + > >>>> + for ( i = 0; ; ++i ) > >>>> + { > >>>> + uint32_t hpet = hpet_read32(HPET_COUNTER); > >>>> + uint64_t tsc_cur = rdtsc_ordered(); > >>>> + uint64_t tsc_delta = tsc_cur - tsc_prev; > >>>> + > >>>> + if ( tsc_delta < tsc_min ) > >>>> + { > >>>> + tsc_min = tsc_delta; > >>>> + *tsc = tsc_cur; > >>>> + best = hpet; > >>>> + } > >>>> + else if ( i > 2 ) > >>>> + break; > >>>> + > >>>> + tsc_prev = tsc_cur; > >>> > >>> Would it be better to set tsc_prev to the current TSC value here in > >>> order to minimize the window you are measuring? Or else tsc_delta will > >>> also account for the time in the loop code, and there's more > >>> likeliness from being interrupted? > >> > >> I did consider this (or some variant thereof), but did for the moment > >> conclude that it wouldn't make enough of a difference. If you think > >> it would be meaningfully better that way, I'll switch. > >> > >> Both here and for the aspect above, please express you preference (i.e. > >> not in the form of a question). > > > > Let's leave it as-is (just one TSC read per loop iteration). > > > >>> I guess being interrupted four times so that all loops are bad is very > >>> unlikely. > >>> > >>> Since this loop could in theory run multiple times, do we need to > >>> check that the counter doesn't overflow? Or else the elapsed counter > >>> value would be miscalculated. > >> > >> I don't think I see any issue with overflow here. It's the callers > >> who need to care about that. And I don't think this function will run > >> for long enough such that a counter would not only wrap, but also > >> reach its initial count again. > > > > Right, I haven't expressed myself correctly. It's not overflowing the > > issue, but rather wrapping to the start count value. Limiting the > > iterations to some fixed value (10?) in order to remove that > > possibility completely would be OK for me. > > Hmm, I was in fact hoping (and trying) to get away without any arbitrarily > chosen numbers. The loop cannot be infinite in any event ... Please let me > know how strong you feel about putting in place such an arbitrary limit. It was mostly for safety reasons, I wouldn't expect that loop to need more than 4 iterations really (which is also an arbitrary chosen number). Let's leave it without any limit then for the time being. Roger.

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.