
Re: [PATCH RFC] x86/time: avoid early uses of NOW() to return zero


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 6 May 2026 12:38:11 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Teddy Astie <teddy.astie@xxxxxxxxxx>
  • Delivery-date: Wed, 06 May 2026 10:38:20 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 06.05.2026 12:11, Roger Pau Monné wrote:
> On Wed, May 06, 2026 at 11:37:41AM +0200, Jan Beulich wrote:
>> RFC: This breaks at least the TSM_BOOT case in printk_start_of_line(), which
>>      checks for NOW() returning 0 (falling back to TSM_RAW in this case).
>>      For now I have no idea how to avoid this, except that when CPUID leaf
>>      0x15 is available we could leverage that to put in place at least an
>>      approximate scale value. Doing so could, however, lead to a
>>      discontinuity (returned value moving backwards) once the final scale
>>      value was put in place. (Note, however, that such a discontinuity can
>>      also result from init_percpu_time() using the BSP's scale value as
>>      initial estimate for APs. Then again local_time_calibration() at
>>      least makes an attempt at avoiding such.)
> 
> For the purposes of printk_start_of_line() we could unconditionally
> use get_cycles() when system_state < SYS_STATE_smp_boot IMO.

Hmm, "raw" console timestamps are quite a bit uglier to deal with as a
human. Also, while init_xen_time() is pretty close to us setting
SYS_STATE_smp_boot, early_time_init() occurs earlier (and with
init_percpu_time() also called from there that's enough for "good"
timestamps).
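
To make the trade-off concrete, here is a rough, self-contained sketch of
the suggestion quoted above: print raw cycle counts while still early in
boot, and switch to seconds.microseconds afterwards. Everything in it
(enum boot_phase, format_stamp(), the exact prefix layout) is a stand-in
made up for illustration, not Xen's actual printk code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the boot phases; only the ordering matters here. */
enum boot_phase { PHASE_EARLY, PHASE_SMP_BOOT, PHASE_ACTIVE };

/*
 * Write a timestamp prefix into buf.  Before PHASE_SMP_BOOT the raw cycle
 * count is printed in hex; from then on seconds.microseconds derived from
 * now_ns is used.  Returns what snprintf() returns.
 */
static int format_stamp(char *buf, size_t len, enum boot_phase phase,
                        uint64_t cycles, uint64_t now_ns)
{
    assert(buf && len);

    if ( phase < PHASE_SMP_BOOT )
        return snprintf(buf, len, "[%016" PRIx64 "] ", cycles);

    return snprintf(buf, len, "[%5" PRIu64 ".%06" PRIu64 "] ",
                    now_ns / 1000000000, now_ns % 1000000000 / 1000);
}
```

The hex form is what I mean by "uglier to deal with": it cannot be read
as elapsed time without knowing the TSC frequency.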

>  Using
> the frequency value from CPUID seems like a good approach also on
> boxes that expose this information.

As per what you suggest below, we may then need to increase that value
by some margin, to have NOW() rather move a little too slow than too
fast. Plus of course it won't help for AMD at all.
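
Purely as an illustration of the margin idea, a minimal sketch follows.
The leaf 0x15 register layout (EAX = ratio denominator, EBX = ratio
numerator, ECX = crystal clock in Hz) is the architectural one; the helper
names and the parts-per-million margin are assumptions of this sketch, not
anything in the patch. The register values are passed in rather than read
via CPUID so the snippet builds anywhere.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Nominal TSC frequency in Hz from CPUID leaf 0x15 register values.
 * Returns 0 if the leaf does not enumerate a usable ratio or crystal clock.
 */
static uint64_t nominal_tsc_hz(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
    if ( !eax || !ebx || !ecx )
        return 0;

    return (uint64_t)ecx * ebx / eax;
}

/* Inflate a frequency by margin_ppm parts per million (hypothetical knob). */
static uint64_t padded_hz(uint64_t hz, uint32_t margin_ppm)
{
    return hz + hz * margin_ppm / 1000000;
}

/*
 * Convert TSC ticks to nanoseconds for a given frequency.  Split into whole
 * seconds and remainder so the multiplication stays within 64 bits as long
 * as hz is below 18 GHz.
 */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t hz)
{
    assert(hz != 0 && hz < 18000000000ULL);

    return ticks / hz * 1000000000ULL + ticks % hz * 1000000000ULL / hz;
}
```

Overstating the frequency makes each tick count for less time, so the
early estimate lags real time instead of running ahead of it.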

> I wonder, we seem to unconditionally perform the TSC calibration
> against a known frequency time source, wouldn't it be more reliable to
> use the information from leaf 0x15 when available?

Andrew has been suggesting this, but I can only keep saying that what
CPUID reports are nominal values aiui, not actual ones. From what I
know, there's always some (small) variation as to the frequency of
actual crystals. And it's unclear whether our calibration is more
precise than what CPUID tells us. (If we knew at least average errors,
we could maybe calculate the value to use from both the calculated and
the nominal value.)
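
One conceivable way to do such a calculation, if error figures were
available, would be inverse-variance weighting. The sketch below is only
that: the function name and the idea of expressing both errors in parts
per million are assumptions made here, and no such error data exists
today.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Blend a calibrated and a nominal frequency, weighting each by the inverse
 * square of its assumed error.  Errors are in parts per million and are
 * expected to be small (well below 1000), which keeps the arithmetic within
 * 64 bits.  At least one error must be non-zero.
 */
static uint64_t blend_hz(uint64_t cal_hz, uint32_t cal_err_ppm,
                         uint64_t nom_hz, uint32_t nom_err_ppm)
{
    /* Normalised weight of cal_hz is nom_err^2 / (nom_err^2 + cal_err^2). */
    int64_t w_cal = (int64_t)nom_err_ppm * nom_err_ppm;
    int64_t w_nom = (int64_t)cal_err_ppm * cal_err_ppm;
    int64_t diff = (int64_t)cal_hz - (int64_t)nom_hz;

    assert(w_cal + w_nom != 0);

    return (uint64_t)((int64_t)nom_hz + diff * w_cal / (w_cal + w_nom));
}
```

With equal errors this degenerates to the midpoint; the smaller one
value's error is, the closer the result sits to that value.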

>> RFC: While generally the mentioned waiting loops will take longer to time
>>      out, on a very fast CPU tight loops may time out too early.
> 
> I was wondering about that, increasing just a nanosecond for each
> call seems like it's going to make progress fairly slow?  Obviously
> depends on how tight the calls to NOW() are in the outside loop.
> 
> Maybe when lacking frequency information from CPUID we could assume
> something like 8GHz and scale the TSC based on that?  AFAICT it's
> advisable to use a frequency greater than any CPU, as then we don't
> risk NOW() running too fast.

Whatever value we pick, something faster may later appear. And too high
a value isn't good either.

>> RFC: In get_s_time_fixed(), should we perhaps assert that the scale was
>>      set?
> 
> Might be good, but I would like to see what explodes when doing
> that...

Of course that would need checking first. I've audited the callers, and
all looked safe to me. Will do for v2.

>> I don't think Fixes: tags should be put here. If we did, we'd have to
>> enumerate all introductions of early uses of NOW() (or get_s_time()), with
>> the exception of those dealing with getting back 0 (which I expect is only
>> printk_start_of_line()).
> 
> I'm fine with no fixes tag, but we need to remember to backport this
> one.

Definitely.

Jan



 

