|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 1/2] xen/x86: don't send IPI to sync TSC when it is reliable
On Thu, 10 Jul 2025, Jan Beulich wrote:
> On 08.07.2025 20:32, Stefano Stabellini wrote:
> > --- a/xen/arch/x86/time.c
> > +++ b/xen/arch/x86/time.c
> > @@ -2297,11 +2297,7 @@ static void cf_check time_calibration(void *unused)
> > };
> >
> > if ( clocksource_is_tsc() )
> > - {
> > - local_irq_disable();
> > - r.master_stime = read_platform_stime(&r.master_tsc_stamp);
> > - local_irq_enable();
> > - }
> > + return;
>
> Assuming the rendezvous can indeed be entirely skipped, I agree that there's
> no point calling read_platform_stime() here. Yet to yield a consistent
> result, more changes are then necessary imo:
> - as indicated before, the invocation of this function from
> verify_tsc_reliability() when plt_tsc was chosen is then entirely
> pointless,
> - time_calibration_nop_rendezvous() would then apparently want purging, not
> the least to make clear that TIME_CALIBRATE_SOFTIRQ is never raised in
> this mode (one of your goals after all, aiui),
Good suggestions.
> - the function being a timer handler, it would be preferable if the timer
> wasn't ever activated in this mode (at which point rather than returning
> early, the code above could simply be purged, maybe replaced by e.g. an
> assertion),
I see your point about the timer not being activated in the first place.
But if we want to make the code more reliable we should keep the if
(clocksource_is_tsc()) return; in time_calibration. That way, in case of
mistakes elsewhere, still the desired behavior is obtained.
I'll add the changes to cpu_frequency_change and local_time_calibration.
I'll append an incremental patch to clarify my intent.
> - the above in particular requires dealing with cpu_frequency_change() (the
> other of the two places where the timer is actually activated).
>
> Some care may be needed in all of this taking into consideration that the
> platform timer change to TSC happens late. Albeit commit f954a1bf5f74
> ("x86/time: change initiation of the calibration timer") has imo eliminated
> the main concern here.
>
> As to skipping the rendezvous: Besides invoking the calibration softirq,
> time_calibration_nop_rendezvous() also updates the per-CPU cpu_calibration
> fields. There would thus need to be a pretty formal proof that calculations
> involving ->local_stime or ->local_tsc can't possibly degrade or even
> degenerate when they remain at their boot-time values. (As to
> ->master_stime, afaict the field simply isn't used at all in that mode,
> which is a fair part of the reason why the code change above is okay _if_
> the rendezvous itself can be eliminated. The justification for that could
> also do with extending some, considering that much of the involved code is
> pretty subtle.) Alternatively, if such a proof turned out impossible,
> another way of updating the fields every once in a while would need adding.
Do you mean a formal proof that the TSC is actually stable from a
hardware perspective? The software algorithm is the same no matter the
number of updates.
> Finally, what you do here isn't entirely reliable as to your apparent end
> goal: "clocksource=tsc" is respected only when tsc_check_reliability()
> completes with an acceptable outcome. There's certainly some variability in
> this across multiple runs, i.e. if things went extremely bad, once in blue
> moon you may end up with the TSC being rejected for use as platform timer.
That is interesting! One option is to change the code so that
clocksource=tsc is always respected. I have appended the change on top
of this patch. Please let me know if you have other suggestions.
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index d72e640f72..d29266086d 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1877,7 +1877,7 @@ int cpu_frequency_change(u64 freq)
update_vcpu_system_time(current);
/* A full epoch should pass before we check for deviation. */
- if ( smp_processor_id() == 0 )
+ if ( smp_processor_id() == 0 && !clocksource_is_tsc() )
{
set_timer(&calibration_timer, NOW() + EPOCH);
platform_time_calibration();
@@ -2024,7 +2024,7 @@ static void cf_check local_time_calibration(void)
update_vcpu_system_time(current);
out:
- if ( smp_processor_id() == 0 )
+ if ( smp_processor_id() == 0 && !clocksource_is_tsc() )
{
set_timer(&calibration_timer, NOW() + EPOCH);
platform_time_calibration();
@@ -2271,22 +2271,6 @@ static void cf_check
time_calibration_std_rendezvous(void *_r)
time_calibration_rendezvous_tail(r, 0, rdtsc_ordered());
}
-/*
- * Rendezvous function used when clocksource is TSC and
- * no CPU hotplug will be performed.
- */
-static void cf_check time_calibration_nop_rendezvous(void *rv)
-{
- const struct calibration_rendezvous *r = rv;
- struct cpu_time_stamp *c = &this_cpu(cpu_calibration);
-
- c->local_tsc = r->master_tsc_stamp;
- c->local_stime = r->master_stime;
- c->master_stime = r->master_stime;
-
- raise_softirq(TIME_CALIBRATE_SOFTIRQ);
-}
-
static void (*time_calibration_rendezvous_fn)(void *) =
time_calibration_std_rendezvous;
@@ -2488,7 +2472,7 @@ static int __init cf_check verify_tsc_reliability(void)
* CPUs are booted.
*/
tsc_check_reliability();
- if ( tsc_max_warp )
+ if ( tsc_max_warp && strcmp(opt_clocksource, "tsc") )
{
printk("TSC warp detected, disabling TSC_RELIABLE\n");
setup_clear_cpu_cap(X86_FEATURE_TSC_RELIABLE);
@@ -2506,21 +2490,12 @@ static int __init cf_check verify_tsc_reliability(void)
*/
on_selected_cpus(&cpu_online_map, reset_percpu_time, NULL, 1);
- /*
- * We won't do CPU Hotplug and TSC clocksource is being used which
- * means we have a reliable TSC, plus we don't sync with any other
- * clocksource so no need for rendezvous.
- */
- time_calibration_rendezvous_fn = time_calibration_nop_rendezvous;
-
/* Finish platform timer switch. */
try_platform_timer_tail();
printk("Switched to Platform timer %s TSC\n",
freq_string(plt_src.frequency));
- time_calibration(NULL);
-
return 0;
}
}
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |