Re: [PATCH v2 1/2] xen/x86: don't send IPI to sync TSC when it is reliable
On Thu, 10 Jul 2025, Jan Beulich wrote:
> On 08.07.2025 20:32, Stefano Stabellini wrote:
> > --- a/xen/arch/x86/time.c
> > +++ b/xen/arch/x86/time.c
> > @@ -2297,11 +2297,7 @@ static void cf_check time_calibration(void *unused)
> >      };
> >  
> >      if ( clocksource_is_tsc() )
> > -    {
> > -        local_irq_disable();
> > -        r.master_stime = read_platform_stime(&r.master_tsc_stamp);
> > -        local_irq_enable();
> > -    }
> > +        return;
> 
> Assuming the rendezvous can indeed be entirely skipped, I agree that there's
> no point calling read_platform_stime() here. Yet to yield a consistent
> result, more changes are then necessary imo:
> - as indicated before, the invocation of this function from
>   verify_tsc_reliability() when plt_tsc was chosen is then entirely
>   pointless,
> - time_calibration_nop_rendezvous() would then apparently want purging, not
>   the least to make clear that TIME_CALIBRATE_SOFTIRQ is never raised in
>   this mode (one of your goals after all, aiui),

Good suggestions.

> - the function being a timer handler, it would be preferable if the timer
>   wasn't ever activated in this mode (at which point rather than returning
>   early, the code above could simply be purged, maybe replaced by e.g. an
>   assertion),

I see your point about the timer not being activated in the first place.
But if we want to make the code more robust, we should keep the
"if ( clocksource_is_tsc() ) return;" check in time_calibration(). That
way, in case of mistakes elsewhere, we still get the desired behavior.

I'll add the changes to cpu_frequency_change() and
local_time_calibration(), and append an incremental patch below to
clarify my intent.

> - the above in particular requires dealing with cpu_frequency_change() (the
>   other of the two places where the timer is actually activated).
> 
> Some care may be needed in all of this taking into consideration that the
> platform timer change to TSC happens late. Albeit commit f954a1bf5f74
> ("x86/time: change initiation of the calibration timer") has imo eliminated
> the main concern here.
> 
> As to skipping the rendezvous: Besides invoking the calibration softirq,
> time_calibration_nop_rendezvous() also updates the per-CPU cpu_calibration
> fields. There would thus need to be a pretty formal proof that calculations
> involving ->local_stime or ->local_tsc can't possibly degrade or even
> degenerate when they remain at their boot-time values. (As to
> ->master_stime, afaict the field simply isn't used at all in that mode,
> which is a fair part of the reason why the code change above is okay _if_
> the rendezvous itself can be eliminated. The justification for that could
> also do with extending some, considering that much of the involved code is
> pretty subtle.) Alternatively, if such a proof turned out impossible,
> another way of updating the fields every once in a while would need adding.

Do you mean a formal proof that the TSC is actually stable from a
hardware perspective? The software algorithm is the same no matter the
number of updates.

> Finally, what you do here isn't entirely reliable as to your apparent end
> goal: "clocksource=tsc" is respected only when tsc_check_reliability()
> completes with an acceptable outcome. There's certainly some variability in
> this across multiple runs, i.e. if things went extremely bad, once in a blue
> moon you may end up with the TSC being rejected for use as platform timer.

That is interesting! One option is to change the code so that
"clocksource=tsc" is always respected. I have appended the change on top
of this patch.
Please let me know if you have other suggestions.

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index d72e640f72..d29266086d 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1877,7 +1877,7 @@ int cpu_frequency_change(u64 freq)
     update_vcpu_system_time(current);
 
     /* A full epoch should pass before we check for deviation. */
-    if ( smp_processor_id() == 0 )
+    if ( smp_processor_id() == 0 && !clocksource_is_tsc() )
     {
         set_timer(&calibration_timer, NOW() + EPOCH);
         platform_time_calibration();
@@ -2024,7 +2024,7 @@ static void cf_check local_time_calibration(void)
     update_vcpu_system_time(current);
 
  out:
-    if ( smp_processor_id() == 0 )
+    if ( smp_processor_id() == 0 && !clocksource_is_tsc() )
     {
         set_timer(&calibration_timer, NOW() + EPOCH);
         platform_time_calibration();
@@ -2271,22 +2271,6 @@ static void cf_check time_calibration_std_rendezvous(void *_r)
     time_calibration_rendezvous_tail(r, 0, rdtsc_ordered());
 }
 
-/*
- * Rendezvous function used when clocksource is TSC and
- * no CPU hotplug will be performed.
- */
-static void cf_check time_calibration_nop_rendezvous(void *rv)
-{
-    const struct calibration_rendezvous *r = rv;
-    struct cpu_time_stamp *c = &this_cpu(cpu_calibration);
-
-    c->local_tsc = r->master_tsc_stamp;
-    c->local_stime = r->master_stime;
-    c->master_stime = r->master_stime;
-
-    raise_softirq(TIME_CALIBRATE_SOFTIRQ);
-}
-
 static void (*time_calibration_rendezvous_fn)(void *) =
     time_calibration_std_rendezvous;
 
@@ -2488,7 +2472,7 @@ static int __init cf_check verify_tsc_reliability(void)
      * CPUs are booted.
      */
     tsc_check_reliability();
-    if ( tsc_max_warp )
+    if ( tsc_max_warp && strcmp(opt_clocksource, "tsc") )
     {
         printk("TSC warp detected, disabling TSC_RELIABLE\n");
         setup_clear_cpu_cap(X86_FEATURE_TSC_RELIABLE);
@@ -2506,21 +2490,12 @@ static int __init cf_check verify_tsc_reliability(void)
              */
             on_selected_cpus(&cpu_online_map, reset_percpu_time, NULL, 1);
 
-            /*
-             * We won't do CPU Hotplug and TSC clocksource is being used which
-             * means we have a reliable TSC, plus we don't sync with any other
-             * clocksource so no need for rendezvous.
-             */
-            time_calibration_rendezvous_fn = time_calibration_nop_rendezvous;
-
             /* Finish platform timer switch. */
             try_platform_timer_tail();
 
             printk("Switched to Platform timer %s TSC\n",
                    freq_string(plt_src.frequency));
 
-            time_calibration(NULL);
-
             return 0;
         }
     }
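
For completeness, and not as part of the patch, here is roughly the shape
I'd expect time_calibration() to end up with after this series (struct
initializer and rendezvous call elided); the early return stays as a
safety net, though an assertion there, as you suggest, would be the
stricter alternative:

static void cf_check time_calibration(void *unused)
{
    struct calibration_rendezvous r = {
        /* ... initializer unchanged ... */
    };

    /*
     * The calibration timer is no longer armed when the TSC is the
     * platform timer (see the cpu_frequency_change() and
     * local_time_calibration() hunks above), so this path should not be
     * reached in that mode.
     */
    if ( clocksource_is_tsc() )
        return;

    /* ... rendezvous via time_calibration_rendezvous_fn as before ... */
}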