Xen project Mailing List

Re: [Xen-devel] [PATCH v2 3/6] x86/time: implement tsc as clocksource

From: Joao Martins <joao.m.martins@xxxxxxxxxx>

Date: Tue, 5 Apr 2016 15:56:15 +0100

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Keir Fraser <keir@xxxxxxx>, xen-devel@xxxxxxxxxxxxx

Delivery-date: Tue, 05 Apr 2016 14:56:28 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 04/05/2016 11:43 AM, Jan Beulich wrote: >>>> On 29.03.16 at 15:44, <joao.m.martins@xxxxxxxxxx> wrote: >> Introduce support for using TSC as platform time which is the highest >> resolution time and most performant to get (~20 nsecs). Though there >> are also several problems associated with its usage, and there isn't a >> complete (and architecturally defined) guarantee that all machines >> will provide reliable and monotonic TSC across all CPUs, on different >> sockets and different P/C states. I believe Intel to be the only that >> can guarantee that. For this reason it's only set when adminstrator >> changes "clocksource" boot option to "tsc". Initializing TSC >> clocksource requires all CPUs to have the tsc reliability checks >> performed. init_xen_time is called before all CPUs are up, and so we >> start with HPET at boot time, and switch later to TSC. > > Please don't make possibly wrong statements: There's no guarantee > we'd start with HPET - might as well be PMTMR or PIT. My apologies - while the commit message doesn't express this, I meant it to be "for example we would start with HPET (...)". I gave that example as it's the one preferable in plt_timers. > >> The switch then >> happens on verify_tsc_reliability initcall that is invoked when all >> CPUs are up. When attempting to initializing TSC we also check for >> time warps and appropriate CPU features i.e. TSC_RELIABLE, >> CONSTANT_TSC and NONSTOP_TSC. And in case none of these conditions are >> met, we keep the clocksource that was previously initialized on >> init_xen_time. > > DYM "And in case any of these conditions is not met, ..."? Yes. My use of "none" may be wrong here so just to be sure what I mean is: (...) If all of the conditions in the previous sentence are not met (...) > Or, > looking at the code, something more complex? > >> --- a/xen/arch/x86/time.c >> +++ b/xen/arch/x86/time.c >> @@ -432,6 +432,63 @@ uint64_t ns_to_acpi_pm_tick(uint64_t ns) >> } >> >> /************************************************************ >> + * PLATFORM TIMER 4: TSC >> + */ >> +static u64 tsc_freq; >> +static unsigned long tsc_max_warp; >> +static void tsc_check_reliability(void); >> + >> +static int __init init_tsctimer(struct platform_timesource *pts) >> +{ >> + bool_t tsc_reliable = 0; >> + >> + tsc_check_reliability(); >> + >> + if ( tsc_max_warp > 0 ) >> + { >> + tsc_reliable = 0; > > Redundant assignment. > Yes, Konrad had point that out too. Fixed that. >> +static void resume_tsctimer(struct platform_timesource *pts) >> +{ >> +} > > No need for an empty function - other clock sources don't have > such empty stubs either (i.e. the caller checks for NULL). > OK. >> @@ -541,6 +613,10 @@ static int __init try_platform_timer(struct >> platform_timesource *pts) >> if ( rc <= 0 ) >> return rc; >> >> + /* We have a platform timesource already so reset it */ >> + if ( plt_src.counter_bits != 0 ) >> + reset_platform_timer(); > > What if any of the time functions get used between here and the > setting of the new clock source? For example, what will happen to > log messages when "console_timestamps=..." is in effect? time functions will use NOW() (i.e. get_s_time) which uses cpu_time structs being updated on local_time_calibration() or cpu frequency changes. reset_platform_timer() will zero out some of the variables used by the plt_overflow and read_platform_stime(). The overflow and calibration isn't an issue as the timers are previously killed so any further calls will use what's on cpu_time while plt_src is being changed. The case I could see this being different is if cpu_frequency_change is called between reset_platform_time() and the next update of stime_platform_stamp. In this situation the calculation would end up a bit different meaning subsequent calls of read_platform_stime() would return the equivalent to scale_delta(plt_src->read_counter(), plt_scale), as opposed to computing with a previous taken stime_platform_stamp and incrementing a difference with a counter read. But can this situation actually happen? > >> @@ -566,7 +642,9 @@ static void __init init_platform_timer(void) >> struct platform_timesource *pts = NULL; >> int i, rc = -1; >> >> - if ( opt_clocksource[0] != '\0' ) >> + /* clocksource=tsc is initialized later when all CPUS are up */ >> + if ( (opt_clocksource[0] != '\0') && >> + (strcmp(opt_clocksource, "tsc") != 0) ) > > In line with the inverse check done below (using ! instead of == 0) > please omit the != 0 here. > OK. >> @@ -1437,6 +1515,20 @@ static int __init verify_tsc_reliability(void) >> } >> } >> >> + if ( !strcmp(opt_clocksource, "tsc") ) >> + { >> + if ( try_platform_timer(&plt_tsc) > 0 ) >> + { >> + printk(XENLOG_INFO "Switched to Platform timer %s TSC\n", >> + freq_string(plt_src.frequency)); >> + >> + init_percpu_time(); > > So you re-do this for the BP, but not for any of the APs? I had overlooked the invocation made in start_secondary(). I also need to redo this on the APs. > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@xxxxxxxxxxxxx > http://lists.xen.org/xen-devel > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.