[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 3/6] x86/time: implement tsc as clocksource



On 04/05/2016 11:43 AM, Jan Beulich wrote:
>>>> On 29.03.16 at 15:44, <joao.m.martins@xxxxxxxxxx> wrote:
>> Introduce support for using TSC as platform time which is the highest
>> resolution time and most performant to get (~20 nsecs).  Though there
>> are also several problems associated with its usage, and there isn't a
>> complete (and architecturally defined) guarantee that all machines
>> will provide reliable and monotonic TSC across all CPUs, on different
>> sockets and different P/C states.  I believe Intel to be the only that
>> can guarantee that. For this reason it's only set when adminstrator
>> changes "clocksource" boot option to "tsc". Initializing TSC
>> clocksource requires all CPUs to have the tsc reliability checks
>> performed. init_xen_time is called before all CPUs are up, and so we
>> start with HPET at boot time, and switch later to TSC.
> 
> Please don't make possibly wrong statements: There's no guarantee
> we'd start with HPET - might as well be PMTMR or PIT.
My apologies - while the commit message doesn't express this, I meant it to be
"for example we would start with HPET (...)". I gave that example as it's the
one preferable in plt_timers.

> 
>> The switch then
>> happens on verify_tsc_reliability initcall that is invoked when all
>> CPUs are up. When attempting to initializing TSC we also check for
>> time warps and appropriate CPU features i.e.  TSC_RELIABLE,
>> CONSTANT_TSC and NONSTOP_TSC. And in case none of these conditions are
>> met, we keep the clocksource that was previously initialized on
>> init_xen_time.
> 
> DYM "And in case any of these conditions is not met, ..."?
Yes. My use of "none" may be wrong here so just to be sure what I mean is: (...)
If all of the conditions in the previous sentence are not met (...)

> Or,
> looking at the code, something more complex?
> 
>> --- a/xen/arch/x86/time.c
>> +++ b/xen/arch/x86/time.c
>> @@ -432,6 +432,63 @@ uint64_t ns_to_acpi_pm_tick(uint64_t ns)
>>  }
>>  
>>  /************************************************************
>> + * PLATFORM TIMER 4: TSC
>> + */
>> +static u64 tsc_freq;
>> +static unsigned long tsc_max_warp;
>> +static void tsc_check_reliability(void);
>> +
>> +static int __init init_tsctimer(struct platform_timesource *pts)
>> +{
>> +    bool_t tsc_reliable = 0;
>> +
>> +    tsc_check_reliability();
>> +
>> +    if ( tsc_max_warp > 0 )
>> +    {
>> +        tsc_reliable = 0;
> 
> Redundant assignment.
> 
Yes, Konrad had point that out too. Fixed that.

>> +static void resume_tsctimer(struct platform_timesource *pts)
>> +{
>> +}
> 
> No need for an empty function - other clock sources don't have
> such empty stubs either (i.e. the caller checks for NULL).
> 
OK.

>> @@ -541,6 +613,10 @@ static int __init try_platform_timer(struct 
>> platform_timesource *pts)
>>      if ( rc <= 0 )
>>          return rc;
>>  
>> +    /* We have a platform timesource already so reset it */
>> +    if ( plt_src.counter_bits != 0 )
>> +        reset_platform_timer();
> 
> What if any of the time functions get used between here and the
> setting of the new clock source? For example, what will happen to
> log messages when "console_timestamps=..." is in effect?
time functions will use NOW() (i.e. get_s_time) which uses cpu_time structs
being updated on local_time_calibration() or cpu frequency changes.
reset_platform_timer() will zero out some of the variables used by the
plt_overflow and read_platform_stime(). The overflow and calibration isn't an
issue as the timers are previously killed so any further calls will use what's
on cpu_time while plt_src is being changed.

The case I could see this being different is if cpu_frequency_change is called
between reset_platform_time() and the next update of stime_platform_stamp. In
this situation the calculation would end up a bit different meaning subsequent
calls of read_platform_stime() would return the equivalent to
scale_delta(plt_src->read_counter(), plt_scale), as opposed to computing with a
previous taken stime_platform_stamp and incrementing a difference with a counter
read. But can this situation actually happen?

> 
>> @@ -566,7 +642,9 @@ static void __init init_platform_timer(void)
>>      struct platform_timesource *pts = NULL;
>>      int i, rc = -1;
>>  
>> -    if ( opt_clocksource[0] != '\0' )
>> +    /* clocksource=tsc is initialized later when all CPUS are up */
>> +    if ( (opt_clocksource[0] != '\0') &&
>> +         (strcmp(opt_clocksource, "tsc") != 0) )
> 
> In line with the inverse check done below (using ! instead of == 0)
> please omit the != 0 here.
> 
OK.

>> @@ -1437,6 +1515,20 @@ static int __init verify_tsc_reliability(void)
>>          }
>>      }
>>  
>> +    if ( !strcmp(opt_clocksource, "tsc") )
>> +    {
>> +        if ( try_platform_timer(&plt_tsc) > 0 )
>> +        {
>> +            printk(XENLOG_INFO "Switched to Platform timer %s TSC\n",
>> +                   freq_string(plt_src.frequency));
>> +
>> +            init_percpu_time();
> 
> So you re-do this for the BP, but not for any of the APs?
I had overlooked the invocation made in start_secondary(). I also need to redo
this on the APs.

> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.