
Re: [PATCH v3 2/4] x86/APIC: calibrate against platform timer when possible


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 15 Mar 2022 10:12:51 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 15 Mar 2022 09:13:13 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, Mar 14, 2022 at 05:19:37PM +0100, Jan Beulich wrote:
> On 11.03.2022 14:45, Roger Pau Monné wrote:
> > On Mon, Feb 14, 2022 at 10:25:11AM +0100, Jan Beulich wrote:
> >> Use the original calibration against PIT only when the platform timer
> >> is PIT. This implicitly excludes the "xen_guest" case from using the PIT
> >> logic (init_pit() fails there, and as of 5e73b2594c54 ["x86/time: minor
> >> adjustments to init_pit()"] using_pit also isn't being set too early
> >> anymore), so the respective hack there can be dropped at the same time.
> >> This also reduces calibration time from 100ms to 50ms, albeit this step
> >> is being skipped as of 0731a56c7c72 ("x86/APIC: no need for timer
> >> calibration when using TDT") anyway.
> >>
> >> While re-indenting the PIT logic in calibrate_APIC_clock(), besides
> >> adjusting style also switch around the 2nd TSC/TMCCT read pair, to match
> >> the order of the 1st one, yielding more consistent deltas.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> >> ---
> >> Open-coding apic_read() in read_tmcct() isn't overly nice, but I wanted
> >> to avoid x2apic_enabled being evaluated twice in close succession. (The
> >> barrier is there just in case only anyway: While this RDMSR isn't
> >> serializing, I'm unaware of any statement whether it can also be
> >> executed speculatively, like RDTSC can.) An option might be to move the
> >> function to apic.c such that it would also be used by
> >> calibrate_APIC_clock().
> > 
> > I think that would make sense. Or else it's kind of orthogonal that we
> > use a barrier in calibrate_apic_timer but not in calibrate_APIC_clock.
> 
> But there is a barrier there, via rdtsc_ordered(). Thinking about
> this again, I'm not even sure I'd like to use the helper in
> calibrate_APIC_clock(), as there's no need to have two barriers
> there.
> 
> But I guess I'll move the function in any event, so it at least
> feels less like a layering violation. But I still would want to
> avoid calling apic_read(), i.e. the function would remain as is
> (albeit perhaps renamed as becoming non-static).
> 
> > But maybe we can get rid of the open-coded PIT calibration in
> > calibrate_APIC_clock? (see below)
> > 
> >> --- a/xen/arch/x86/time.c
> >> +++ b/xen/arch/x86/time.c
> >> @@ -26,6 +26,7 @@
> >>  #include <xen/symbols.h>
> >>  #include <xen/keyhandler.h>
> >>  #include <xen/guest_access.h>
> >> +#include <asm/apic.h>
> >>  #include <asm/io.h>
> >>  #include <asm/iocap.h>
> >>  #include <asm/msr.h>
> >> @@ -1004,6 +1005,78 @@ static u64 __init init_platform_timer(vo
> >>      return rc;
> >>  }
> >>  
> >> +static uint32_t __init read_tmcct(void)
> >> +{
> >> +    if ( x2apic_enabled )
> >> +    {
> >> +        alternative("lfence", "mfence", X86_FEATURE_MFENCE_RDTSC);
> >> +        return apic_rdmsr(APIC_TMCCT);
> >> +    }
> >> +
> >> +    return apic_mem_read(APIC_TMCCT);
> >> +}
> >> +
> >> +static uint64_t __init read_pt_and_tmcct(uint32_t *tmcct)
> >> +{
> >> +    uint32_t tmcct_prev = *tmcct = read_tmcct(), tmcct_min = ~0;
> >> +    uint64_t best = best;
> >> +    unsigned int i;
> >> +
> >> +    for ( i = 0; ; ++i )
> >> +    {
> >> +        uint64_t pt = plt_src.read_counter();
> >> +        uint32_t tmcct_cur = read_tmcct();
> >> +        uint32_t tmcct_delta = tmcct_prev - tmcct_cur;
> >> +
> >> +        if ( tmcct_delta < tmcct_min )
> >> +        {
> >> +            tmcct_min = tmcct_delta;
> >> +            *tmcct = tmcct_cur;
> >> +            best = pt;
> >> +        }
> >> +        else if ( i > 2 )
> >> +            break;
> >> +
> >> +        tmcct_prev = tmcct_cur;
> >> +    }
> >> +
> >> +    return best;
> >> +}
> >> +
> >> +uint64_t __init calibrate_apic_timer(void)
> >> +{
> >> +    uint32_t start, end;
> >> +    uint64_t count = read_pt_and_tmcct(&start), elapsed;
> >> +    uint64_t target = CALIBRATE_VALUE(plt_src.frequency), actual;
> >> +    uint64_t mask = (uint64_t)~0 >> (64 - plt_src.counter_bits);
> >> +
> >> +    /*
> >> +     * PIT cannot be used here as it requires the timer interrupt to maintain
> >> +     * its 32-bit software counter, yet here we run with IRQs disabled.
> >> +     */
> > 
> > The reasoning in calibrate_APIC_clock to have interrupts disabled
> > doesn't apply anymore I would think (interrupts are already enabled
> > when we get there),
> 
> setup_boot_APIC_clock() disables IRQs before calling
> calibrate_APIC_clock(). Whether the reasoning still applies is hard
> to tell - I at least cannot claim I fully understand the concern.

Me neither. I'm not sure what would specifically need the first
interrupt, or why later interrupts wouldn't be fine.

Also, interrupts are already enabled by the time we get to
setup_boot_APIC_clock() (it is that function which disables them before
calling calibrate_APIC_clock()), so the whole reasoning about getting
the first interrupt seems bogus.

> > and hence it seems to me that calibrate_APIC_clock
> > could be called with interrupts enabled and we could remove the
> > open-coded usage of the PIT in calibrate_APIC_clock.
> 
> I won't exclude this might be possible, but it would mean changing
> a path which is hardly ever used nowadays. While on one hand this
> means hardly anyone might notice, otoh it also means possible
> breakage might not be noticed until far in the future. It anyway
> feels too much for a single change to also alter calibration against
> PIT right here.

You are already changing this path by using a clocksource other than
the PIT to perform the calibration.

> One thing seems quite clear though: Doing any of this with interrupts
> enabled increases the chances for the read pairs to not properly
> correlate, due to an interrupt happening in the middle. This alone is
> a reason for me to want to keep IRQs off here.

Right, TSC calibration is also done with interrupts disabled, so it
does seem correct to do the same here for APIC.
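
To illustrate why an undisturbed read pair matters, here is a minimal
standalone sketch of the "keep the tightest pair" idea used by
read_pt_and_tmcct() in the patch. It is not Xen code: the sim_*()
helpers, struct pair and read_pair_best() are hypothetical names, and
the two counters are simulated so the example runs in user space. The
first two reference reads are made artificially slow, standing in for
an interrupt or similar delay landing in the middle of a pair.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simulation: both counters derive from one time base. */
static uint64_t sim_now;
static unsigned int sim_ref_reads;

/* Up-counting reference; the first two reads are "disturbed" and slow. */
static uint64_t sim_read_ref(void)
{
    sim_now += (sim_ref_reads++ < 2) ? 40 : 3;
    return sim_now;
}

/* Down-counter standing in for the APIC current-count register. */
static uint32_t sim_read_down(void)
{
    sim_now += 1;
    assert(sim_now < 100000); /* simulated counter must not underflow */
    return (uint32_t)(100000 - sim_now);
}

struct pair {
    uint64_t ref;   /* reference counter value of the best pair */
    uint32_t down;  /* down-counter value of the best pair */
    uint32_t gap;   /* down-counter ticks spent obtaining that pair */
};

/*
 * Read (ref, down) pairs repeatedly and keep the one for which the
 * down-counter advanced least between consecutive reads, i.e. the pair
 * least likely to have been disturbed. Stop once a pair fails to
 * improve on the best one after a few attempts.
 */
static struct pair read_pair_best(void)
{
    struct pair best = { 0, 0, UINT32_MAX };
    uint32_t prev = sim_read_down();
    unsigned int i;

    for ( i = 0; ; ++i )
    {
        uint64_t ref = sim_read_ref();
        uint32_t cur = sim_read_down();
        uint32_t gap = prev - cur;

        if ( gap < best.gap )
        {
            best.ref = ref;
            best.down = cur;
            best.gap = gap;
        }
        else if ( i > 2 )
            break;

        prev = cur;
    }

    return best;
}
```

With the simulated delays above, the two slow pairs (gap 41) are
discarded in favour of the first quiet one (gap 4). With interrupts
enabled, more attempts could be disturbed and the loop would have a
harder time finding a tight pair.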

Maybe it would be cleaner to hide the PIT-specific logic in
calibrate_apic_timer(), so that we could remove get_8254_timer_count()
and wait_8254_wraparound() from apic.c, leaving apic.c without any
PIT-specific code?

I think using channel 2 like it's used for the TSC calibration won't
be possible at this point, since it would skew read_pit_count() users?
In any case, if we disable interrupts those users will already be
skewed, because the timer won't be rearmed until interrupts are enabled
again.
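
As a rough illustration of the quoted comment about the PIT needing the
timer interrupt to maintain its 32-bit software counter, the following
standalone sketch models a 16-bit hardware counter that software
extends by counting wraparounds in an interrupt handler. This is a
deliberately simplified model, not the Xen implementation: struct
sim_pit, sim_pit_advance() and sim_pit_read() are hypothetical names,
and wraps that happen while interrupts are off are simply assumed lost.

```c
#include <stdbool.h>
#include <stdint.h>

#define HW_PERIOD 0x10000u /* a 16-bit hardware counter wraps every 65536 ticks */

struct sim_pit {
    uint64_t true_ticks; /* ground truth: ticks that really elapsed */
    uint64_t sw_wraps;   /* wraparounds noticed by the (simulated) IRQ handler */
};

/*
 * Let 'ticks' hardware ticks pass. Wraparounds are only accounted for
 * when interrupts are enabled, since the handler is what counts them.
 */
static void sim_pit_advance(struct sim_pit *t, uint64_t ticks, bool irqs_enabled)
{
    uint64_t wraps_before = t->true_ticks / HW_PERIOD;

    t->true_ticks += ticks;

    if ( irqs_enabled )
        t->sw_wraps += t->true_ticks / HW_PERIOD - wraps_before;
}

/* Software-extended counter: noticed wraps plus the current 16-bit value. */
static uint64_t sim_pit_read(const struct sim_pit *t)
{
    return t->sw_wraps * HW_PERIOD + t->true_ticks % HW_PERIOD;
}
```

In this model, advancing by 100000 ticks with interrupts enabled reads
back as 100000, whereas a further 200000 ticks with interrupts disabled
only moves the reading to 103392, because three wraparounds go
unnoticed. A calibration window measured that way would come out far
too short.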

Thanks, Roger.



 

