[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH v3 09/15] xen/x86: introduce a new amd cppc driver for cpufreq scaling


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: "Penny, Zheng" <penny.zheng@xxxxxxx>
  • Date: Thu, 3 Apr 2025 07:40:40 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=amd.com; dmarc=pass action=none header.from=amd.com; dkim=pass header.d=amd.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=xnRVie7wqBFMKEtXvmySiuQpBycrPn+lvfydBL9UVO8=; b=X3lcZ0iD8tFQSSuB8f9cek5pwHYCf5t6XL5mlv/Ro6h8253IMu0RVgWXd0KrBfG91nrtl2IJAeVdzSsOh43x6eCvpCbskBN/sGG4dAejo3Fwm/hNA4PRHKmTvKsLdjKNLr0fseYXVEOx5Wtvj0J2EVqg1gG9UIUFsV53qrQGPwv1QmPcsHjAFvlvXoyEDKFqMhNSMKNlKK0FLZICo2nkUXe7wQYfwbi2+HV8NRdOn882QwHAK/3NS6X6nwPtPaXrVtnJyj4jMZw7kd/8hxURTcQGX1nAHllMg/nvcyshWynkS9GllnCprzK9lOdu5cdSLcsHxSVn3EjVsCAjpsIfKg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=JVuJjMUt8mcOhVCk9HEwWdukAAyXPo8Sa6RPY/yVdyETtO0don9BvDZ+uZvi6W/yVCQytXVVGE2Hs3BrWBKHqIlvwr335GEyL4Xpz7jxBYwWHwlxExpCFSlyKbRJ2hdinbIPx3kfreMUiRx25VfMkx2HNSV42cAMig7dPozYzB5MV3x1Gw+eYrE+LLHy6+hN1fo+jfn3/cFtlnA5Zzb7Mze3QYi9bwXC76otSWF2siU5tTeQ6MHykqWg7+vdg1BMsSweVde5y93aYgPfF48Dq+IYIu929l6O4dVM7v7M0P3FwaGoNzDVaMJZdHYp1D5hlhyS+/+sXbgxrwRPzlc79Q==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=amd.com;
  • Cc: "Huang, Ray" <Ray.Huang@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jason Andryuk <jandryuk@xxxxxxxxx>
  • Delivery-date: Thu, 03 Apr 2025 07:40:53 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Msip_labels: MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_ActionId=b0f85d97-5b3e-4795-abe3-9ae0a6a143c7;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_ContentBits=0;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_Enabled=true;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_Method=Privileged;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_Name=Open Source;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_SetDate=2025-04-02T07:58:09Z;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_SiteId=3dd8961f-e488-4e60-8e11-a82d994e183d;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_Tag=10, 0, 1, 1;
  • Thread-index: AQHbjnNzqp5kMkMxq06YB77LYWaCT7ODu8GAgAxxPEA=
  • Thread-topic: [PATCH v3 09/15] xen/x86: introduce a new amd cppc driver for cpufreq scaling

[Public]

> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: Tuesday, March 25, 2025 5:58 PM
> To: Penny, Zheng <penny.zheng@xxxxxxx>
> Cc: Huang, Ray <Ray.Huang@xxxxxxx>; Andrew Cooper
> <andrew.cooper3@xxxxxxxxxx>; Roger Pau Monné <roger.pau@xxxxxxxxxx>; xen-
> devel@xxxxxxxxxxxxxxxxxxxx; Jason Andryuk <jandryuk@xxxxxxxxx>
> Subject: Re: [PATCH v3 09/15] xen/x86: introduce a new amd cppc driver for
> cpufreq scaling
>
> > +         * directly use nominal_mhz and lowest_mhz as the division
> > +         * will remove the frequency unit.
> > +         */
> > +        div = div ?: 1;
>
> Imo the cppc_data->lowest_mhz >= cppc_data->nominal_mhz case better
> wouldn't make it here, but use the fallback path below. Or special- case
> cppc_data->lowest_mhz == cppc_data->nominal_mhz: mul would
> (hopefully) be zero (i.e. there would be the expectation that
> data->caps.nominal_perf == data->caps.lowest_perf, yet no guarantee
> without checking), and hence ...

Okay, I'll drop the " div = div ?: 1", to strict the if() to
```
if ( cppc_data->cpc.lowest_mhz && cppc_data->cpc.nominal_mhz &&
     (cppc_data->cpc.lowest_mhz != cppc_data->cpc.nominal_mhz) )
```

>
> > +        offset = data->caps.nominal_perf -
> > +                 (mul * cppc_data->nominal_mhz) / div;
>
> ... offset = data->caps.nominal_perf regardless of "div" (as long as that's 
> not zero).
> I.e. the "equal" case may still be fine to take this path.
>
> Or is there a check somewhere that lowest_mhz <= nominal_mhz and lowest_perf
> <= nominal_perf, which I'm simply overlooking?
>

Yes. I overlooked the scenario that lowest_mhz > nominal_mhz and lowest_perf > 
nominal_perf
and I'll add the check on first read

> > +#define amd_get_freq(name)                                                 
> >  \
>
> The macro parameter is used just ...
>
> > +    static int amd_get_##name##_freq(const struct amd_cppc_drv_data
> > + *data,  \
>
> ... here, ...
>
> > +                                     unsigned int *freq)                   
> >  \
> > +    {                                                                      
> >  \
> > +        const struct xen_processor_cppc *cppc_data = data->cppc_data;      
> >  \
> > +        uint64_t mul, div, res;                                            
> >  \
> > +                                                                           
> >  \
> > +        if ( cppc_data->name##_mhz )                                       
> >  \
> > +        {                                                                  
> >  \
> > +            /* Switch to khz */                                            
> >  \
> > +            *freq = cppc_data->name##_mhz * 1000;                          
> >  \
>
> ... twice here forthe MHz value, and ...
>
> > +            return 0;                                                      
> >  \
> > +        }                                                                  
> >  \
> > +                                                                           
> >  \
> > +        /* Read Processor Max Speed(mhz) as anchor point */                
> >  \
> > +        mul = this_cpu(amd_max_freq_mhz);                                  
> >  \
> > +        if ( !mul )                                                        
> >  \
> > +            return -EINVAL;                                                
> >  \
> > +        div = data->caps.highest_perf;                                     
> >  \
> > +        res = (mul * data->caps.name##_perf * 1000) / div;                 
> >  \
>
> ... here for the respective perf indicator. Why does it take ...
>
> > +        if ( res > UINT_MAX )                                              
> >  \
> > +        {                                                                  
> >  \
> > +            printk(XENLOG_ERR                                              
> >  \
> > +                   "Frequeny exceeds maximum value UINT_MAX: %lu\n", res); 
> >  \
> > +            return -EINVAL;                                                
> >  \
> > +        }                                                                  
> >  \
> > +        *freq = (unsigned int)res;                                         
> >  \
> > +                                                                           
> >  \
> > +        return 0;                                                          
> >  \
> > +    }                                                                      
> >  \
> > +
> > +amd_get_freq(lowest);
> > +amd_get_freq(nominal);
>
> ... two almost identical functions, when one (with two extra input 
> parameters) would
> suffice?
>

I had a draft fix here, If it doesn't what you hope for, plz let me know
```
static int amd_get_lowest_and_nominal_freq(const struct amd_cppc_drv_data *data,
                                           unsigned int *lowest_freq,
                                           unsigned int *nominal_freq)
{
    const struct xen_processor_cppc *cppc_data = data->cppc_data;
    uint64_t mul, div, res;
    uint8_t perf;

    if ( !lowest_freq && !nominal_freq )
        return -EINVAL;

    if ( lowest_freq && cppc_data->cpc.lowest_mhz )
        /* Switch to khz */
        *lowest_freq = cppc_data->cpc.lowest_mhz * 1000;

    if ( nominal_freq && cppc_data->cpc.nominal_mhz )
        /* Switch to khz */
        *nominal_freq = cppc_data->cpc.nominal_mhz * 1000;

    /* Still have unresolved frequency */
    if ( (lowest_freq && !(*lowest_freq)) ||
         (nominal_freq && !(*nominal_freq)) )
    {
        do {
            /* Calculate lowest frequency firstly if need */
            if ( lowest_freq && !(*lowest_freq) )
                perf = data->caps.lowest_perf;
            else
                perf = data->caps.nominal_perf;

            /* Read Processor Max Speed(MHz) as anchor point */
            mul = this_cpu(amd_max_pxfreq_mhz);
            if ( mul == INVAL_FREQ_MHZ || !mul )
            {
                printk(XENLOG_ERR
                       "Failed to read valid processor max frequency as anchor 
point: %lu\n",
                       mul);
                return -EINVAL;
            }
            div = data->caps.highest_perf;
            res = (mul * perf * 1000) / div;

            if ( res > UINT_MAX || !res )
            {
                printk(XENLOG_ERR
                       "Frequeny exceeds maximum value UINT_MAX or being zero 
value: %lu\n",
                       res);
                return -EINVAL;
            }

            if ( lowest_freq && !(*lowest_freq) )
                *lowest_freq = (unsigned int)res;
            else
                *nominal_freq = (unsigned int)res;
        } while ( nominal_freq && !(*nominal_freq) );
    }

    return 0;
}
```

> In amd_cppc_khz_to_perf() you have a check to avoid division by zero. Why not
> the same safeguarding here?
>

div = data->caps.highest_perf; For highest_perf non-zero check, it is already 
added
in  amd_cppc_init_msrs()

> > +static int amd_get_max_freq(const struct amd_cppc_drv_data *data,
> > +                            unsigned int *max_freq) {
> > +    unsigned int nom_freq, boost_ratio;
> > +    int res;
> > +
> > +    res = amd_get_nominal_freq(data, &nom_freq);
> > +    if ( res )
> > +        return res;
> > +
> > +    boost_ratio = (unsigned int)(data->caps.highest_perf /
> > +                                 data->caps.nominal_perf);
>
> Similarly here - I can't spot what would prevent division by zero.
>

In amd_cppc_init_msrs(), before calculating the frequency, we have checked
all caps.xxx_perf info shall not be zero.
I'll complement check to avoid "highest_perf < nominal_perf", to ensure that
the calculation result of boost_ratio must not be zero.
```
    if ( data->caps.highest_perf == 0 || data->caps.lowest_perf == 0 ||
         data->caps.nominal_perf == 0 || data->caps.lowest_nonlinear_perf == 0 
||
         data->caps.lowest_perf > data->caps.lowest_nonlinear_perf ||
         data->caps.lowest_nonlinear_perf > data->caps.nominal_perf ||
         data->caps.nominal_perf > data->caps.highest_perf )
```

>
> Jan

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.