
Re: [Xen-devel] [PATCH v2 07/15] x86: implement set value flow for MBA



On 17-08-30 09:31:04, Roger Pau Monné wrote:
> On Thu, Aug 24, 2017 at 09:14:41AM +0800, Yi Sun wrote:
> > It also changes the members in 'cos_write_info' to transfer the
> > feature array, feature properties array and value array. Then, we
> > can write all features values on the cos id into MSRs.
> > 
> > Because multiple features may co-exist, we need to handle all features to
> > write their values into a COS register with the new COS ID. E.g.:
> > 1. L3 CAT and MBA co-exist.
> > 2. Dom1 and Dom2 share the same COS ID (2). The L3 CAT CBM of Dom1 is 0x1ff,
> >    the MBA Thrtl of Dom1 is 0xa.
> > 3. User wants to change MBA Thrtl of Dom1 to be 0x14. Because COS ID 2 is
> >    used by Dom2 too, we have to pick a new COS ID 3. The original values of
> >    Dom1 on COS ID 3 may be below:
> 
> What original values? You said you pick COS ID 3, because I think it's
> assumed to be empty? In which case there are no original values in COS
> ID 3.
> 
Sorry for the confusion. "Original value" here means the default value. For CAT,
it is 0x7ff on my machine, as shown below.

> >            ---------
> >            | COS 3 |
> >            ---------
> >    L3 CAT  | 0x7ff |
> >            ---------
> >    MBA     | 0x0   |
> >            ---------
> > diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> > index 4a0c982..ce82975 100644
> > --- a/xen/arch/x86/psr.c
> > +++ b/xen/arch/x86/psr.c
> > @@ -138,6 +138,12 @@ static const struct feat_props {
> >  
> >      /* write_msr is used to write out feature MSR register. */
> >      void (*write_msr)(unsigned int cos, uint32_t val, enum psr_val_type type);
> > +
> > +    /*
> > +     * check_val is used to check if input val fulfills SDM requirement.
> > +     * Change it to valid value if SDM allows.
> 
> I'm not really sure it's a good idea to change the value to a valid
> one, IMHO you should just check and print an error if the value is
> invalid (and return false of course).
> 
Per the SDM, the HW can automatically adjust the input value to one it supports.
E.g.:
  Linear mode: HW expects the input value to be 10/20/30/.../90, but the user
               inputs 15. HW then automatically rounds it down to 10.

Even if the user inputs a value that does not fulfill the HW requirement, HW can
handle it. So, we do not need to return an error to the user. Otherwise, the user
would need to know the details of MBA.
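
To make the linear-mode rounding concrete, here is a minimal sketch. The name
mba_round_linear is hypothetical and not part of the patch, and the sketch
assumes thrtl_max is a percentage below 100:

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the patch: round a linear-mode MBA
 * throttle value down to a multiple of the input precision.
 * The precision is 100 - thrtl_max, as in the SDM text quoted in the patch.
 */
static unsigned long mba_round_linear(unsigned long thrtl,
                                      unsigned long thrtl_max)
{
    unsigned long precision;

    /* Assumption: thrtl_max is below 100, so precision is never zero. */
    assert(thrtl_max < 100);
    precision = 100 - thrtl_max;

    /* Drop the remainder so the result is a multiple of precision. */
    return thrtl - (thrtl % precision);
}
```

With thrtl_max = 90 the precision is 10, so an input of 15 becomes 10, which
matches the example above.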

But the issue here is how we get the actual value and show it to the user. There
are two ways to do that:
1. When setting the value, check it, change it to a valid one, and save it to
   our cache.
2. When getting the value, call rdmsr to read the actual value back from HW.

I think option 1 performs better, since no MSR read is needed when getting the
value, and the code looks cleaner.

> > +     */
> > +    bool (*check_val)(const struct feat_node *feat, unsigned long *val);
> >  } *feat_props[FEAT_TYPE_NUM];
> >  
[...]

> >  /* L3 CAT props */
> >  static void l3_cat_write_msr(unsigned int cos, uint32_t val,
> >                               enum psr_val_type type)
> > @@ -446,6 +453,7 @@ static const struct feat_props l3_cat_props = {
> >      .alt_type = PSR_VAL_TYPE_UNKNOWN,
> >      .get_feat_info = cat_get_feat_info,
> >      .write_msr = l3_cat_write_msr,
> > +    .check_val = cat_check_cbm,
> >  };
> 
> Maybe the introduction of check_val should be a separate patch? It's
> mostly code movement and some fixup.
> 
OK, I may consider it.

[...]

> > +static bool mba_check_thrtl(const struct feat_node *feat, unsigned long *thrtl)
> > +{
> > +    if ( *thrtl > feat->mba_info.thrtl_max )
> > +        return false;
> > +
> > +    /*
> > +     * Per SDM (chapter "Memory Bandwidth Allocation Configuration"):
> > +     * 1. Linear mode: In the linear mode the input precision is defined
> > +     *    as 100-(MBA_MAX). For instance, if the MBA_MAX value is 90, the
> > +     *    input precision is 10%. Values not an even multiple of the
> > +     *    precision (e.g., 12%) will be rounded down (e.g., to 10% delay
> > +     *    applied).
> > +     * 2. Non-linear mode: Input delay values are powers-of-two from zero
> > +     *    to the MBA_MAX value from CPUID. In this case any values not a
> > +     *    power of two will be rounded down the next nearest power of two.
> > +     */
> > +    if ( feat->mba_info.linear )
> > +    {
> > +        unsigned int mod;
> > +
> > +        mod = *thrtl % (100 - feat->mba_info.thrtl_max);
> > +        *thrtl -= mod;
> > +    }
> > +    else
> > +    {
> > +        /* Not power of 2. */
> > +        if ( *thrtl && (*thrtl & (*thrtl - 1)) )
> 
> This can be joined with the else to avoid another indentation level:
> 
> else if ( *thrtl && (*thrtl & (*thrtl - 1)) )
> ...
> 
Thanks!

> > +            *thrtl = *thrtl & (1 << (flsl(*thrtl) - 1));
> > +    }
> > +
> > +    return true;
> >  }
> >  
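For reference, the non-linear rounding can also be sketched without flsl().
This is only an illustration under my own naming (mba_round_pow2 is
hypothetical, not part of the patch):

```c
/*
 * Hypothetical helper, not part of the patch: round a non-linear MBA
 * throttle value down to the nearest power of two. Zero stays zero.
 */
static unsigned long mba_round_pow2(unsigned long thrtl)
{
    /* Clearing the lowest set bit repeatedly leaves only the highest one. */
    while ( thrtl & (thrtl - 1) )
        thrtl &= thrtl - 1;

    return thrtl;
}
```

An input of 12 becomes 8, while 16 is already a power of two and is returned
unchanged.
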
[...]

> >  static void do_write_psr_msrs(void *data)
> >  {
> >      const struct cos_write_info *info = data;
> > -    struct feat_node *feat = info->feature;
> > -    const struct feat_props *props = info->props;
> > -    unsigned int i, cos = info->cos, cos_num = props->cos_num;
> > +    unsigned int i, j, index = 0, array_len = info->array_len, cos = info->cos;
> > +    const uint32_t *val_array = info->val;
> >  
> > -    for ( i = 0; i < cos_num; i++ )
> > +    for ( i = 0; i < ARRAY_SIZE(feat_props); i++ )
> >      {
> 
> index and j can be defined here, they are only used inside of this for
> loop AFAICT.
> 
I think the definition of j can be moved into the loop. But index cannot, unless
I declare it 'static': it is used as an accumulator across iterations.
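
A simplified sketch of why index has to live outside the loop body. The
function feature_offsets and its parameters are hypothetical stand-ins, not
code from the patch. It assumes every feature's values are stored back to back
in one array, as in do_write_psr_msrs():

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the outer loop of do_write_psr_msrs():
 * cos_num[i] is the number of values feature i owns, and offset[i]
 * receives the position of its first value in the flat value array.
 * Returns 0 on success, -1 if the value array is too short.
 */
static int feature_offsets(const unsigned int cos_num[], size_t nr_feats,
                           size_t array_len, size_t offset[])
{
    size_t i, index = 0;   /* index accumulates across features */

    for ( i = 0; i < nr_feats; i++ )
    {
        if ( array_len < cos_num[i] )
            return -1;

        offset[i] = index;

        /* Consume this feature's values before moving to the next one. */
        array_len -= cos_num[i];
        index += cos_num[i];
    }

    return 0;
}
```

If index were declared inside the loop body it would restart from 0 for every
feature, so each feature would read the first feature's values.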

> > -        if ( feat->cos_reg_val[cos * cos_num + i] != info->val[i] )
> > +        struct feat_node *feat = info->features[i];
> > +        const struct feat_props *props = info->props[i];
> > +        unsigned int cos_num;
> > +
> > +        if ( !feat || !props )
> > +            continue;
> > +
> > +        cos_num = props->cos_num;
> > +        if ( array_len < cos_num )
> > +            return;
> > +
> > +        for ( j = 0; j < cos_num; j++ )
> >          {
> > -            feat->cos_reg_val[cos * cos_num + i] = info->val[i];
> > -            props->write_msr(cos, info->val[i], props->type[i]);
> > +            if ( feat->cos_reg_val[cos * cos_num + j] != val_array[index + j] )
> > +            {
> > +                feat->cos_reg_val[cos * cos_num + j] = val_array[index + j];
> > +                props->write_msr(cos, val_array[index + j], props->type[j]);
> > +            }
> >          }
> > +
> > +        array_len -= cos_num;
> > +        index += cos_num;
> >      }
> >  }
> >  
> > @@ -1224,30 +1289,19 @@ static int write_psr_msrs(unsigned int socket, unsigned int cos,
> >                            const uint32_t val[], unsigned int array_len,
> >                            enum psr_feat_type feat_type)
> >  {
> > -    int ret;
> >      struct psr_socket_info *info = get_socket_info(socket);
> 
> info should probably be const here.
> 
info cannot be const, because 'features' in 'cos_write_info' cannot be const.

> >      struct cos_write_info data =
> >      {
> >          .cos = cos,
> > -        .feature = info->features[feat_type],
> > -        .props = feat_props[feat_type],
> > +        .features = info->features,
> > +        .val = val,
> > +        .array_len = array_len,
> > +        .props = feat_props,
> >      };
> 
> AFAICT data should also be const, but I guess this is not going to
> work because on_selected_cpus expects a non-const payload?
> 
Right, data cannot be const because of the 'on_selected_cpus' requirement.

> Thanks, Roger.


 

