
Re: [RFC PATCH v2 1/2] xen/memory : Add a stats_table resource type



On Thu, Feb 16, 2023 at 04:15:29PM +0100, Jan Beulich wrote:
> On 16.02.2023 16:07, Matias Ezequiel Vara Larsen wrote:
> > On Wed, Dec 14, 2022 at 08:29:53AM +0100, Jan Beulich wrote:
> >> On 07.10.2022 14:39, Matias Ezequiel Vara Larsen wrote:
> >>> @@ -287,6 +289,20 @@ static inline void vcpu_runstate_change(
> >>>      }
> >>>  
> >>>      v->runstate.state = new_state;
> >>> +
> >>> +    vcpustats_va = (shared_vcpustatspage_t*)d->vcpustats_page.va;
> >>> +    if ( vcpustats_va )
> >>> +    {
> >>> +        vcpustats_va->vcpu_info[v->vcpu_id].version =
> >>> +            version_update_begin(vcpustats_va->vcpu_info[v->vcpu_id].version);
> >>> +        smp_wmb();
> >>> +        memcpy(&vcpustats_va->vcpu_info[v->vcpu_id].runstate_running_time,
> >>> +               &v->runstate.time[RUNSTATE_running],
> >>> +               sizeof(v->runstate.time[RUNSTATE_running]));
> >>> +        smp_wmb();
> >>> +        vcpustats_va->vcpu_info[v->vcpu_id].version =
> >>> +            version_update_end(vcpustats_va->vcpu_info[v->vcpu_id].version);
> >>> +    }
> >>
> >> A further aspect to consider here is cache line ping-pong. I think the
> >> per-vCPU elements of the array want to be big enough to not share a
> >> cache line. The interface being generic this presents some challenge
> >> in determining what the supposed size is to be. However, taking into
> >> account the extensibility question, maybe the route to take is to
> >> simply settle on a power-of-2 value somewhere between x86'es and Arm's
> >> cache line sizes and the pretty common page size of 4k, e.g. 512 bytes
> >> or 1k?
> >>
> > 
> > I do not know how to address this. I was thinking of aligning each vcpu_stats
> > instance to a multiple of the cache line size. I would pick the first
> > multiple that is larger than the size of the vcpu_stats structure. For
> > example, the structure is currently 16 bytes, so I would align each instance
> > in a frame to 64 bytes. Would that make sense?
> 
> Well, 64 may be an option, but I gave higher numbers for a reason. One thing
> I don't know is what common cache line sizes are on Arm or e.g. RISC-V.

Thanks. I found that structures that require cache alignment are defined with
"__cacheline_aligned", which uses L1_CACHE_BYTES. For example, on x86 this
aligns to 128 bytes. What is the reason to use a higher value like 512 bytes or
1k?
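
To make the question concrete, here is a minimal sketch of the layout under
discussion. It is only an illustration, not the patch: STATS_CACHE_BYTES is a
stand-in I made up for L1_CACHE_BYTES, the struct name and padding field are
hypothetical, and the helpers exist only to show the resulting stride. It
assumes a GCC/Clang-style aligned attribute and a 4k frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for Xen's L1_CACHE_BYTES (128 on x86, as noted above). */
#define STATS_CACHE_BYTES 128
/* Assumption: the stats area is handed out in 4k frames. */
#define STATS_FRAME_BYTES 4096

/*
 * Hypothetical per-vCPU element. The payload is 16 bytes, but the aligned
 * attribute rounds sizeof() up to STATS_CACHE_BYTES, so two vCPUs never
 * share a cache line.
 */
struct vcpu_stats_elem {
    uint32_t version;               /* odd while an update is in progress */
    uint32_t pad0;                  /* explicit padding, illustrative only */
    uint64_t runstate_running_time;
} __attribute__((aligned(STATS_CACHE_BYTES)));

/* The stride must be a whole number of cache lines. */
static_assert(sizeof(struct vcpu_stats_elem) % STATS_CACHE_BYTES == 0,
              "element must not straddle a cache line boundary");

/* Byte offset of a vCPU's element from the start of the shared area. */
static inline size_t vcpu_stats_offset(unsigned int vcpu_id)
{
    return (size_t)vcpu_id * sizeof(struct vcpu_stats_elem);
}

/* Number of per-vCPU elements that fit in one frame. */
static inline unsigned int vcpu_stats_per_frame(void)
{
    return (unsigned int)(STATS_FRAME_BYTES / sizeof(struct vcpu_stats_elem));
}
```

With a 128-byte stride, 32 vCPUs fit in a frame. If I understand the
extensibility point correctly, a fixed 512-byte stride would drop that to 8
per frame but leave room to add fields later without changing the stride
that consumers have to know about.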

Thanks, Matias. 



 

