[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH v2 1/2] xen/memory : Add a stats_table resource type



On Wed, Dec 14, 2022 at 08:29:53AM +0100, Jan Beulich wrote:
> On 07.10.2022 14:39, Matias Ezequiel Vara Larsen wrote:
> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -1078,6 +1078,12 @@ unsigned int ioreq_server_max_frames(const struct 
> > domain *d)
> >       return nr;
> >  }
> >  
> > +unsigned int stats_table_max_frames(const struct domain *d)
> > +{
> > +    /* One frame per 512 vcpus. */
> > +    return 1;
> > +}
> 
> Beyond my earlier comment (and irrespective of this needing changing
> anyway): I guess this "512" was not updated to match the now larger
> size of struct vcpu_stats?

In the next series, I am calculating the number of frames by:

nr = DIV_ROUND_UP(d->max_vcpus * sizeof(struct vcpu_stats), PAGE_SIZE);

> 
> > +static int stats_vcpu_alloc_mfn(struct domain *d)
> > +{
> > +    struct page_info *pg;
> > +
> > +    pg = alloc_domheap_page(d, MEMF_no_refcount);
> 
> The ioreq and vmtrace resources are also allocated this way, but they're
> HVM-specific. The one here being supposed to be VM-type independent, I'm
> afraid such pages will be accessible by an "owning" PV domain (it'll
> need to guess the MFN, but that's no excuse).
> 
> > +    if ( !pg )
> > +        return -ENOMEM;
> > +
> > +    if ( !get_page_and_type(pg, d, PGT_writable_page) ) {
> > +        put_page_alloc_ref(pg);
> > +        return -ENODATA;
> > +    }
> > +
> > +    d->vcpustats_page.va = __map_domain_page_global(pg);
> > +    if ( !d->vcpustats_page.va )
> > +        goto fail;
> > +
> > +    d->vcpustats_page.pg = pg;
> > +    clear_page(d->vcpustats_page.va);
> 
> Beyond my earlier comment: I think that by the time the surrounding
> hypercall returns the page ought to contain valid data. Otherwise I
> see no way for the consumer to know from which point on the data is
> going to be valid.
> 
> > @@ -287,6 +289,20 @@ static inline void vcpu_runstate_change(
> >      }
> >  
> >      v->runstate.state = new_state;
> > +
> > +    vcpustats_va = (shared_vcpustatspage_t*)d->vcpustats_page.va;
> > +    if ( vcpustats_va )
> > +    {
> > +   vcpustats_va->vcpu_info[v->vcpu_id].version =
> > +       version_update_begin(vcpustats_va->vcpu_info[v->vcpu_id].version);
> > +        smp_wmb();
> > +        memcpy(&vcpustats_va->vcpu_info[v->vcpu_id].runstate_running_time,
> > +               &v->runstate.time[RUNSTATE_running],
> > +               sizeof(v->runstate.time[RUNSTATE_running]));
> > +        smp_wmb();
> > +        vcpustats_va->vcpu_info[v->vcpu_id].version =
> > +            
> > version_update_end(vcpustats_va->vcpu_info[v->vcpu_id].version);
> > +    }
> 
> A further aspect to consider here is cache line ping-pong. I think the
> per-vCPU elements of the array want to be big enough to not share a
> cache line. The interface being generic this presents some challenge
> in determining what the supposed size is to be. However, taking into
> account the extensibility question, maybe the route to take is to
> simply settle on a power-of-2 value somewhere between x86'es and Arm's
> cache line sizes and the pretty common page size of 4k, e.g. 512 bytes
> or 1k?
> 

I do not now how to address this. I was thinking to align each vcpu_stats
instance to a multiple of the cache-line. I would pick up the first multiple
that is bigger to the size of the vcpu_stats structure. For example, currently
the structure is 16 bytes so I would align each instance in a frame to 64
bytes. Would it make sense? 

> > --- a/xen/include/public/vcpu.h
> > +++ b/xen/include/public/vcpu.h
> > @@ -235,6 +235,22 @@ struct vcpu_register_time_memory_area {
> >  typedef struct vcpu_register_time_memory_area 
> > vcpu_register_time_memory_area_t;
> >  DEFINE_XEN_GUEST_HANDLE(vcpu_register_time_memory_area_t);
> >  
> > +struct vcpu_stats{
> > +    /* If the least-significant bit of the version number is set then an 
> > update
> > +     * is in progress and the guest must wait to read a consistent set of 
> > values
> > +     * This mechanism is similar to Linux's seqlock.
> > +     */
> > +    uint32_t version;
> > +    uint32_t pad0;
> > +    uint64_t runstate_running_time;
> > +};
> > +
> > +struct shared_vcpustatspage {
> > +    struct vcpu_stats vcpu_info[1];
> > +};
> > +
> > +typedef struct shared_vcpustatspage shared_vcpustatspage_t;
> 
> For new additions please avoid further name space issues: All types
> and alike want to be prefixed by "xen_".
>

Should I name it "xen_shared_vcpustatspage_t" instead?

Matias 



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.