
Re: [RFC PATCH v2 1/2] xen/memory : Add a stats_table resource type


  • To: Matias Ezequiel Vara Larsen <matiasevara@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 14 Dec 2022 08:29:53 +0100
  • Cc: Matias Ezequiel Vara Larsen <matias.vara@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 14 Dec 2022 07:30:15 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 07.10.2022 14:39, Matias Ezequiel Vara Larsen wrote:
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -1078,6 +1078,12 @@ unsigned int ioreq_server_max_frames(const struct domain *d)
>       return nr;
>  }
>  
> +unsigned int stats_table_max_frames(const struct domain *d)
> +{
> +    /* One frame per 512 vcpus. */
> +    return 1;
> +}

Beyond my earlier comment (and irrespective of this needing changing
anyway): I guess this "512" was not updated to match the now larger
size of struct vcpu_stats?

> +static int stats_vcpu_alloc_mfn(struct domain *d)
> +{
> +    struct page_info *pg;
> +
> +    pg = alloc_domheap_page(d, MEMF_no_refcount);

The ioreq and vmtrace resources are also allocated this way, but they're
HVM-specific. Since the one here is supposed to be VM-type independent,
I'm afraid such pages will be accessible by an "owning" PV domain (it
would need to guess the MFN, but that's no excuse).

> +    if ( !pg )
> +        return -ENOMEM;
> +
> +    if ( !get_page_and_type(pg, d, PGT_writable_page) ) {
> +        put_page_alloc_ref(pg);
> +        return -ENODATA;
> +    }
> +
> +    d->vcpustats_page.va = __map_domain_page_global(pg);
> +    if ( !d->vcpustats_page.va )
> +        goto fail;
> +
> +    d->vcpustats_page.pg = pg;
> +    clear_page(d->vcpustats_page.va);

Beyond my earlier comment: I think that by the time the surrounding
hypercall returns the page ought to contain valid data. Otherwise I
see no way for the consumer to know from which point on the data is
going to be valid.

> @@ -287,6 +289,20 @@ static inline void vcpu_runstate_change(
>      }
>  
>      v->runstate.state = new_state;
> +
> +    vcpustats_va = (shared_vcpustatspage_t*)d->vcpustats_page.va;
> +    if ( vcpustats_va )
> +    {
> +     vcpustats_va->vcpu_info[v->vcpu_id].version =
> +         version_update_begin(vcpustats_va->vcpu_info[v->vcpu_id].version);
> +        smp_wmb();
> +        memcpy(&vcpustats_va->vcpu_info[v->vcpu_id].runstate_running_time,
> +               &v->runstate.time[RUNSTATE_running],
> +               sizeof(v->runstate.time[RUNSTATE_running]));
> +        smp_wmb();
> +        vcpustats_va->vcpu_info[v->vcpu_id].version =
> +            version_update_end(vcpustats_va->vcpu_info[v->vcpu_id].version);
> +    }

A further aspect to consider here is cache line ping-pong. I think the
per-vCPU elements of the array want to be big enough to not share a
cache line. The interface being generic, this presents some challenge
in determining what the supposed size is to be. However, taking into
account the extensibility question, maybe the route to take is to
simply settle on a power-of-2 value somewhere between x86's and Arm's
cache line sizes and the pretty common page size of 4k, e.g. 512 bytes
or 1k?

> --- a/xen/include/public/vcpu.h
> +++ b/xen/include/public/vcpu.h
> @@ -235,6 +235,22 @@ struct vcpu_register_time_memory_area {
>  typedef struct vcpu_register_time_memory_area vcpu_register_time_memory_area_t;
>  DEFINE_XEN_GUEST_HANDLE(vcpu_register_time_memory_area_t);
>  
> +struct vcpu_stats{
> +    /* If the least-significant bit of the version number is set then an update
> +     * is in progress and the guest must wait to read a consistent set of values
> +     * This mechanism is similar to Linux's seqlock.
> +     */
> +    uint32_t version;
> +    uint32_t pad0;
> +    uint64_t runstate_running_time;
> +};
> +
> +struct shared_vcpustatspage {
> +    struct vcpu_stats vcpu_info[1];
> +};
> +
> +typedef struct shared_vcpustatspage shared_vcpustatspage_t;

For new additions please avoid further name space issues: All types
and alike want to be prefixed by "xen_".

Jan



 

