Re: [RFC PATCH 1/2] xen/memory: Add stats_table resource type
On Fri, Jun 17, 2022 at 08:54:34PM +0000, George Dunlap wrote:
> Preface: It looks like this may be one of your first hypervisor patches. So
> thank you! FYI there’s a lot that needs fixing up here; please don’t read a
> tone of annoyance into it, I’m just trying to tell you what needs fixing in
> the most efficient manner. :-)
>
Thanks for the comments :) I have already addressed some of them in my
response to the cover letter.
> > On 17 May 2022, at 15:33, Matias Ezequiel Vara Larsen
> > <matiasevara@xxxxxxxxx> wrote:
> >
> > Allow to map vcpu stats using acquire_resource().
>
> This needs a lot more expansion in terms of what it is that you’re doing in
> this patch and why.
>
Got it. I'll improve the commit message in v1.
> >
> > Signed-off-by: Matias Ezequiel Vara Larsen <matias.vara@xxxxxxxx>
> > ---
> > xen/common/domain.c | 42 +++++++++++++++++++++++++++++++++++++
> > xen/common/memory.c | 29 +++++++++++++++++++++++++
> > xen/common/sched/core.c | 5 +++++
> > xen/include/public/memory.h | 1 +
> > xen/include/xen/sched.h | 5 +++++
> > 5 files changed, 82 insertions(+)
> >
> > diff --git a/xen/common/domain.c b/xen/common/domain.c
> > index 17cc32fde3..ddd9f88874 100644
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -132,6 +132,42 @@ static void vcpu_info_reset(struct vcpu *v)
> > v->vcpu_info_mfn = INVALID_MFN;
> > }
> >
> > +static void stats_free_buffer(struct vcpu * v)
> > +{
> > + struct page_info *pg = v->stats.pg;
> > +
> > + if ( !pg )
> > + return;
> > +
> > + v->stats.va = NULL;
> > +
> > + if ( v->stats.va )
> > + unmap_domain_page_global(v->stats.va);
> > +
> > + v->stats.va = NULL;
>
> Looks like you meant to delete the first `va->stats.va = NULL` but forgot?
>
Apologies for this, I completely missed it.
> > +
> > + free_domheap_page(pg);
>
> Pretty sure this will crash.
>
> Unfortunately page allocation and freeing is somewhat complicated and
> requires a bit of boilerplate. You can look at the vmtrace_alloc_buffer()
> code for a template, but the general sequence you want is as follows:
>
> * On the allocate side:
>
> 1. Allocate the page
>
> pg = alloc_domheap_page(d, MEMF_no_refcount);
>
> This will allocate a page with the PGC_allocated bit set and a single
> reference count. (If you pass a page with PGC_allocated set to
> free_domheap_page(), it will crash; which is why I said the above.)
>
> 2. Grab a general reference count for the vcpu struct’s reference to it; as
> well as a writable type count, so that it doesn’t get used as a special page
>
> if (get_page_and_type(pg, d, PGT_writable_page)) {
> put_page_alloc_ref(p);
> /* failure path */
> }
>
> * On the free side, don’t call free_domheap_pages() directly. Rather, drop
> the allocation, then drop your own type count, thus:
>
>     v->stats.va = NULL;
>
> put_page_alloc_ref(pg);
> put_page_and_type(pg);
>
> The issue here is that we can’t free the page until all references have
> dropped; and the whole point of this exercise is to allow guest userspace in
> dom0 to gain a reference to the page. We can’t actually free the page until
> *all* references are gone, including the userspace one in dom0. The
> put_page() which brings the reference count to 0 will automatically free the
> page.
>
Thanks for the explanation. For some reason, this is not crashing for me. I
will reimplement the allocation and freeing, and then update the
documentation that I proposed at
https://lists.xenproject.org/archives/html/xen-devel/2022-05/msg01963.html.
The idea of that reference document is to guide someone who would like to
export a new resource by relying on the acquire-resource interface.
>
> > +}
> > +
> > +static int stats_alloc_buffer(struct vcpu *v)
> > +{
> > + struct domain *d = v->domain;
> > + struct page_info *pg;
> > +
> > + pg = alloc_domheap_page(d, MEMF_no_refcount);
> > +
> > + if ( !pg )
> > + return -ENOMEM;
> > +
> > + v->stats.va = __map_domain_page_global(pg);
> > + if ( !v->stats.va )
> > + return -ENOMEM;
> > +
> > + v->stats.pg = pg;
> > + clear_page(v->stats.va);
> > + return 0;
> > +}
>
> The other thing to say about this is that the memory is being allocated
> unconditionally, even if nobody is planning to read it. The vast majority of
> Xen users will not be running xcp-rrd, so it will be pointless overhead.
>
> At a basic level, you want to follow suit with the vmtrace buffers, which are
> only allocated if the proper domctl flag is set on domain creation. (We
> could consider instead, or in addition, having a global Xen command-line
> parameter which would enable this feature for all domains.)
>
I agree. I will add a domctl flag on domain creation to enable the allocation of
these buffers.
> > +
> > static void vmtrace_free_buffer(struct vcpu *v)
> > {
> > const struct domain *d = v->domain;
> > @@ -203,6 +239,9 @@ static int vmtrace_alloc_buffer(struct vcpu *v)
> > */
> > static int vcpu_teardown(struct vcpu *v)
> > {
> > +
> > + stats_free_buffer(v);
> > +
> > vmtrace_free_buffer(v);
> >
> > return 0;
> > @@ -269,6 +308,9 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int
> > vcpu_id)
> > if ( vmtrace_alloc_buffer(v) != 0 )
> > goto fail_wq;
> >
> > + if ( stats_alloc_buffer(v) != 0 )
> > + goto fail_wq;
> > +
> > if ( arch_vcpu_create(v) != 0 )
> > goto fail_sched;
> >
> > diff --git a/xen/common/memory.c b/xen/common/memory.c
> > index 297b98a562..39de6d9d05 100644
> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -1099,6 +1099,10 @@ static unsigned int resource_max_frames(const struct
> > domain *d,
> > case XENMEM_resource_vmtrace_buf:
> > return d->vmtrace_size >> PAGE_SHIFT;
> >
> > + // WIP: to figure out the correct size of the resource
> > + case XENMEM_resource_stats_table:
> > + return 1;
> > +
> > default:
> > return -EOPNOTSUPP;
> > }
> > @@ -1162,6 +1166,28 @@ static int acquire_vmtrace_buf(
> > return nr_frames;
> > }
> >
> > +static int acquire_stats_table(struct domain *d,
> > + unsigned int id,
> > + unsigned int frame,
> > + unsigned int nr_frames,
> > + xen_pfn_t mfn_list[])
> > +{
> > + const struct vcpu *v = domain_vcpu(d, id);
> > + mfn_t mfn;
> > +
>
> Maybe I’m paranoid, but I might add an “ASSERT(nr_frames == 1)” here
>
Thanks, I will have this in mind in v1.
> > + if ( !v )
> > + return -ENOENT;
> > +
> > + if ( !v->stats.pg )
> > + return -EINVAL;
> > +
> > + mfn = page_to_mfn(v->stats.pg);
> > + mfn_list[0] = mfn_x(mfn);
> > +
> > + printk("acquire_perf_table: id: %d, nr_frames: %d, %p, domainid:
> > %d\n", id, nr_frames, v->stats.pg, d->domain_id);
> > + return 1;
> > +}
> > +
> > /*
> > * Returns -errno on error, or positive in the range [1, nr_frames] on
> > * success. Returning less than nr_frames constitutes a request for a
> > @@ -1182,6 +1208,9 @@ static int _acquire_resource(
> > case XENMEM_resource_vmtrace_buf:
> > return acquire_vmtrace_buf(d, id, frame, nr_frames, mfn_list);
> >
> > + case XENMEM_resource_stats_table:
> > + return acquire_stats_table(d, id, frame, nr_frames, mfn_list);
> > +
> > default:
> > return -EOPNOTSUPP;
> > }
> > diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> > index 8f4b1ca10d..2a8b534977 100644
> > --- a/xen/common/sched/core.c
> > +++ b/xen/common/sched/core.c
> > @@ -264,6 +264,7 @@ static inline void vcpu_runstate_change(
> > {
> > s_time_t delta;
> > struct sched_unit *unit = v->sched_unit;
> > + uint64_t * runstate;
> >
> > ASSERT(spin_is_locked(get_sched_res(v->processor)->schedule_lock));
> > if ( v->runstate.state == new_state )
> > @@ -287,6 +288,10 @@ static inline void vcpu_runstate_change(
> > }
> >
> > v->runstate.state = new_state;
> > +
> > + // WIP: use a different interface
> > + runstate = (uint64_t*)v->stats.va;
>
> I think you should look at
> xen.git/xen/include/public/hvm/ioreq.h:shared_iopage_t for inspiration.
> Basically, you cast the void pointer to that type, and then just access
> `iopage->vcpu_ioreq[vcpuid]`. Put it in a public header, and then both the
> userspace consumer and Xen can use the same structure.
>
Thanks for pointing it out. I've addressed this comment in the response to
the cover letter.
> As I said in my response to the cover letter, I think vcpu_runstate_info_t
> would be something to look at and gain inspiration from.
>
> The other thing to say here is that this is a hot path; we don’t want to be
> copying lots of information around if it’s not going to be used. So you
> should only allocate the buffers if specifically enabled, and you should only
> copy the information over if v->stats.va != NULL.
>
> I think that should be enough to start with; we can nail down more when you
> send v1.
>
Thanks for the comments, I will tackle them in v1.
Matias
> Peace,
> -George
>