[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 1 of 2] xenalyze: decode PV_HYPERCALL_V2 records
On Mon, Oct 1, 2012 at 6:52 PM, David Vrabel <david.vrabel@xxxxxxxxxx> wrote: > Newer version of Xen produce TRC_PV_HYPERCALL_V2 records instead of > the older TRC_PV_HYPERCALL format. This updated format doesn't > included the IP but it does include select hypercall arguments. > > Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx> Thanks -- I did a bunch of optimization work on xenalyze late last year, so I'm afraid I'm going to be pretty picky about some things to avoid losing that effort. :-) > diff --git a/pv.h b/pv.h > new file mode 100644 > --- /dev/null > +++ b/pv.h > @@ -0,0 +1,41 @@ > +/* > + * PV event decoding. > + * > + * Copyright (C) 2012 Citrix Systems R&D Ltd. > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + */ > +#ifndef __PV_H > + > +#include "analyze.h" > +#include "trace.h" > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +#define ARG_MISSING 0x0 > +#define ARG_32BIT 0x1 > +#define ARG_64BIT 0x2 > + > +#define MMU_UPDATE_PREEMPTED (~(~0U>>1)) > + > +static inline uint32_t pv_hypercall_op(const struct record_info *ri) > +{ > + return ri->d[0] & ~TRC_PV_HYPERCALL_V2_ARG_MASK; > +} > + > +static inline int pv_hypercall_arg_present(const struct record_info *ri, int > arg) > +{ > + return (ri->d[0] >> (20 + 2*arg)) & 0x3; > +} Try to avoid an integer multiply here; (arg<<1) or (arg+arg) please. > @@ -6523,6 +6537,168 @@ void pv_summary(struct pv_data *pv) { > } > } > > +uint64_t pv_hypercall_arg(const struct record_info *ri, int arg) > +{ > + int i, word; > + > + for (i = 0, word = 1; i < 6 && word < ri->extra_words; i++) { > + int present = pv_hypercall_arg_present(ri, i); > + > + /* Is this the argument we're looking for? */ > + if (i == arg) { > + switch (present) { > + case ARG_MISSING: > + return 0; > + case ARG_32BIT: > + return ri->d[word]; > + case ARG_64BIT: > + return ((uint64_t)ri->d[word + 1] << 32) | ri->d[word]; > + } > + } > + > + /* Skip over any words for this argument. */ > + word += present; > + } So at the moment, this is only called if you're in dump mode, which is fairly slow already; but it may at some point be called in summary mode, which I'd like to keep pretty fast. This little loop will basically be O(n^2) with the number of recorded arguments. Since we're expecting the hypercall arguments xenalyze asks for to match the arguments put into the trace, it seems like we should be able to do better. Would it make sense to do something like pass in an array of what arguments it expects to get, and then have the function double-check that they match properly, and then replaces the "expected argument number" with "traced value" (returning an error if they don't match)? Alternately, since pv_hypercall_arg() returns 0 for a non-existent arg anyway, couldn't you have an array, initialized to 0, and filled in as appropriate? > +static void pv_hypercall_print_args(const struct record_info *ri) > +{ > + int i, word; > + > + for (i = 0, word = 1; i < 6 && word < ri->extra_words; i++) { > + int present = pv_hypercall_arg_present(ri, i); > + > + switch (present) { > + case ARG_MISSING: > + printf(" ??"); > + break; > + case ARG_32BIT: > + printf(" %08x", ri->d[word]); > + break; > + case ARG_64BIT: > + printf(" %016"PRIu64"", ((uint64_t)ri->d[word + 1] << 32) | > ri->d[word]); > + break; > + } > + > + word += present; > + } > +} > + > +static const char *grant_table_op_cmd_to_str(uint32_t cmd) > +{ > + const char *cmd_str[] = { > + "map_grant_ref", "unmap_grant_ref", "setup_table", "dump_table", > + "transfer", "copy", "query_size", "unmap_and_replace", > + "set_version", "get_status_frames", "get_version", "swap_grant_ref", > + }; > + static char buf[32]; > + > + if (cmd <= 11) > + return cmd_str[cmd]; > + > + snprintf(buf, sizeof(buf), "unknown (%d)", cmd); Just FYI, in my analysis, time spent in libc's vprintf() takes up the vast majority of the time of xenalyze's dump mode -- even just doing things like printing %s (which you would think would be more or less a strcpy). In this case (and the others in this patch) the sprintf() should be the uncommon case, so they're fine. But just a heads-up that I'll be picky about sprintfs() in hot paths. :-) Thanks, -George _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |