[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v10 09/20] x86/VPMU: Add public xenpmu.h




On 09/15/2014 07:56 AM, Konrad Rzeszutek Wilk wrote:
On Fri, Sep 12, 2014 at 11:18:48AM -0400, Boris Ostrovsky wrote:
On 09/12/2014 10:38 AM, Jan Beulich wrote:
On 12.09.14 at 16:21, <boris.ostrovsky@xxxxxxxxxx> wrote:
On 09/12/2014 02:50 AM, Jan Beulich wrote:
Okay, so you're tailoring the hypervisor interface to Linux. That's
not what we generally aim for, and hence I continue to think this
isn't the right approach.
But we know that at least one consumer of this interface (Linux/perf)
does want to have full GPR set.
You said they use a struct pt_regs somewhere - that's not the same
as struct cpu_user_regs (and the overlap of registers being in both
is not much more than a coincidence), and you didn't clarify yet how
much of this information is _actually_ getting consumed (i.e. this may
well just happen to be a suitable container for conveying a smaller
set).
I don't know how all registers are consumed in Linux code (except for RIP
and CS) and I am not sure it's all that important: we know that we need to
pass GPRs to perf infrastructure which, for the most part, is a black box
(especially given the insane amount of code churn that happens in perf).

If and when we find out that something that perf wants is not provided in
pt_regs (converted from cpu_user_regs) then we will have to deal with it.
But it's not much different from what would happen if we use specific
registers in the shared structure.

So why not provide it? Especially since
(I suspect that) doing memcpy may be faster than copying only selected
fields.

If you are categorically against this I can certainly rework this and
pass RIP, RSP, CS and EFLAGS only.
My problem with this is that (a) it exposes an arbitrary subset of
registers and (b) in an incompatible way (a 32-bit profiling domain
should very well be told all 64-bit registers of a 64-bit guest it
profiles). Since b would need addressing by a custom structure
anyway, dealing with a at the same time seems reasonable to me.
Note that (b) is not supported --- 32-bit domains can only profile
themselves and not the hypervisor (or other domains).
And that is due to the tools limitation of not being able to understand
64-bit values right?.

Right. Specifically, in perf case it will be very surprised to see 64-bit samples (in fact, it won't see them as 64-bit since registers that it sees are 32-bit).



But of course it would help if other interested parties would voice
their view of this...
My thinking (with the release manager hat) is thishat:

  1). Adding in the support for 32-bit consumer is nice but won't be
      used that often (64-bit consumer with 32-bit 'producer' - as in
      32-bit guests - will surely be present).

Profiling hypervisor from 32-bit dom0 is not planned at all. This will require new toolset.


  2). Having the right ABI for these perf counters is hard across
      two set of vendors (and what if we want to expand this for ARM?).

Intel/AMD, I think/hope, is taken care of.

ARM is now stubbed out, most of the code is in x86-specific directory.

-boris



  3). The one 'generic' profiling tool that worked across all OSes
      was 'oprofile' which has hit the dustpan. We still support
      it with the oprofile hypercalls. Doing something generic
      might not be the best way going forward.

  4). The ABI should have means to look in the future in case things
      change (and then there is a method to provide the version information
      in case the ABI alters). Future version of the Intel
      or AMD performance counters architecture might expand the
      amount of registers they expose. Or they might not.
      Either way, a new minor rev would have to be increased to
      benefit from this.

  5). Exposing only CS, RIP, and EFLAGS seems to be tailoring the
      interface for Linux perf use case.

  6). Exposing all of the GPRs, while not needed for the current
      set of Linux patches, could allow other tools to use this
      (Intel's tracing tool for example - which uses the perf
      ABI). But surely there are tools on FreeBSD, NetBSD that
      do profiling.


If I understand Jan's concerns correctly, the question is -
if we are not using all of the GPRs right now in the tools
why even bother expanding it? And if the new version of
the architecture does expose it -  and we have the support
- then we can expand the page structure? (and of course
rev the major).

While Boris's view is that the condition above will happen -
we will expand the registers/use case - so why do the
intermediate step of expanding the page structure - when we can
make the structure bigger now (and rev the minor).

Is that about right?


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.