[Xen-devel] Re: [PATCH] xen: core dom0 support

To: Ingo Molnar <mingo@xxxxxxx>
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Mon, 09 Mar 2009 11:06:40 -0700
Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, the arch/x86 maintainers <x86@xxxxxxxxxx>, Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>
Delivery-date: Mon, 09 Mar 2009 11:07:11 -0700
List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Ingo Molnar wrote:

* H. Peter Anvin <hpa@xxxxxxxxx> wrote:
Ingo Molnar wrote:
Since it's the same kernel image i think the only truly reliable method would be to reboot between _different_ kernel images: same instructions but randomly re-align variables both in terms of absolute address and in terms of relative position to each other. Plus randomize bootmem allocs and never-gets-freed-really boot-time allocations.
Really hard to do i think ...
Ouch, yeah.
On the other hand, the numbers made sense to me, so I don't see why there is any reason to distrust them. They show a 5% overhead with pv_ops enabled, reduced to a 2% overhead with the changes. That is more or less what would match my intuition from seeing the code.
Yeah - it was Jeremy who expressed doubt in the numbers, not me.

Mainly because I was seeing the instruction and cycle counts completely unchanged from run to run, which is implausible. They're not zero, so they're clearly measurements of *something*, but not cycles and instructions, since we know that they're changing. So what are they measurements of? And if they're not what they claim, are the other numbers more meaningful?

It's easy to read the numbers as confirmations of preconceived expectations of the outcomes, but that's - as I said - unsatisfying.
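
To make the "unchanged from run to run" concern concrete, here is a minimal sketch of the sanity check I have in mind. It is not part of the test in question: the function names, the threshold and the sample values are all made up for illustration, and it assumes you have already collected one counter reading per run by some other means.

```python
import statistics


def run_to_run_spread(samples):
    """Return (max - min) / mean for a list of per-run counter readings."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to compare")
    mean = statistics.fmean(samples)
    if mean == 0:
        raise ValueError("mean of samples is zero")
    return (max(samples) - min(samples)) / mean


def looks_suspiciously_constant(samples, min_spread=1e-6):
    """Flag readings whose relative spread is below min_spread.

    min_spread is a hypothetical threshold, not a calibrated value.
    """
    return run_to_run_spread(samples) < min_spread


# Illustrative values only, not measurements from this thread.
identical_runs = [1_000_000, 1_000_000, 1_000_000]
varying_runs = [1_000_000, 1_003_000, 998_500]
print(looks_suspiciously_constant(identical_runs))
print(looks_suspiciously_constant(varying_runs))
```

A counter that passes this check is not thereby trustworthy; the point is only that a spread of exactly zero across real runs deserves a second look before the other numbers are read off the same report.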

And we need to eliminate that 2% as well - 2% is still an awful lot of native kernel overhead from a kernel feature that 95%+ of users do not make any use of.


Well, I think there's a few points here:

  1. the test in question is a bit vague about kernel and user
     measurements.  I assume the stuff coming from perfcounters is
     kernel-only state, but the elapsed time includes the usermode
     component, and so will be affected by the usermode page placement
     and cache effects.  If I change the test to copy the test
     executable (statically linked, to avoid libraries), then that
     should at least fuzz out user page placement.
  2. It's true that the cache effects could be due to the precise layout
     of the kernel executable; but if those effects are swamping the
     effects of the changes to improve pvops then it's unclear what the
     point of the exercise is.  Especially since:
  3. It is a config option, so if someone is sensitive to the
     performance hit and it gives them no useful functionality to
     offset it, then it can be disabled.  Distros tend to enable it
     because they tend to value function and flexibility over raw
     performance; they tend to enable things like audit, selinux,
     modules which all have performance hits of a similar scale (of
     course, you could argue that more people get benefit from those
     features to offset their costs).  But,
  4. I think you're underestimating the number of people who get
     benefit from pvops; the Xen userbase is actually pretty large, and
     KVM will use pvops hooks when available to improve Linux-as-guest.
  5. Also, we're looking at a single benchmark with no obvious
     relevance to a real workload.  Perhaps there are workloads which
     continuously mash mmap/munmap/mremap(!), but I think they're
     fairly rare.  Such a benchmark is useful for tuning specific
     areas, but if we're going to evaluate pvops overhead, it would be

nice to use something a bit broader to base our measurements on.Also, what weighting are we going to put on 32 vs 64 bit? Equally

     important?  One more than the other?
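
For reference, the kind of map/unmap churn loop discussed in points 1 and 5 can be sketched roughly as below. This is not the benchmark from the thread; it is a hypothetical stand-in written in Python, so interpreter overhead will dominate the user-time figure, and the function name and default iteration count are my own choices. It reports user and system CPU time separately from wall-clock time, which is the kernel/user split point 1 says the original test is vague about.

```python
import mmap
import os
import time


def mmap_churn(iterations=10000, length=mmap.PAGESIZE):
    """Repeatedly create, touch and destroy an anonymous mapping.

    Returns wall-clock time plus the user/system CPU time split
    reported by os.times() for the calling process.
    """
    start_wall = time.perf_counter()
    start_cpu = os.times()
    for _ in range(iterations):
        region = mmap.mmap(-1, length)  # anonymous mapping
        region[0] = 1                   # touch the first page
        region.close()                  # release the mapping
    end_cpu = os.times()
    end_wall = time.perf_counter()
    return {
        "iterations": iterations,
        "wall": end_wall - start_wall,
        "user": end_cpu.user - start_cpu.user,
        "system": end_cpu.system - start_cpu.system,
    }


if __name__ == "__main__":
    print(mmap_churn())
```

Comparing the "system" figure across kernels is closer to what we care about for pvops than the elapsed time, but it is still a single narrow workload, which is the objection in point 5.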

All that said, I would like to get the pvops overhead down to unmeasurable - the ideal would be to be able to justify removing the config option altogether and leaving it always enabled.

The tradeoff, as always, is how much other complexity are we willing to stand to get there? The addition of a new calling convention is already fairly esoteric, but so far it has got us a 60% reduction in overhead (in this test). But going further is going to get more complex.
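
The 60% figure is just the relative change between the two overhead numbers quoted earlier, assuming those are the 5% (pv_ops enabled) and 2% (with the changes) values from this test. A small sketch of the arithmetic, with a hypothetical helper name:

```python
def relative_reduction(before_pct, after_pct):
    """Fraction by which an overhead shrank, relative to its old value."""
    if before_pct <= 0:
        raise ValueError("before_pct must be positive")
    return (before_pct - after_pct) / before_pct


# 5% overhead before, 2% after: (5 - 2) / 5
print(f"{relative_reduction(5.0, 2.0):.0%}")
```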

For example, the next step would be to attack set_pte (including set_pte_*, pte_clear, etc), to make them use the new calling convention, and possibly make them inlineable (ie, to get it as close as possible to the non-pvops case). But that will require them to be implemented in asm (to guarantee that they only use the registers they're allowed to use), and we already have 3 variants of each for the different pagetable modes. All completely doable, and not even very hard, but it will be just one more thing to maintain - we just need to be sure the payoff is worth it.


   J

