
Re: [Xen-devel] [for-4.9] Re: HVM guest performance regression

On Fri, 26 May 2017, Juergen Gross wrote:
> On 26/05/17 18:19, Ian Jackson wrote:
> > Juergen Gross writes ("HVM guest performance regression"):
> >> Looking for the reason of a performance regression of HVM guests under
> >> Xen 4.7 against 4.5 I found the reason to be commit
> >> c26f92b8fce3c9df17f7ef035b54d97cbe931c7a ("libxl: remove freemem_slack")
> >> in Xen 4.6.
> >>
> >> The problem occurred when dom0 had to be ballooned down while
> >> starting the guest. The performance of some micro benchmarks dropped
> >> by about a factor of 2 with the above commit.
> >>
> >> An interesting point is that the performance of the guest depends on
> >> the amount of free memory available at guest creation time. When
> >> there is barely enough memory available to start the guest, the
> >> performance remains low even if memory is freed later.
> >>
> >> I'd like to suggest we either revert the commit or have some other
> >> mechanism to try to have some reserve free memory when starting a
> >> domain.
> > 
> > Oh, dear.  The memory accounting swamp again.  Clearly we are not
> > going to drain that swamp now, but I don't like regressions.
> > 
> > I am not opposed to reverting that commit.  I was a bit iffy about it
> > at the time; and according to the removal commit message, it was
> > basically removed because it was a piece of cargo cult for which we
> > had no justification in any of our records.
> > 
> > Indeed I think fixing this is a candidate for 4.9.
> > 
> > Do you know the mechanism by which the freemem slack helps ?  I think
> > that would be a prerequisite for reverting this.  That way we can have
> > an understanding of why we are doing things, rather than just
> > flailing at random...
> I wish I understood it.
> One candidate would be 2M/1G pages being possible with enough free
> memory, but I haven't proven this yet. I can have a try by disabling
> big pages in the hypervisor.
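
For reference, a minimal sketch of how that test could be set up, assuming a GRUB2-based dom0 and the `hap_1gb`/`hap_2mb` Xen command-line options (which control whether 1G/2M p2m mappings are used for HAP guests; adjust the file layout to your distro):

```shell
# /etc/default/grub -- sketch only.
# hap_1gb=0 / hap_2mb=0 ask Xen not to use 1G/2M p2m mappings for HAP
# guests, forcing 4K mappings so the superpage effect can be isolated.
GRUB_CMDLINE_XEN_DEFAULT="$GRUB_CMDLINE_XEN_DEFAULT hap_1gb=0 hap_2mb=0"
```

After regenerating the GRUB configuration and rebooting, if the previously fast case slows down to match the regressed one, superpages would be the likely mechanism.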

Right, if I had to bet, I would put my money on superpage shattering
being the cause of the problem.
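
The usual explanation, sketched here with an illustrative, made-up TLB size rather than numbers from this report, is TLB reach: once a 2M mapping is shattered into 4K entries, the same number of TLB entries covers far less guest memory, so TLB miss rates rise sharply:

```shell
# Back-of-the-envelope TLB reach for a hypothetical 1536-entry TLB:
echo "4K pages: $(( 1536 * 4 / 1024 )) MiB covered"   # 6 MiB
echo "2M pages: $(( 1536 * 2 )) MiB covered"          # 3072 MiB
```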

> What makes the whole problem even more mysterious is that the
> regression was first detected with SLE12 SP3 (guest and dom0, Xen 4.9
> and Linux 4.4) against older systems (guest and dom0). While trying
> to find out whether the guest or the Xen version is the culprit I
> found that the old guest (based on kernel 3.12) showed the mentioned
> performance drop with the above commit. The new guest (based on kernel
> 4.4) shows the same bad performance regardless of the Xen version or
> the amount of free memory. I haven't yet found the Linux kernel commit
> responsible for that performance drop.

Xen-devel mailing list


