
Re: [Xen-devel] [for-4.9] Re: HVM guest performance regression



On Tue, 6 Jun 2017, Juergen Gross wrote:
> On 06/06/17 18:39, Stefano Stabellini wrote:
> > On Tue, 6 Jun 2017, Juergen Gross wrote:
> >> On 26/05/17 21:01, Stefano Stabellini wrote:
> >>> On Fri, 26 May 2017, Juergen Gross wrote:
> >>>> On 26/05/17 18:19, Ian Jackson wrote:
> >>>>> Juergen Gross writes ("HVM guest performance regression"):
> >>>>>> While looking into a performance regression of HVM guests under
> >>>>>> Xen 4.7 compared to 4.5, I found the cause to be commit
> >>>>>> c26f92b8fce3c9df17f7ef035b54d97cbe931c7a ("libxl: remove
> >>>>>> freemem_slack") in Xen 4.6.
> >>>>>>
> >>>>>> The problem occurred when dom0 had to be ballooned down when starting
> >>>>>> the guest. The performance of some micro benchmarks dropped by about
> >>>>>> a factor of 2 with the above commit.
> >>>>>>
> >>>>>> An interesting point is that the performance of the guest depends on
> >>>>>> the amount of free memory available at guest creation time. When
> >>>>>> there was barely enough memory available to start the guest, the
> >>>>>> performance remains low even if memory is freed later.
> >>>>>>
> >>>>>> I'd like to suggest we either revert the commit or add some other
> >>>>>> mechanism to keep some free memory in reserve when starting a
> >>>>>> domain.
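
For illustration, the behaviour being discussed amounts to keeping some
headroom on top of the new guest's allocation when ballooning dom0 down,
instead of stopping as soon as the bare minimum is free. A minimal sketch of
that idea in C follows; the helpers get_free_memory_kb(), shrink_dom0_by_kb()
and free_memory_for_guest() are hypothetical stand-ins, not the libxl/xl code
that was actually removed.

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real libxl/xl plumbing. */
uint64_t get_free_memory_kb(void);        /* currently free host memory */
bool shrink_dom0_by_kb(uint64_t kb);      /* balloon dom0 down by kb */

/*
 * Sketch of the "freemem slack" idea: when making room for a new guest,
 * balloon dom0 until the guest's allocation *plus* a reserve is free.
 */
bool free_memory_for_guest(uint64_t guest_need_kb, uint64_t slack_kb)
{
    uint64_t want_kb = guest_need_kb + slack_kb;

    for (int tries = 0; tries < 5; tries++) {
        uint64_t free_kb = get_free_memory_kb();

        if (free_kb >= want_kb)
            return true;                  /* enough headroom */

        if (!shrink_dom0_by_kb(want_kb - free_kb))
            return false;                 /* dom0 cannot shrink further */
    }
    return false;
}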
> >>>>>
> >>>>> Oh, dear.  The memory accounting swamp again.  Clearly we are not
> >>>>> going to drain that swamp now, but I don't like regressions.
> >>>>>
> >>>>> I am not opposed to reverting that commit.  I was a bit iffy about it
> >>>>> at the time; and according to the removal commit message, it was
> >>>>> basically removed because it was a piece of cargo cult for which we
> >>>>> had no justification in any of our records.
> >>>>>
> >>>>> Indeed I think fixing this is a candidate for 4.9.
> >>>>>
> >>>>> Do you know the mechanism by which the freemem slack helps ?  I think
> >>>>> that would be a prerequisite for reverting this.  That way we can have
> >>>>> an understanding of why we are doing things, rather than just
> >>>>> flailing at random...
> >>>>
> >>>> I wish I understood it.
> >>>>
> >>>> One candidate would be 2M/1G pages only being possible with enough free
> >>>> memory, but I haven't proven this yet. I can give it a try by disabling
> >>>> big pages in the hypervisor.
> >>>
> >>> Right, if I had to bet, I would put my money on superpage shattering
> >>> being the cause of the problem.
> >>
> >> Seems you would have lost your money...
> >>
> >> Meanwhile I've found a way to get the "good" performance in the micro
> >> benchmark. Unfortunately this requires switching off the PV interfaces
> >> in the HVM guest via the "xen_nopv" kernel boot parameter.
> >>
> >> I have verified that PV spinlocks are not to blame (via the "xen_nopvspin"
> >> kernel boot parameter). Switching the clocksource to TSC in the running
> >> system doesn't help either.
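
As a side note for anyone reproducing this: one quick way to confirm from
inside the guest which of these switches actually took effect is to look at
the running kernel's command line. The small program below is purely
illustrative (cmdline_has() is just a local helper); it only assumes
/proc/cmdline and the two parameter names mentioned above.

#include <stdio.h>
#include <string.h>

/* Return 1 if 'param' appears as a whole word on the kernel command line. */
static int cmdline_has(const char *cmdline, const char *param)
{
    size_t len = strlen(param);
    const char *p = cmdline;

    while ((p = strstr(p, param)) != NULL) {
        int start_ok = (p == cmdline) || (p[-1] == ' ');
        int end_ok   = (p[len] == '\0' || p[len] == ' ' ||
                        p[len] == '\n' || p[len] == '=');
        if (start_ok && end_ok)
            return 1;
        p += len;
    }
    return 0;
}

int main(void)
{
    char cmdline[4096] = "";
    FILE *f = fopen("/proc/cmdline", "r");

    if (!f || !fgets(cmdline, sizeof(cmdline), f)) {
        perror("/proc/cmdline");
        return 1;
    }
    fclose(f);

    printf("xen_nopv:     %s\n", cmdline_has(cmdline, "xen_nopv") ? "set" : "not set");
    printf("xen_nopvspin: %s\n", cmdline_has(cmdline, "xen_nopvspin") ? "set" : "not set");
    return 0;
}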
> > 
> > What about xen_hvm_exit_mmap (an optimization for shadow pagetables) and
> > xen_hvm_smp_init (PV IPI)?
> 
> xen_hvm_exit_mmap isn't active (a kernel message telling me so was
> issued).
> 
> >> Unfortunately the kernel no longer seems to be functional when I try to
> >> tweak it not to use the PVHVM enhancements.
> > 
> > I guess you are not talking about regular PV drivers like netfront and
> > blkfront, right?
> 
> The plan was to be able to use PV drivers without having to use PV
> callbacks and PV timers. This isn't possible right now.

I think the code to handle that scenario was gradually removed over time
to simplify the code base.


> >> I'm wondering now whether
> >> there have ever been any benchmarks to prove that PVHVM really is faster
> >> than non-PVHVM. My findings seem to suggest there might be a huge
> >> performance gap with PVHVM. OTOH this might depend on hardware and other
> >> factors.
> >>
> >> Stefano, didn't you do the PVHVM stuff back in 2010? Do you have any
> >> data from then regarding performance figures?
> > 
> > Yes, I still have these slides:
> > 
> > https://www.slideshare.net/xen_com_mgr/linux-pv-on-hvm
> 
> Thanks. So you measured the overall package, not the individual items like
> callbacks, timers, and the time source? I'm asking because I'm starting to
> believe that some of those are slower than their non-PV variants.

There isn't much left in terms of individual optimizations: you already
tried switching the clocksource and removing PV spinlocks. xen_hvm_exit_mmap
is not used. Only the following are left (you might want to double-check
that I haven't missed anything):

1) PV IPI
2) PV suspend/resume
3) vector callback
4) interrupt remapping

2) is not on the hot path.
I did individual measurements of 3) at some point and it was a clear win.
Slide 14 shows the individual measurements of 4).

Only 1) is left to check as far as I can tell.
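
If it helps with checking 1), a cheap way to stress the cross-vCPU wakeup
path from inside the guest is a pipe ping-pong between two threads pinned to
different vCPUs (similar in spirit to "perf bench sched pipe"). The sketch
below is purely illustrative and not Xen-specific; running it with and
without xen_nopv might show whether IPI delivery is where the time goes.
Build with something like gcc -O2 -pthread.

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define ROUNDS 200000

static int ping[2], pong[2];             /* two pipes for the round trip */

static void pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (pthread_setaffinity_np(pthread_self(), sizeof(set), &set))
        fprintf(stderr, "warning: could not pin to CPU %d\n", cpu);
}

static void *partner(void *arg)
{
    char c;
    (void)arg;
    pin_to_cpu(1);                       /* second vCPU */
    for (int i = 0; i < ROUNDS; i++) {
        if (read(ping[0], &c, 1) != 1 || write(pong[1], &c, 1) != 1)
            break;
    }
    return NULL;
}

int main(void)
{
    pthread_t t;
    char c = 'x';
    struct timespec t0, t1;

    if (pipe(ping) || pipe(pong)) {
        perror("pipe");
        return 1;
    }

    pin_to_cpu(0);                       /* first vCPU */
    pthread_create(&t, NULL, partner, NULL);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ROUNDS; i++) {
        /* Waking the blocked reader on the other vCPU normally involves
         * a reschedule IPI (PV or emulated, depending on the setup). */
        if (write(ping[1], &c, 1) != 1 || read(pong[0], &c, 1) != 1)
            break;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    pthread_join(t, NULL);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%d round trips in %.3f s (%.2f us each)\n",
           ROUNDS, secs, secs * 1e6 / ROUNDS);
    return 0;
}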

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

