Re: [Xen-devel] [for-4.9] Re: HVM guest performance regression
On 30/05/17 12:43, Jan Beulich wrote:
>>>> On 30.05.17 at 12:33, <jgross@xxxxxxxx> wrote:
>> On 30/05/17 09:24, Jan Beulich wrote:
>>>>>> On 29.05.17 at 21:05, <jgross@xxxxxxxx> wrote:
>>>> Creating the domains with
>>>>
>>>> xl -vvv create ...
>>>>
>>>> showed the numbers of superpages and normal pages allocated for the
>>>> domain.
>>>>
>>>> The following allocation pattern resulted in a slow domain:
>>>>
>>>> xc: detail: PHYSICAL MEMORY ALLOCATION:
>>>> xc: detail: 4KB PAGES: 0x0000000000000600
>>>> xc: detail: 2MB PAGES: 0x00000000000003f9
>>>> xc: detail: 1GB PAGES: 0x0000000000000000
>>>>
>>>> And this one was fast:
>>>>
>>>> xc: detail: PHYSICAL MEMORY ALLOCATION:
>>>> xc: detail: 4KB PAGES: 0x0000000000000400
>>>> xc: detail: 2MB PAGES: 0x00000000000003fa
>>>> xc: detail: 1GB PAGES: 0x0000000000000000
>>>>
>>>> I ballooned dom0 down in small steps to be able to create those
>>>> test cases.
>>>>
>>>> I believe the main reason is that some data needed by the benchmark
>>>> is located near the end of domain memory resulting in a rather high
>>>> TLB miss rate in case of not all (or nearly all) memory available in
>>>> form of 2MB pages.
>>>
>>> Did you double check this by creating some other (persistent)
>>> process prior to running your benchmark? I find it rather
>>> unlikely that you would consistently see space from the top of
>>> guest RAM allocated to your test, unless it consumes all RAM
>>> that's available at the time it runs (but then I'd consider it
>>> quite likely for overhead of using the few smaller pages to be
>>> mostly hidden in the noise).
>>>
>>> Or are you suspecting some crucial kernel structures to live
>>> there?
>>
>> Yes, I do. When onlining memory at boot time the kernel is using the new
>> memory chunk to add the page structures and if needed new kernel page
>> tables. It is normally allocating that memory from the end of the new
>> chunk.
>
> The page tables are 4k allocations, sure. But the page structures
> surely would be allocated with higher granularity?

I'm really not sure. It might depend on the memory model (sparse,
sparse vmemmap, flat).
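To illustrate the point above about the kernel carving metadata out of the
end of a newly added memory chunk, here is a minimal sketch. It assumes a
top-down boot-time allocator along the lines of memblock's default
behaviour; the names, the chunk size and the 64-bytes-per-page estimate are
invented for illustration, and this is not the actual Linux code.

/*
 * Minimal sketch (not actual Linux code) of a top-down boot-time
 * allocator.  It illustrates why metadata for a newly added memory
 * chunk tends to be carved out of the end of that chunk.  All names
 * and sizes are made up for illustration.
 */
#include <stdint.h>
#include <stdio.h>

struct mem_chunk {
    uint64_t start;   /* first byte of the chunk          */
    uint64_t limit;   /* current top of still-free memory */
};

/* Allocate 'size' bytes aligned to 'align', carving from the top down. */
static uint64_t alloc_top_down(struct mem_chunk *c, uint64_t size,
                               uint64_t align)
{
    uint64_t addr = (c->limit - size) & ~(align - 1);

    if (addr < c->start)
        return 0;                /* out of space */
    c->limit = addr;             /* shrink the free region from the top */
    return addr;
}

int main(void)
{
    /* A hypothetical 2GB chunk being onlined at boot. */
    struct mem_chunk chunk = { .start = 0, .limit = 2ULL << 30 };

    /* struct page array: roughly 64 bytes per 4kB page -> 32MB for 2GB */
    uint64_t memmap = alloc_top_down(&chunk, (2ULL << 30) / 4096 * 64,
                                     1 << 12);
    /* a handful of 4kB page tables for the new mappings */
    uint64_t pgt = alloc_top_down(&chunk, 16 * 4096, 1 << 12);

    printf("memmap at %#llx, page tables at %#llx\n",
           (unsigned long long)memmap, (unsigned long long)pgt);
    printf("free memory in the chunk now ends at %#llx\n",
           (unsigned long long)chunk.limit);
    return 0;
}

With an allocator like this the struct page array and the early page tables
for the chunk occupy its topmost few megabytes, so if the last part of the
domain is the part Xen backed with 4kB pages, those structures sit exactly
in the 4kB-mapped range. (Both allocation patterns quoted above cover the
same 2040MB, by the way: 1536 x 4kB + 1017 x 2MB pages in the slow case
versus 1024 x 4kB + 1018 x 2MB pages in the fast one, so the difference is
only how much of that memory ends up in 4kB mappings.)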
>>>>>> What makes the whole problem even more mysterious is that the
>>>>>> regression was detected first with SLE12 SP3 (guest and dom0, Xen 4.9
>>>>>> and Linux 4.4) against older systems (guest and dom0). While trying
>>>>>> to find out whether the guest or the Xen version are the culprit I
>>>>>> found that the old guest (based on kernel 3.12) showed the mentioned
>>>>>> performance drop with above commit. The new guest (based on kernel
>>>>>> 4.4) shows the same bad performance regardless of the Xen version or
>>>>>> amount of free memory. I haven't found the Linux kernel commit yet
>>>>>> being responsible for that performance drop.
>>>>
>>>> And this might be result of a different memory usage of more recent
>>>> kernels: I suspect the critical data is now at the very end of the
>>>> domain's memory. As there are always some pages allocated in 4kB
>>>> chunks the last pages of the domain will never be part of a 2MB page.
>>>
>>> But if the OS allocated large pages internally for relevant data
>>> structures, those obviously won't come from that necessarily 4k-
>>> mapped tail range.
>>
>> Sure? I think the kernel is using 1GB pages if possible for direct
>> kernel mappings of the physical memory. It doesn't care for the last
>> page mapping some space not populated.
>
> Are you sure? I would very much hope for Linux to not establish
> mappings to addresses where no memory (and no MMIO) resides.
> But I can't tell for sure for recent Linux versions; I do know in the
> old days they were quite careful there.

Looking at phys_pud_init() they are happily using 1GB pages until they
have all memory mapped.


Juergen
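For reference, a heavily reduced sketch of that logic, in the spirit of
phys_pud_init() in arch/x86/mm/init_64.c but not the actual kernel code
(the helper names are invented): every 1GB slot that covers any RAM gets a
full 1GB mapping, including a partially populated tail slot.

/*
 * Simplified sketch of building the direct map with 1GB pages
 * (in the spirit of phys_pud_init(), heavily reduced, not the
 * actual kernel code; helper names are invented).
 */
#include <stdint.h>
#include <stdio.h>

#define PUD_SHIFT 30
#define PUD_SIZE  (1ULL << PUD_SHIFT)

/* Stand-in for writing one PUD entry as a 1GB page mapping. */
static void set_pud_1g(uint64_t phys)
{
    printf("1GB mapping for %#llx-%#llx\n",
           (unsigned long long)phys,
           (unsigned long long)(phys + PUD_SIZE - 1));
}

/* Map [0, end_of_ram) with 1GB pages, rounding the tail up. */
static void map_direct_1g(uint64_t end_of_ram)
{
    for (uint64_t addr = 0; addr < end_of_ram; addr += PUD_SIZE)
        set_pud_1g(addr);   /* tail slot mapped even if not fully populated */
}

int main(void)
{
    /* Hypothetical machine with 6.5GB of RAM: results in 7 x 1GB mappings. */
    map_direct_1g(6ULL * PUD_SIZE + PUD_SIZE / 2);
    return 0;
}

Run against that hypothetical 6.5GB machine this emits seven 1GB ranges,
i.e. the direct map intentionally over-covers the last gigabyte, which
matches the behaviour described above.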