
[Xen-devel] Analysis of using balloon page compaction in Xen balloon driver

This document analyses the impact of using balloon compaction
infrastructure in Xen balloon driver.

## Motives

1. Balloon pages fragment the guest physical address space.
2. The balloon compaction infrastructure can migrate ballooned pages from
   the start of a zone to the end of the zone, creating contiguous guest
   physical address space.
3. Having contiguous guest physical addresses enables further options to
   improve performance.

## Benefit for auto-translated guest

HVM/PVH/ARM guests can have contiguous guest physical address space
after balloon pages are compacted, which potentially improves memory
performance provided the guest makes use of huge pages, either via
hugetlbfs or Transparent Huge Pages (THP).

Consider the memory access pattern of these guests: one access to a
guest physical address involves several accesses to machine memory.
The total number of memory accesses can be represented as:

> X = H1 * G1 + H2 * G2 + ... + Hn * Gn + 1

where Hx denotes the number of second stage page table walk levels and
Gx denotes the number of guest page table walk levels.

By having contiguous guest physical addresses, the guest can make use
of huge pages. This reduces the number of G terms in the formula.
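As a hedged illustration of the formula (one possible reading, with
concrete level counts that are assumptions, not from the original: a
4-level guest walk where each guest level pays a full 4-level second
stage walk):

```python
# Toy model of the access-count formula X = H1*G1 + ... + Hn*Gn + 1.
# Each (h, g) pair is one guest page table level: h second-stage
# accesses weighted by g guest-level accesses; the final +1 is the
# access to the data itself. Numbers are illustrative only.
def total_accesses(levels):
    return sum(h * g for h, g in levels) + 1

# Assumed: 4-level guest walk, each level needing a 4-level
# second-stage walk.
four_level = [(4, 1)] * 4
print(total_accesses(four_level))   # 17 machine accesses

# A 2 MB guest huge page removes one guest walk level.
with_huge_page = [(4, 1)] * 3
print(total_accesses(with_huge_page))  # 13 machine accesses
```

Under this toy reading, dropping one guest level via a huge page cuts
the walk cost noticeably, which is the effect the formula is meant to
capture.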

Reducing the number of H terms is a separate hypervisor-side project
and should be decoupled from the Linux-side changes.

## Design and implementation

The use of balloon compaction doesn't require introducing new
interfaces between the Xen balloon driver and the rest of the system.
Most changes are internal to the Xen balloon driver.

Currently, the Xen balloon driver gets its pages directly from the
page allocator. To enable balloon page migration, those pages now need
to be allocated from the core balloon driver. Pages allocated from the
core balloon driver are subject to balloon page compaction.

The Xen balloon driver will also need to provide a callback to migrate
balloon pages. In essence, the callback function receives an "old
page", which is an already ballooned out page, and a "new page", which
is a page to be ballooned out; it then inflates the "old page" and
deflates the "new page".

The core of the migration callback is the XENMEM\_exchange hypercall.
This makes sure that inflation of the old page and deflation of the
new page are done atomically, so even if a domain is beyond its memory
target and the target is being enforced, it can still compact memory.

## HAP table fragmentation is not made worse

*Assumption*: guest physical address space is already heavily
fragmented by balloon pages when balloon page compaction is required.

For a typical test case like ballooning up and down while doing a
kernel compilation, there are usually only a handful of huge pages
left in the end. So the observation matches the assumption. On the
other hand, if the guest physical address space is not heavily
fragmented, balloon page compaction is unlikely to be triggered
automatically.

In practice, balloon page compaction is not likely to make things
worse. Here is the analysis based on the above assumption.

Note that the HAP table is already shattered by balloon pages. When a
guest page is ballooned out, the underlying HAP entry needs to be
split if that entry pointed to a huge page.
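To make the shattering concrete, here is a small toy model (the data
structures and sizes are illustrative only, not actual Xen code):
ballooning out one 4K page that sits inside a 2 MB HAP superpage
forces that superpage entry to be split into 512 individual 4K
entries.

```python
# Toy HAP model: each entry maps a base gfn to (mfn, page_count).
# Illustrative structures only, not actual Xen internals.
INVALID_MFN = -1
PAGES_PER_2M = 512  # 2 MB / 4 KB

def balloon_out(hap, gfn):
    """Balloon out one 4K gfn, splitting a superpage entry if needed."""
    base, (mfn, count) = next(
        (b, e) for b, e in hap.items() if b <= gfn < b + e[1])
    if count > 1:                  # superpage: shatter into 4K entries
        del hap[base]
        for i in range(count):
            hap[base + i] = (mfn + i, 1)
    hap[gfn] = (INVALID_MFN, 1)    # the ballooned page maps nothing

hap = {0: (0x1000, PAGES_PER_2M)}  # one 2 MB superpage entry
balloon_out(hap, 5)
print(len(hap))                    # one entry became 512: shattered
```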

XENMEM\_exchange works as follows, where "old page" is the guest page
about to get inflated and "new page" is the guest page about to get
deflated:

1. Steal the old page from the domain.
2. Allocate a heap page from domheap.
3. Release the new page back to Xen.
4. Update the guest physmap: the old page points to the heap page, the
   new page points to INVALID\_MFN.
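The steps above can be sketched with a toy physmap model (purely
illustrative names and structures, not actual Xen internals or
hypercall semantics beyond the listed steps):

```python
# Toy model of the XENMEM_exchange steps. A physmap maps gfn -> mfn,
# with INVALID_MFN marking a ballooned-out (unbacked) guest frame.
INVALID_MFN = -1

def xenmem_exchange(physmap, domheap_free, old_gfn, new_gfn):
    """old_gfn: ballooned-out gfn to repopulate; new_gfn: gfn to balloon out."""
    # Step 1 (stealing the old page) is a no-op in this toy model,
    # since the ballooned-out old gfn has no backing frame here.
    heap_mfn = domheap_free.pop()          # 2. allocate a heap page from domheap
    domheap_free.append(physmap[new_gfn])  # 3. release the new page back to Xen
    physmap[old_gfn] = heap_mfn            # 4. old page -> heap page,
    physmap[new_gfn] = INVALID_MFN         #    new page -> INVALID_MFN

physmap = {0: INVALID_MFN, 1: 0x2000}      # gfn 0 ballooned out, gfn 1 backed
free_list = [0x3000]
backed_before = sum(m != INVALID_MFN for m in physmap.values())
xenmem_exchange(physmap, free_list, old_gfn=0, new_gfn=1)
backed_after = sum(m != INVALID_MFN for m in physmap.values())
print(backed_before == backed_after)       # True: footprint unchanged
```

Because the inflate and the deflate happen in one operation, the
number of backed frames never changes at any point, which is why a
domain at an enforced memory target can still compact.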

The end result is that the HAP entry for the "old page" now points to
a valid MFN instead of INVALID\_MFN; the HAP entry for the "new page"
now points to INVALID\_MFN.

So for the old page we're in the same position as before: the HAP
table is fragmented, but it's no more fragmented than before.

For the new page, the risk is that if the targeted new guest page is
part of a huge page, we need to split a HAP entry, hence fragmenting
the HAP table. This is a valid concern. However, in practice the guest
address space is already fragmented by ballooning. It's unlikely we
need to break up any more huge pages, because there aren't many left.
So we're in a position no worse than before.

Another downside is that when Xen is exchanging a page, it may need to
break up a huge page to get a 4K page, fragmenting the Xen domheap.
However, we're no worse off than before, as ballooning already
fragments the domheap.

## Beyond Linux balloon compaction infrastructure

Currently there's no mechanism in Xen to coalesce HAP table entries.
To coalesce HAP entries we would need to make sure all the discrete
entries belong to one huge page, are in the correct order and in the
correct state.

By introducing the necessary infrastructure inside the hypervisor
(page migration etc.), we might eventually be able to coalesce HAP
entries, hence reducing the number of H terms in the aforementioned
formula. This, combined with the work on the guest side, can help the
guest achieve the best possible performance.
