Re: [Xen-devel] initial ballooning amount on HVM+PoD
On 01/17/2014 11:03 AM, Jan Beulich wrote:
> On 17.01.14 at 16:54, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx> wrote:
>> On 01/17/2014 09:33 AM, Jan Beulich wrote:
>>> While looking into Jürgen's issue with PoD setup causing soft lockups
>>> in Dom0 I realized that what I did in linux-2.6.18-xen.hg's c/s
>>> 989:a7781c0a3b9a ("xen/balloon: fix balloon driver accounting for
>>> HVM-with-PoD case") just doesn't work - the BUG_ON() added there
>>> triggers as soon as there's a reasonable amount of excess memory.
>>> And that is despite me knowing that I spent significant amounts of
>>> time in testing that change - I must have tested something other
>>> than what finally got checked in, or must have screwed up in some
>>> other way. Extremely embarrassing...
>>>
>>> In the course of finding a proper solution I soon stumbled across
>>> upstream's c275a57f5e ("xen/balloon: Set balloon's initial state to
>>> number of existing RAM pages"), and hence went ahead and compared
>>> three different calculations for initial bs.current_pages:
>>>
>>> (a) upstream's (open coding get_num_physpages(), as I did this on an
>>>     older kernel)
>>> (b) plain old num_physpages (equaling the maximum RAM PFN)
>>> (c) XENMEM_get_pod_target output (with the hypervisor altered to not
>>>     refuse this for a domain doing it on itself)
>>>
>>> The fourth (original) method, using totalram_pages, was already known
>>> to result in the driver not ballooning down enough, and hence setting
>>> up the domain for an eventual crash when the PoD cache runs empty.
>>>
>>> Interestingly, (a) too results in the driver not ballooning down
>>> enough - there's a gap of exactly as many pages as are marked
>>> reserved below the 1Mb boundary. Therefore the aforementioned
>>> upstream commit is presumably broken.
>>>
>>> Short of a reliable (and ideally architecture independent) way of
>>> knowing the necessary adjustment value, the next best solution (not
>>> ballooning down too little, but also not ballooning down much more
>>> than necessary) turns out to be using the minimum of (b) and (c):
>>> When the domain only has memory below 4Gb, (b) is more precise,
>>> whereas in the other cases (c) gets closest.
>>
>> I am not sure I understand why (b) would be the right answer for
>> less-than-4G guests. The reason for the c275a57f5e patch was that
>> max_pfn includes MMIO space (which is not RAM) and thus the driver
>> will unnecessarily balloon down that much memory.
>
> max_pfn/num_physpages isn't that far off for guests with less than
> 4Gb, the number calculated from the PoD data is a little worse.

For a 4G guest it's 65K pages that are ballooned down so it's not
insignificant. And if you are increasing MMIO size (something that we
had to do here) it gets progressively worse.

>>> Question now is: Considering that (a) is broken (and hard to fix)
>>> and (b) is, in presumably a large part of practical cases, leading
>>> to too much ballooning down, shouldn't we open up
>>> XENMEM_get_pod_target for domains to query on themselves?
>>> Alternatively, can anyone see another way to calculate a reasonably
>>> precise value?
>>
>> I think a hypervisor query is a good thing although I don't know
>> whether exposing PoD-specific data (count and entry_count) to the
>> guest is necessary. It's probably OK (or we can set these fields to
>> zero for non-privileged domains).
>
> That's pointless then - if no useful data is provided through the
> call to non-privileged domains, we can as well keep it erroring for
> them.

I thought that you are after d->tot_pages, no?

-boris

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
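
For readers following the numbers in this thread, here is a rough Python sketch of the arithmetic being discussed. It is only a model, not balloon driver code. The helper names, the 256 MiB MMIO hole just below 4 GiB, and the simplified memory layout (RAM displaced by the hole is relocated above 4 GiB) are all assumptions made for illustration. The 4 KiB page size is the usual x86 value.

```python
PAGE_SIZE = 4096   # bytes per page (4 KiB x86 pages)
MIB = 1 << 20
GIB = 1 << 30


def to_pages(nbytes):
    """Convert a page-aligned byte count into a page count."""
    return nbytes // PAGE_SIZE


def max_ram_pfn(ram_bytes, mmio_hole_bytes):
    """Model of method (b): the maximum RAM PFN.

    Assumed layout: the MMIO hole sits directly below 4 GiB, and any
    RAM that would overlap it is relocated to start at 4 GiB.
    """
    low_limit = 4 * GIB - mmio_hole_bytes
    if ram_bytes <= low_limit:
        # All RAM fits below the hole, so no PFNs are skipped.
        return to_pages(ram_bytes)
    # The hole's PFNs are counted although they are not RAM.
    return to_pages(ram_bytes + mmio_hole_bytes)


def initial_current_pages(b_max_ram_pfn, c_pod_tot_pages):
    """The heuristic proposed in the thread: the minimum of (b) and (c)."""
    return min(b_max_ram_pfn, c_pod_tot_pages)


if __name__ == "__main__":
    hole = 256 * MIB                      # assumed MMIO hole size
    excess = max_ram_pfn(4 * GIB, hole) - to_pages(4 * GIB)
    print(excess)                         # 65536
```

Under these assumptions a 4 GiB guest overshoots by exactly the hole size, 256 MiB / 4 KiB = 65536 pages, which is in line with the "65K pages" figure above, and the overshoot grows with the MMIO hole. A guest whose RAM all lies below the hole shows no overshoot in this model.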