Re: [PATCH] xen/page_alloc: Simplify domain_adjust_tot_pages
On 05.03.2025 14:22, Alejandro Vallejo wrote:
> On Wed Mar 5, 2025 at 10:49 AM GMT, Jan Beulich wrote:
>> On 27.02.2025 15:36, Alejandro Vallejo wrote:
>>> On Wed Feb 26, 2025 at 2:05 PM GMT, Jan Beulich wrote:
>>>> On 24.02.2025 15:49, Alejandro Vallejo wrote:
>>>>> Open question to whoever reviews this...
>>>>>
>>>>> On Mon Feb 24, 2025 at 1:27 PM GMT, Alejandro Vallejo wrote:
>>>>>>      spin_lock(&heap_lock);
>>>>>> -    /* adjust domain outstanding pages; may not go negative */
>>>>>> -    dom_before = d->outstanding_pages;
>>>>>> -    dom_after = dom_before - pages;
>>>>>> -    BUG_ON(dom_before < 0);
>>>>>> -    dom_claimed = dom_after < 0 ? 0 : dom_after;
>>>>>> -    d->outstanding_pages = dom_claimed;
>>>>>> -    /* flag accounting bug if system outstanding_claims would go negative */
>>>>>> -    sys_before = outstanding_claims;
>>>>>> -    sys_after = sys_before - (dom_before - dom_claimed);
>>>>>> -    BUG_ON(sys_after < 0);
>>>>>> -    outstanding_claims = sys_after;
>>>>>> +    BUG_ON(outstanding_claims < d->outstanding_pages);
>>>>>> +    if ( pages > 0 && d->outstanding_pages < pages )
>>>>>> +    {
>>>>>> +        /* `pages` exceeds the domain's outstanding count. Zero it out. */
>>>>>> +        outstanding_claims -= d->outstanding_pages;
>>>>>> +        d->outstanding_pages = 0;
>>>>>
>>>>> While this matches the previous behaviour, do we _really_ want it? It's
>>>>> weird, quirky, and it's hard to extend to NUMA-aware claims (which is
>>>>> something I'm midway through).
>>>>>
>>>>> Wouldn't it make sense to fail the allocation (earlier) if the claim
>>>>> has run out? Do we even expect this to ever happen this late in the
>>>>> allocation call chain?
>>>>
>>>> This goes back to what a "claim" means. Even without any claim, a domain
>>>> may allocate memory. So a claim having run out doesn't imply allocation
>>>> has to fail.
>>>
>>> Hmmm... but that violates the purpose of the claim infra as far as I
>>> understand it. If a domain may overallocate by (e.g.) ballooning in
>>> memory, it can distort the ability of another domain to start up, even
>>> if it succeeded in its own claim.
>>
>> Why would that be? As long as we hold back enough memory to cover the
>> claim, it shouldn't matter what kind of allocation we want to process.
>> I'd say that a PV guest starting ballooned ought to be able to deflate
>> its balloon as far as there was a claim established for it up front.
>
> The fact a domain allocated something does not mean it had it claimed
> before. A claim is a restriction to others' ability to allocate/claim,
> not to the domain that made the claim.

Yes and no. No in so far as the target's "ability to allocate" may or may
not be meant to extend beyond domain creation.
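To make the accounting above concrete, here is a minimal standalone model of
the two variants in the quoted hunk. It is a sketch, not the real Xen code:
locking and struct domain are dropped, and the new variant's branch for
allocations still covered by the claim (not quoted above) is assumed to
subtract `pages` from both counters. Frees (`pages <= 0`) are excluded, on
the assumption that, as in the pre-patch function, they bail out before
reaching this code.

#include <assert.h>
#include <stdio.h>

/* dom stands in for d->outstanding_pages, sys for outstanding_claims. */
struct state { long dom, sys; };

/* The removed variant, as quoted in the hunk. */
static void adjust_old(struct state *s, long pages)
{
    long dom_before = s->dom;
    long dom_after = dom_before - pages;
    long dom_claimed = dom_after < 0 ? 0 : dom_after;

    assert(dom_before >= 0);
    s->dom = dom_claimed;
    s->sys -= dom_before - dom_claimed;
    assert(s->sys >= 0);
}

/* The added variant; the else branch is an assumption, not quoted. */
static void adjust_new(struct state *s, long pages)
{
    assert(s->sys >= s->dom);
    if ( pages > 0 && s->dom < pages )
    {
        /* `pages` exceeds the domain's outstanding count. Zero it out. */
        s->sys -= s->dom;
        s->dom = 0;
    }
    else if ( pages > 0 )
    {
        s->sys -= pages;
        s->dom -= pages;
    }
}

int main(void)
{
    static const long cases[] = { 0, 1, 5, 10, 64 };

    for ( unsigned int i = 0; i < sizeof(cases) / sizeof(*cases); i++ )
    {
        /* Domain holds a 10-page claim; 25 pages are claimed system-wide. */
        struct state o = { .dom = 10, .sys = 25 }, n = o;

        adjust_old(&o, cases[i]);
        adjust_new(&n, cases[i]);
        assert(o.dom == n.dom && o.sys == n.sys);
        printf("pages=%2ld -> dom=%2ld sys=%2ld\n", cases[i], n.dom, n.sys);
    }
    return 0;
}

For every non-negative `pages` the two variants leave both counters
identical, so the simplification is purely structural.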
> e.g.:
>
> 0. host is idle. No domU.
> 1. dom1 is created with a claim to 10% of host memory and max_mem of 80%
>    of host memory.
> 2. dom1 balloons in 70% of host memory.
> 3. dom1 balloons out 70% of host memory.
> 4. dom1 now holds a claim to 80% of host memory.

Sure, this shouldn't be the result. Yet that may merely be an effect of
claim accounting being insufficient.

> It's all quite perverse. Fortunately, looking at adjacent claims-related
> code, xl seems to default to making a claim prior to populating the
> physmap and cancelling the claim at the end of the meminit() hook, so
> this is never a real problem.
>
> This tells me that the intent of the logic is to force early failure of
> populate_physmap and nothing else. It's never active by the time
> ballooning or memory exchange matter at all.

Ah yes, this I find more convincing. (Oddly enough this is all x86-only
code.)

> Xen ought to cancel the claim by itself though; the toolstack should not
> NEED to do it.

Fundamentally yes. Except that the toolstack can do it earlier than Xen
could.

Jan
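As a footnote on the xl behaviour discussed above, the flow looks roughly
like the below at the libxc level. A sketch only: `build_memory` is a
made-up wrapper name, the real libxl code path differs in detail, and
error handling is trimmed. Cancelling a claim is done by claiming zero
pages.

#include <stdlib.h>
#include <xenctrl.h>

static int build_memory(xc_interface *xch, uint32_t domid,
                        unsigned long nr_pages)
{
    xen_pfn_t *pfns = malloc(nr_pages * sizeof(*pfns));
    int rc;

    if ( !pfns )
        return -1;
    for ( unsigned long i = 0; i < nr_pages; i++ )
        pfns[i] = i;

    /* Claim first, so an overcommitted host fails here, early... */
    rc = xc_domain_claim_pages(xch, domid, nr_pages);

    /* ...rather than partway through populating the physmap. */
    if ( !rc )
        rc = xc_domain_populate_physmap_exact(xch, domid, nr_pages,
                                              0, 0, pfns);

    /* Drop whatever is left of the claim (0 means cancel). */
    xc_domain_claim_pages(xch, domid, 0);

    free(pfns);
    return rc;
}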