Re: [PATCH v4 03/10] xen/page_alloc: Implement NUMA-node-specific claims
On 05.03.2026 15:54, Bernhard Kaindl wrote:
> Jan Beulich wrote:
>> On 05.03.2026 14:12, Bernhard Kaindl wrote:
>>>
>>> Roger requested the domctl API to allow claiming from multiple nodes in
>>> one go and he specified that we should focus on getting the
>>> implementation for one node-specific claim done first before we dive
>>> into multi-node claims code.
>>>
>>> - Instead of adding/linking an array of claims to struct domain, we can
>>>   keep using d->outstanding_pages for the single-node claim.
>>>
>>> - There are numerous comments and questions for this minimal
>>>   implementation. If we'd add multi-node claims to it, this review may
>>>   become even more complex.
>>>
>>> - The single-node claims backend contains the infrastructure and
>>>   multi-node claims would be an extension on top of that infrastructure.
>>
>> What is at the very least needed is an outline of how multi-node claims are
>> intended to work. This is because what you do here needs to fit that scheme.
>> Which in turn I think is going to be difficult when for a domain more memory
>> is needed than any single node can supply. Hence why I think that you may
>> not be able to get away with just single-node claims, no matter that this
>> of course complicates things.
>>
>> It's also not quite clear to me how multiple successive claims against
>> distinct nodes would work (which isn't all that different from a multi-node
>> claim).
>>
>> Thinking of it, interaction with the existing mem-op also wants clarifying.
>> Imo only one of the two ought to be usable on a single domain.
>
> Yes, correct. As implemented by Xen in domain_set_outstanding_claims(),
> Xen claims work very differently from something like an allocation:
>
> For example, when you allocate, you get memory, and when you repeat,
> you have a bigger allocation.
>
> But Xen claims in domain_set_outstanding_claims() don't work like that:
>
> - When a domain has a claim, domain_set_outstanding_claims() only allows
>   resetting the claim to 0, nothing else. A second or changed claim is not
>   possible. I think this was intentional:
>
> - domain_set_outstanding_claims() rejects increasing/reducing a claim:
>
>   A claim is designed to be made by the domain builder when the size of
>   the domain is known. There is no tweaking it afterwards: The needed pages
>   shall be claimed by the domain builder before the domain is built.
>
>   Note: The claims are not only consumed when populating guest memory:
>   Claims are also (at least attempted to be) consumed when Xen needs to
>   allocate memory for other resources of the domain. For this reason,
>   the domain builder needs to add some headroom for allocations done by
>   Xen for creating the domain.
>
>   When the domain builder has finished building the domain, it is expected
>   to reset the claim to release any unconsumed headroom it added.
>
> - If a domain already has memory when the domain builder stakes a claim
>   for completing the build of the domain, the outstanding_claims are set
>   to the target value of the claim call, minus domain_tot_pages(d), so
>   already allocated memory does not contribute to a bigger total booking.
>
> For NUMA claims and global host-level claims, it is similar:
>
> A NUMA node-specific claim is implicitly also added to the global
> host-level outstanding_claims of the host, as node-specific memory
> is also part of the host's memory, so the host-level claims protection
> does not have to also check for node-specific claims:
>
> The effect of a host-level claim is also given when you make a node-level
> claim.
>
> When a domain has one kind of claim, it does not make a lot of sense to then
> later add a differently sized claim for another target.
> Like described in how domain_set_outstanding_claims() is implemented,
> a domain builder stakes a claim once, then builds the domain, then resets
> it, and that's all there is to it.
>
> For example, Xapi toolstack and libxenguest have calls to claim memory,
> but in any given configuration, only the first actor to claim memory for
> a domain is the one who defines the claim: No mixing, changing, updating.
> It makes things clear that the initial creator did make the claim.
>
> Similar for multi-node claims:
>
> Roger described how he wants this API to work here:
> https://lists.xenproject.org/archives/html/xen-devel/2025-06/msg00484.html

Fits my understanding, but doesn't fit you limiting the new sub-op to a
single node. As said, if you introduce the new sub-op this way, I'd still
expect for a single domain to have claims across multiple nodes, and that
(preferably) whatever the caller does to achieve that will continue to work
once the restriction is lifted. Yet I can't see you describe such a
claims-on-multiple-nodes use case in your reply above. And indeed to achieve
that you'd need data layout changes, in particular there then couldn't be
any single d->claim_node.

>> Ideally, we would need to introduce a new hypercall that allows making
>> claims from multiple nodes in a single locked region, as to ensure
>> success or failure in an atomic way.
>
> In the locked region (inside heap_lock), we can check the claims requests
> against existing claims and memory of the affected nodes and determine if
> the claim call is a go or a no-go. If it is a go, we update all counters
> which are all protected by the heap_lock and are done.

Yet as per above, afaics you don't even have the needed data layout to
record two (or more) claims against distinct nodes.

Jan
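
For illustration only, the data-layout point and the "check everything, then
commit" step discussed above could be sketched roughly as below. This is not
code from the patch series or from Xen: struct host_state, struct dom_claims,
struct claim_req, claim_nodes(), claim_reset() and the small MAX_NUMNODES
value are all hypothetical names chosen for the sketch, and the heap_lock that
would guard these counters is only indicated in comments. It assumes a
per-domain array with one claim slot per node (rather than a single
d->claim_node), and it mirrors the described behaviour that an existing claim
can only be reset, not changed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUMNODES 4 /* hypothetical; kept small for the sketch */

/* Hypothetical host-wide counters; all assumed to be guarded by heap_lock. */
struct host_state {
    unsigned long node_free[MAX_NUMNODES];    /* free pages per node */
    unsigned long node_claimed[MAX_NUMNODES]; /* claims per node, all domains */
    unsigned long total_claimed;              /* host-level outstanding claims */
};

/* Hypothetical per-domain layout: one claim slot per node. */
struct dom_claims {
    unsigned long node_claim[MAX_NUMNODES];
};

/* One entry of a multi-node claim request. */
struct claim_req {
    unsigned int node;
    unsigned long pages;
};

/* Stake claims on several nodes; 0 on success, -1 with no state changed. */
int claim_nodes(struct host_state *h, struct dom_claims *d,
                const struct claim_req *req, size_t nr)
{
    unsigned long want[MAX_NUMNODES] = { 0 };
    unsigned int n;
    size_t i;

    /* The real thing would take heap_lock here. */

    /* An existing claim may only be reset, never changed or extended. */
    for ( n = 0; n < MAX_NUMNODES; n++ )
        if ( d->node_claim[n] )
            return -1;

    /* Pass 1: validate node IDs and sum the request per node. */
    for ( i = 0; i < nr; i++ )
    {
        if ( req[i].node >= MAX_NUMNODES )
            return -1;
        want[req[i].node] += req[i].pages;
    }

    /* Still pass 1: every node must cover its share with unclaimed pages. */
    for ( n = 0; n < MAX_NUMNODES; n++ )
    {
        assert(h->node_claimed[n] <= h->node_free[n]);
        if ( want[n] > h->node_free[n] - h->node_claimed[n] )
            return -1;
    }

    /* Pass 2: all checks passed, so commit every counter in one go. */
    for ( n = 0; n < MAX_NUMNODES; n++ )
    {
        d->node_claim[n] = want[n];
        h->node_claimed[n] += want[n];
        h->total_claimed += want[n]; /* node claims count at host level too */
    }

    /* ... and drop heap_lock here. */
    return 0;
}

/* Reset a domain's claims to 0, releasing any unconsumed headroom. */
void claim_reset(struct host_state *h, struct dom_claims *d)
{
    unsigned int n;

    for ( n = 0; n < MAX_NUMNODES; n++ )
    {
        h->node_claimed[n] -= d->node_claim[n];
        h->total_claimed -= d->node_claim[n];
        d->node_claim[n] = 0;
    }
}
```

Because nothing is written until every node has passed its check, a request
that fails on any one node leaves all counters untouched, which is the atomic
success-or-failure behaviour asked for in the quoted text. The sketch leaves
out how allocations would consume a claim and how it would interact with the
existing mem-op.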