Re: [PATCH v4 03/10] xen/page_alloc: Implement NUMA-node-specific claims


  • To: Bernhard Kaindl <bernhard.kaindl@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 5 Mar 2026 18:00:19 +0100
  • Cc: Andrew Cooper <andrew.cooper@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Marcus Granado <Marcus.Granado@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Alejandro Vallejo <Alejandro.GarciaVallejo@xxxxxxx>
  • Delivery-date: Thu, 05 Mar 2026 17:00:44 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 05.03.2026 15:54, Bernhard Kaindl wrote:
> Jan Beulich wrote:
>> On 05.03.2026 14:12, Bernhard Kaindl wrote:
>>>
>>> Roger requested the domctl API to allow claiming from multiple nodes in one
>>> go, and he specified that we should focus on getting the implementation for
>>> one node-specific claim done first before we dive into multi-node claims code.
>>>
>>> - Instead of adding/linking an array of claims to struct domain, we can keep
>>>   using d->outstanding_pages for the single-node claim.
>>>
>>> - There are numerous comments and questions for this minimal implementation.
>>>   If we'd add multi-node claims to it, this review may become even more
>>>   complex.
>>>
>>> - The single-node claims backend contains the infrastructure and multi-node
>>>   claims would be an extension on top of that infrastructure.
>>
>> What is at the very least needed is an outline of how multi-node claims are
>> intended to work. This is because what you do here needs to fit that scheme.
>> Which in turn I think is going to be difficult when for a domain more memory
>> is needed than any single node can supply. Hence why I think that you may
>> not be able to get away with just single-node claims, no matter that this
>> of course complicates things.
>>
>> It's also not quite clear to me how multiple successive claims against
>> distinct nodes would work (which isn't all that different from a multi-node
>> claim).
>>
>> Thinking of it, interaction with the existing mem-op also wants clarifying.
>> Imo only one of the two ought to be usable on a single domain.
> 
> Yes, correct. As implemented by Xen in domain_set_outstanding_claims(),
> Xen claims work very differently from something like an allocation:
> 
> For example, when you allocate, you get memory, and when you repeat,
> you have a bigger allocation.
> 
> But Xen claims in domain_set_outstanding_claims() don't work like that:
> 
> - When a domain has a claim, domain_set_outstanding_claims() only allows
>   resetting the claim to 0, nothing else. A second or changed claim is not
>   possible. I think this was intentional:
> 
>   - domain_set_outstanding_claims() rejects increasing/reducing a claim:
> 
>     A claim is designed to be made by domain build when the size of the
>     domain is known. There is no tweaking it afterwards: The needed pages
>     shall be claimed by the domain builder before the domain is built.
>     
>     Note: The claims are not only consumed when populating guest memory:
>     Claims are also (at least attempted to be) consumed when Xen needs to
>     allocate memory for other resources of the domain. For this reason,
>     the domain builder needs to add some headroom for allocations done by
>     Xen for creating the domain.
> 
>     When the domain builder has finished building the domain, it is expected
>     to reset the claim to release any unconsumed headroom it added.
> 
>   - If a domain already has memory when the domain builder stakes a claim
>     for completing the build of the domain, the outstanding_claims are set
>     to the target value of the claim call, minus domain_tot_pages(d), so
>     already allocated memory does not contribute to a bigger total booking.
> 
> For NUMA claims and global host-level claims, it is similar:
> 
> A NUMA node-specific claim is implicitly also added to the global
> host-level outstanding_claims of the host, as node-specific memory
> is also part of the host's memory, so the host-level claims protection
> does not have to also check for node-specific claims:
> 
> A node-level claim therefore also has the effect of a host-level claim.
> 
> When a domain has one kind of claim, it does not make a lot of sense to then
> later add a differently sized claim for another target. As described in
> how domain_set_outstanding_claims() is implemented, a domain builder stakes
> a claim once, then builds the domain, then resets it, and that's all there
> is to it.
> 
> For example, Xapi toolstack and libxenguest have calls to claim memory,
> but in any given configuration, only the first actor to claim memory for
> a domain is the one who defines the claim: No mixing, changing, updating.
> It makes things clear that the initial creator did make the claim.
> 
> Similar for multi-node claims:
> 
> Roger described how he wants this API to work here:
> https://lists.xenproject.org/archives/html/xen-devel/2025-06/msg00484.html

Fits my understanding, but doesn't fit you limiting the new sub-op to a
single node. As said, if you introduce the new sub-op this way, I'd still
expect for a single domain to have claims across multiple nodes, and
that (preferably) whatever the caller does to achieve that will continue
to work once the restriction is lifted.

Yet I can't see you describe such a claims-on-multiple-nodes use case in
your reply above. And indeed to achieve that you'd need data layout
changes; in particular there then couldn't be any single d->claim_node.
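
For illustration only, a minimal sketch of what a per-node layout might
look like, replacing a single claim_node with one counter per node. Every
name below (sketch_domain_claims, SKETCH_MAX_NODES, the helpers) is
hypothetical and not taken from the patch or from Xen:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the maximum number of NUMA nodes. */
#define SKETCH_MAX_NODES 4

/*
 * Hypothetical per-domain claim state: one outstanding-pages counter per
 * node, so claims against distinct nodes can be recorded side by side.
 */
struct sketch_domain_claims {
    unsigned long node_claim[SKETCH_MAX_NODES];
};

/* Total outstanding claim of the domain: the sum over all nodes. */
static unsigned long sketch_total_claim(const struct sketch_domain_claims *c)
{
    unsigned long total = 0;

    assert(c != NULL);
    for (size_t n = 0; n < SKETCH_MAX_NODES; n++)
        total += c->node_claim[n];
    return total;
}

/*
 * Consume up to 'pages' of the claim held on 'node' when an allocation
 * lands there. Returns how many claimed pages were actually consumed.
 */
static unsigned long sketch_consume_claim(struct sketch_domain_claims *c,
                                          unsigned int node,
                                          unsigned long pages)
{
    unsigned long used;

    assert(c != NULL);
    if (node >= SKETCH_MAX_NODES)
        return 0;
    used = pages < c->node_claim[node] ? pages : c->node_claim[node];
    c->node_claim[node] -= used;
    return used;
}
```

With such a layout the domain-wide figure becomes a derived sum rather
than a single stored counter tied to one node.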

>> Ideally, we would need to introduce a new hypercall that allows making
>> claims from multiple nodes in a single locked region, as to ensure
>> success or failure in an atomic way.
> 
> In the locked region (inside heap_lock), we can check the claims requests
> against existing claims and memory of the affected nodes and determine if
> the claim call is a go or a no-go. If it is a go, we update all counters
> which are all protected by the heap_lock and are done.

Yet as per above, afaics you don't even have the needed data layout to
record two (or more) claims against distinct nodes.
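
As a hedged sketch of the check-then-commit scheme described in the
quoted paragraph: validate every per-node request first, and only update
counters once all of them are known to fit. All identifiers are
hypothetical; the caller is assumed to hold whatever lock protects the
counters (heap_lock in the discussion above), and the per-domain array
assumes the data layout question has been resolved:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 4           /* hypothetical node limit */

struct sketch_node {
    unsigned long free_pages;        /* free pages on this node */
    unsigned long claimed;           /* pages claimed here by all domains */
};

struct sketch_claim_req {
    unsigned int node;               /* node to claim from */
    unsigned long pages;             /* number of pages to claim */
};

/*
 * All-or-nothing multi-node claim. Assumes the caller holds the lock
 * protecting nodes[] and dom_claim[]. Returns true and updates all
 * counters, or returns false and changes nothing.
 */
static bool sketch_claim_multi(struct sketch_node nodes[SKETCH_MAX_NODES],
                               unsigned long dom_claim[SKETCH_MAX_NODES],
                               const struct sketch_claim_req *req, size_t nr)
{
    unsigned long want[SKETCH_MAX_NODES] = { 0 };

    assert(nodes != NULL && dom_claim != NULL && (req != NULL || nr == 0));

    /* Phase 1a: aggregate per node, so repeated entries are summed. */
    for (size_t i = 0; i < nr; i++) {
        if (req[i].node >= SKETCH_MAX_NODES)
            return false;
        want[req[i].node] += req[i].pages;
    }

    /* Phase 1b: every node must be able to back its share. */
    for (size_t n = 0; n < SKETCH_MAX_NODES; n++) {
        unsigned long avail = nodes[n].free_pages > nodes[n].claimed
                              ? nodes[n].free_pages - nodes[n].claimed : 0;

        if (want[n] > avail)
            return false;            /* no-go: nothing has been modified */
    }

    /* Phase 2: commit, which cannot fail once the checks have passed. */
    for (size_t n = 0; n < SKETCH_MAX_NODES; n++) {
        nodes[n].claimed += want[n];
        dom_claim[n] += want[n];
    }
    return true;
}
```

The commit phase writes into a per-node array for the domain, which is
exactly the storage a single d->claim_node cannot provide.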

Jan



 

