Re: [RFC PATCH] SYSCTL_numainfo.memsize: Switch spanned to present memory


  • To: Bernhard Kaindl <bernhard.kaindl@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 9 Dec 2024 09:23:52 +0100
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 09 Dec 2024 08:24:06 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 05.12.2024 11:55, Bernhard Kaindl wrote:
> On 03/12/2024 12:37, Jan Beulich wrote:
>> On 03.12.2024 12:12, Bernhard Kaindl wrote:
>>> This is the 2nd part of my submission to fix the NUMA node memsize
>>> returned in xen_sysctl_meminfo[].memsize by the XEN_SYSCTL_numainfo
>>> hypercall to not count MMIO memory holes etc. but only memory pages.
>>>
>>> For this, we introduced NODE_DATA->node_present_pages as a prereq.
>>> With the prereq merged in master, I send this 2nd part for review:
>>>
>>> This RFC is for changing the value of xen_sysctl_meminfo[]->memsize
>>> from NODE_DATA->node_spanned_pages << PAGE_SHIFT
>>>    to NODE_DATA->node_present_pages << PAGE_SHIFT
>>> for returning the total present NUMA node memory instead of the spanned range.
>>>
>>> Sample of struct xen_sysctl_meminfo[].* as presented by xl info -n:
>>>
>>> xl info -n:
>>> [...]
>>> node:    memsize    memfree    distances
>>>     0:  -> 67584 <-   60672      10,21
>>>     1:     65536      60958      21,10
>>>
>>> The -> memsize <- marked here is the value that we'd like to fix:
>>> The current value based on node_spanned_pages is often 2GB too large.
>>>
>>> We're currently not using these often false memsize values in XenServer
>>> according to my code review, and Andrew seemed to confirm this as well.
>>>
>>> I think that the same is likely true for other Xen toolstacks, but of course
>>> reviewing this change or proposing an alternative is the purpose of this RFC.
>>>
>>> Thanks,
>>> Bernhard
>>
>> All of the above reads like a cover letter. What's missing is a patch
>> description, part of which would be to clarify whether the field is
>> indeed unused except for display purposes, or why respective users would
>> at least not regress from this change. What's also unclear is what
>> comments you're actually after (i.e. what question(s) you want to have
>> answered), seeing this is tagged RFC.
> [...]
>> Jan
> 
> Hi Jan!
> 
> The answer I'm looking for is which users to check, or to check with.
> 
> For example, I know that Xapi can use xen_sysctl_meminfo[].memfree to
> get a preference about which NUMA node to use when creating a domain
> (when the new mode `numa_affinity_policy.best_effort` is enabled):
> https://xapi-project.github.io/new-docs/toolstack/features/NUMA/
> 
> A potential use of xen_sysctl_meminfo.memsize in Xen toolstacks is
> less clear to me:
> 
> The only potential use would be if some Xen toolstack would not like
> to solely rely on [nid].memfree for NUMA placement.
> 
> The question is if there are other NUMA aware toolstacks besides Xapi,
> that would try to use it for e.g. planning the placement of domains.
> 
> My search in the Xapi and Xen repos only turned up a debug printf() in
> xen-api's xen-api/xenopsd and, in xen, only the output of xl info -n.
> 
> It seems questionable to me that any other toolstacks would rely on it,
> especially as the value it currently returns is off by as much as 2GB on some
> machines. I'd expect that this bug would have affected code using it.
> 
> The answers I am looking for are acknowledgements of that, or references
> to users that might currently use .memsize (and so could be affected).

IOW all questions to respective toolstack people.

> Alternatively, I'd hope to get an idea what would be the method to 
> create a new revision of the numainfo hypercall:
> 
> I guess it would be to add a new #define XEN_SYSCTL_numainfo_v2,
> and if v2 is called, return [].memsize using [nid].node_present_pages 
> instead?

That's a last resort, yes. Since sysctls aren't stable (yet), changing
existing interfaces generally is an option. We merely want to figure out
how careful we need to be. It may be fine to do the change "silently",
as you do now. A middle option might be to rename the field which has
its meaning changed, such that anyone using the field will notice that
they need to update their code, hopefully resulting in them checking
what changed and hence what they may need to change.

Jan
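
To make the spanned-versus-present distinction and the field-rename idea from this thread concrete, here is a minimal, self-contained C sketch. It is not the Xen implementation: struct mem_range, struct meminfo_sketch, the field name present_memsize, and all helper functions are hypothetical names chosen for illustration, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */
#define GIB_TO_PAGES(g) ((uint64_t)(g) << (30 - PAGE_SHIFT))

/* Hypothetical RAM range of one NUMA node, as page frame numbers [start, end). */
struct mem_range {
    uint64_t start_pfn;
    uint64_t end_pfn;
};

/* Spanned pages: lowest start to highest end, so holes are counted too. */
static uint64_t spanned_pages(const struct mem_range *r, unsigned int n)
{
    uint64_t lo = UINT64_MAX, hi = 0;

    for (unsigned int i = 0; i < n; i++) {
        if (r[i].start_pfn < lo)
            lo = r[i].start_pfn;
        if (r[i].end_pfn > hi)
            hi = r[i].end_pfn;
    }
    return n ? hi - lo : 0;
}

/* Present pages: sum of the (non-overlapping) RAM ranges, holes excluded. */
static uint64_t present_pages(const struct mem_range *r, unsigned int n)
{
    uint64_t sum = 0;

    for (unsigned int i = 0; i < n; i++)
        sum += r[i].end_pfn - r[i].start_pfn;
    return sum;
}

/* Convert a page count to MiB, the unit of the xl info -n sample above. */
static uint64_t pages_to_mib(uint64_t pages)
{
    return (pages << PAGE_SHIFT) >> 20;
}

/*
 * Hypothetical result layout illustrating the "middle option": the field
 * whose meaning changes gets a new name, so code still referring to
 * .memsize fails to compile instead of silently reading a different value.
 */
struct meminfo_sketch {
    uint64_t present_memsize; /* bytes of present memory (was: spanned) */
    uint64_t memfree;         /* bytes of free memory */
};

static void fill_meminfo(struct meminfo_sketch *mi, const struct mem_range *r,
                         unsigned int n, uint64_t free_pages)
{
    uint64_t present = present_pages(r, n);

    /* Sanity checks: present can never exceed spanned, free never present. */
    assert(present <= spanned_pages(r, n));
    assert(free_pages <= present);

    mi->present_memsize = present << PAGE_SHIFT;
    mi->memfree = free_pages << PAGE_SHIFT;
}
```

With a made-up layout of RAM at 0-2 GiB and 4-66 GiB (a 2 GiB hole in between), spanned_pages() corresponds to 67584 MiB and present_pages() to 65536 MiB, which is the same size of discrepancy as between the two nodes in the xl info -n sample quoted above.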



 

