Re: [PATCH v2] x86/domain: adjust limitation on shared_info allocation below 4G
On 05.02.2026 15:08, Roger Pau Monné wrote:
> On Thu, Feb 05, 2026 at 09:29:53AM +0100, Jan Beulich wrote:
>> On 04.02.2026 17:46, Roger Pau Monné wrote:
>>> On Wed, Feb 04, 2026 at 04:08:21PM +0100, Jan Beulich wrote:
>>>> On 04.02.2026 15:52, Roger Pau Monné wrote:
>>>>> On Wed, Feb 04, 2026 at 03:06:52PM +0100, Jan Beulich wrote:
>>>>>> On 04.02.2026 13:25, Roger Pau Monne wrote:
>>>>>>> The limitation of shared_info being allocated below 4G, so that its
>>>>>>> machine address fits in the corresponding start_info field, only
>>>>>>> applies to 32bit PV guests. On 64bit PV guests that field is 64 bits
>>>>>>> wide. HVM guests don't use start_info at all.
>>>>>>>
>>>>>>> Drop the restriction in arch_domain_create() and instead free and
>>>>>>> re-allocate the page from memory below 4G if needed in switch_compat(),
>>>>>>> when the guest is set to run in 32bit PV mode.
>>>>>>>
>>>>>>> Fixes: 3cadc0469d5c ("x86_64: shared_info must be allocated below 4GB
>>>>>>> as it is advertised to 32-bit guests via a 32-bit machine address field
>>>>>>> in start_info.")
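(Purely for illustration, roughly what such a re-allocation in
switch_compat() could look like. This is a sketch only, not the actual
patch: the helper name is made up, and the unsharing of the old page as
well as the error handling are deliberately simplified.)

/* Sketch only: re-home shared_info below 4G for a 32bit PV guest. */
static int relocate_shared_info_below_4g(struct domain *d)
{
    void *new_info;

    /* Nothing to do if the page already lives below 4G. */
    if ( virt_to_maddr(d->shared_info) < (1UL << 32) )
        return 0;

    new_info = alloc_xenheap_pages(0, MEMF_bits(32));
    if ( !new_info )
        return -ENOMEM;

    /* Preserve whatever was already written to the old page. */
    memcpy(new_info, d->shared_info, PAGE_SIZE);

    /*
     * The real code would need to properly unshare and free the old
     * page before re-sharing the new one; simplified here.
     */
    free_xenheap_page(d->shared_info);
    d->shared_info = new_info;
    share_xen_page_with_guest(virt_to_page(new_info), d, SHARE_rw);

    return 0;
}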
>>>>>>
>>>>>> The tag is here because there is the (largely theoretical?) possibility
>>>>>> for a system to have no memory at all left below 4Gb, in which case
>>>>>> creation of a PV64 or non-shadow HVM guest would needlessly fail?
>>>>>
>>>>> It's kind of an issue we discovered when using strict domain NUMA node
>>>>> placement. At that point the toolstack would exhaust all memory on
>>>>> node 0 and by doing that inadvertently consume all memory below 4G.
>>>>
>>>> Right, and hence also my "memory: arrange to conserve on DMA reservation",
>>>> where I'm still fighting with myself as to what to do with the comments you
>>>> gave there.
>>>
>>> Better fighting with yourself rather than fighting with me I guess ;).
>>>
>>> That change would be controversial with what we currently do on
>>> XenServer, because we don't (yet) special case the memory below 4G to
>>> not account for it in the per-node amount of free memory.
>>>
>>> What would happen when you append the MEMF_no_dma flag as proposed in
>>> your commit, but the caller is also passing MEMF_exact_node with
>>> target node 0? AFAICT the allocation would still refuse to use the
>>> low 4G pool.
>>
>> Yes, DMA-ability is intended to take higher priority than exact-node
>> requests. Another node would then need choosing by the toolstack.
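(To illustrate the interaction, not actual code: with MEMF_no_dma
appended, a populate_physmap style allocation with an exact-node
request for node 0 would boil down to something like the below, and
would fail once the only free memory on node 0 sits below the DMA
boundary. The wrapper name is invented.)

/*
 * Sketch: an order-0, exact-node allocation with the DMA pool excluded.
 * When the requested node has nothing but DMA-able memory left, this
 * returns NULL instead of dipping into the low pool, and the toolstack
 * has to retry on another node (or drop the exact-node request).
 */
static struct page_info *populate_one_page(struct domain *d, nodeid_t node)
{
    return alloc_domheap_pages(d, 0,
                               MEMF_no_dma | MEMF_exact_node |
                               MEMF_node(node));
}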
>>
>>> Also, your commit should be expanded to avoid staking claims that
>>> would drain the DMA pool, as then populate_physmap() won't be able to
>>> allocate from there?
>>
>> Except that upstream claims aren't node-specific yet, so could be
>> fulfilled by taking memory from other nodes?
>
> That's likely to change at some point, but yes, they are not node
> specific yet.
>
>> Aiui the problem would
>> only occur if that DMA-able memory was the only memory left in the
>> system.
>
> Indeed, in that scenario the toolstack will be allowed to make claims that
> cover that DMA memory, yet populate physmap won't be able to consume
> those claims.
It would be (following said patch of mine), but only in order-0 chunks.
Which would make ...
> I think there are two items that need to be done for us to append
> MEMF_no_dma to populate physmap allocations:
>
> * DMA memory is not reachable by claims.
> * DMA memory must be reported to the toolstack, so it can account for
> it separately from free memory.
>
> The last point could also be solved by subtracting the DMA memory from the
> `free_pages` value returned to the toolstack.
... any of this more difficult. We don't want to completely prevent its
use; we only want to (heuristically) limit it.
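(Sketch of the free_pages adjustment mentioned above, for illustration
only: the helper name is made up, and whether avail_domheap_pages_region()
and dma_bitsize are the right building blocks, or whether -1 is the
proper "any node" argument here, would need checking.)

/*
 * Sketch: what XEN_SYSCTL_physinfo could report instead of the raw
 * avail_domheap_pages() value, so the toolstack doesn't size claims
 * or allocations against memory the DMA heuristic will withhold.
 */
static unsigned long reported_free_pages(void)
{
    unsigned long total = avail_domheap_pages();
    /* Free memory below the DMA boundary (dma_bitsize), on any node. */
    unsigned long dma = avail_domheap_pages_region(-1, 0, dma_bitsize);

    return total > dma ? total - dma : 0;
}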
Jan