Re: [PATCH v2] x86/domain: adjust limitation on shared_info allocation below 4G
On 04.02.2026 17:46, Roger Pau Monné wrote:
> On Wed, Feb 04, 2026 at 04:08:21PM +0100, Jan Beulich wrote:
>> On 04.02.2026 15:52, Roger Pau Monné wrote:
>>> On Wed, Feb 04, 2026 at 03:06:52PM +0100, Jan Beulich wrote:
>>>> On 04.02.2026 13:25, Roger Pau Monne wrote:
>>>>> The limitation of shared_info being allocated below 4G to fit in the
>>>>> start_info field only applies to 32-bit PV guests. On 64-bit PV guests the
>>>>> start_info field is 64 bits wide. HVM guests don't use start_info at all.
>>>>>
>>>>> Drop the restriction in arch_domain_create() and instead free and
>>>>> re-allocate the page from memory below 4G if needed in switch_compat(),
>>>>> when the guest is set to run in 32-bit PV mode.
>>>>>
>>>>> Fixes: 3cadc0469d5c ("x86_64: shared_info must be allocated below 4GB as
>>>>> it is advertised to 32-bit guests via a 32-bit machine address field in
>>>>> start_info.")
>>>>
>>>> The tag is here because there is the (largely theoretical?) possibility for
>>>> a system to have no memory at all left below 4Gb, in which case creation of
>>>> a PV64 or non-shadow HVM guest would needlessly fail?
>>>
>>> It's kind of an issue we discovered when using strict domain NUMA node
>>> placement. At that point the toolstack would exhaust all memory on
>>> node 0 and by doing that inadvertently consume all memory below 4G.
>>
>> Right, and hence also my "memory: arrange to conserve on DMA reservation",
>> where I'm still fighting with myself as to what to do with the comments you
>> gave there.
>
> Better fighting with yourself rather than fighting with me I guess ;).
>
> That change would be controversial with what we currently do on
> XenServer, because we don't (yet) special-case the memory below 4G by
> excluding it from the per-node free memory accounting.
>
> What would happen when you append the MEMF_no_dma flag as proposed in
> your commit, but the caller is also passing MEMF_exact_node with
> target node 0? AFAICT the allocation would still refuse to use the
> low 4G pool.
Yes, DMA-ability is intended to take higher priority than exact-node
requests. The toolstack would then need to choose another node.
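To make that precedence concrete, a hypothetical fragment of an
allocator path; MEMF_no_dma and MEMF_exact_node are real Xen memflags,
while both helpers are invented for illustration:

static struct page_info *alloc_on_node_sketch(unsigned int node,
                                              unsigned int order,
                                              unsigned int memflags)
{
    /*
     * DMA-ability outranks the exact-node request: if the only free
     * memory on the requested node lies in the DMA pool, fail rather
     * than drain it, and leave the toolstack to pick another node.
     */
    if ( (memflags & MEMF_no_dma) && (memflags & MEMF_exact_node) &&
         !node_has_non_dma_pages(node, order) )    /* invented helper */
        return NULL;

    return alloc_from_node(node, order, memflags); /* invented helper */
}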
> Also, your commit should be expanded to avoid staking claims that
> would drain the DMA pool, as then populate_physmap() won't be able to
> allocate from there?
Except that upstream claims aren't node-specific, yet, so could be
fulfilled by taking memory from other nodes? Aiui the problem would only
occur if that DMA-able memory was the only memory left in the system.
Jan
> It would be weird for the toolstack to have
> successfully made a claim, and then populate_physmap() returning
> -ENOMEM because (part of) the claim has been made against the DMA
> pool, which populate_physmap() would be explicitly forbidden to allocate
> from.
>
> Thanks, Roger.
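A simplified model of the mismatch Roger describes; total_avail_pages
and outstanding_claims exist in Xen's common/page_alloc.c, but both
predicates below are invented and the accounting is deliberately
stripped down:

/* Claims are checked against all free memory, DMA pool included ... */
static bool claim_would_succeed(unsigned long pages)
{
    return total_avail_pages - outstanding_claims >= pages;
}

/* ... while a MEMF_no_dma allocation may only use the non-DMA pool, so
 * a claim that can only be met from the DMA pool later surfaces as
 * -ENOMEM from populate_physmap(). */
static bool populate_would_succeed(unsigned long pages)
{
    return free_non_dma_pages() >= pages;          /* invented helper */
}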