Re: [PATCH V3 (resend) 11/19] x86/setup: Leave early boot slightly earlier
On 13.05.2024 15:40, Elias El Yandouzi wrote:
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1751,6 +1751,22 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p)
>
> numa_initmem_init(0, raw_max_page);
>
> + /*
> + * When we do not have a direct map, memory for metadata of heap nodes in
> + * init_node_heap() is allocated from xenheap, which needs to be mapped and
> + * unmapped on demand. However, we cannot just take memory from the boot
> + * allocator to create the PTEs while we are passing memory to the heap
> + * allocator during end_boot_allocator().
> + *
> + * To solve this race, we need to leave early boot before
> + * end_boot_allocator() so that Xen PTE pages are allocated from the heap
> + * instead of the boot allocator. We can do this because the metadata for
> + * the 1st node is statically allocated, and by the time we need memory to
> + * create mappings for the 2nd node, we already have enough memory in the
> + * heap allocator in the 1st node.
> + */
> + system_state = SYS_STATE_boot;
> +
> if ( max_page - 1 > virt_to_mfn(HYPERVISOR_VIRT_END - 1) )
> {
> unsigned long lo = virt_to_mfn(HYPERVISOR_VIRT_END - 1);
> @@ -1782,8 +1798,6 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p)
> else
> end_boot_allocator();
>
> - system_state = SYS_STATE_boot;
> -
> bsp_stack = cpu_alloc_stack(0);
> if ( !bsp_stack )
> panic("No memory for BSP stack\n");
I'm pretty wary of this movement, even more so when Arm isn't switched at
the same time. It has (virtually?) always been the case that this state
switch happens _after_ end_boot_allocator(), and I wouldn't be surprised
if there was a dependency on that somewhere. I realize you've been telling
us that at Amazon you've been running with an earlier variant of these
changes for a long time, and your not having hit issues with this is a good
sign. But I'm afraid it's not proof.
As to possible alternatives - as pointed out by Roger, the comment / patch
description isn't entirely clear as to what exactly needs working around.
One possibility might be to introduce an x86-only boolean controlling the
point from which the heap allocator is used for page table allocations,
thus decoupling that from system_state.
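To illustrate the idea of such a boolean, here is a stand-alone sketch, not
Xen code. All names in it (pt_alloc_from_heap, pt_alloc_source(),
start_of_day(), the two-value enum) are made up for illustration, and the
real allocation paths are reduced to a single decision function. The only
point is that the page table allocation source follows a dedicated flag,
while system_state keeps changing where it always has.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the real state enum (illustration only). */
enum sys_state { SYS_STATE_early_boot, SYS_STATE_boot };

static enum sys_state system_state = SYS_STATE_early_boot;

/*
 * Hypothetical x86-only flag: once set, page table pages are taken from
 * the heap allocator rather than the boot allocator.
 */
static bool pt_alloc_from_heap;

enum pt_source { PT_FROM_BOOT_ALLOCATOR, PT_FROM_HEAP };

/* Stand-in for the decision made when a page table page is needed. */
static enum pt_source pt_alloc_source(void)
{
    return pt_alloc_from_heap ? PT_FROM_HEAP : PT_FROM_BOOT_ALLOCATOR;
}

/* Stand-in for the relevant part of the boot sequence. */
static void start_of_day(void)
{
    /* Switch page table allocations over before handing memory to the heap. */
    pt_alloc_from_heap = true;

    /* end_boot_allocator() would run here, still in early boot state. */
    assert(system_state == SYS_STATE_early_boot);

    /* The state switch stays where it is today: after end_boot_allocator(). */
    system_state = SYS_STATE_boot;
}
```

Whether a single flag flip like this is sufficient depends on what exactly
needs working around, which is the part that still needs clarifying.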
Jan