[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction


  • To: Andrea Bastoni <andrea.bastoni@xxxxxxxxxxxxxxx>, Carlo Nonato <carlo.nonato@xxxxxxxxxxxxxxx>, Julien Grall <julien@xxxxxxx>
  • From: Michal Orzel <michal.orzel@xxxxxxx>
  • Date: Fri, 13 Dec 2024 10:45:59 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=minervasys.tech smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0)
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=8D9BD01jgMhk+GCEvIWAN/G0tU0XeYCD+AiRCx8+xck=; b=gpOnU7xH+iwGkhHHTjVy6ETzzhtyK6gA+mOUO71WRRXZGMB+gbTv1txtPpsOIIjbaxj59OHcKjT4llAlI0bmhRkicP73Acrg+cd9WTrdsywMnHb0brcOOpoMClvJwFdRAYK+5K5/2SX3g8J3Ps+eNSSVfRAkxAymHVu0hIVmwGS3WTp+/eXDjWeuMW8ZZeYNwQeaek/If56coPzX92u8zwST4StL1n0j6SySwUW1glrQS+RMnWs7da0+T9ayHrG6DjEKd0kbNAntqMpmeqvdSybcjpJEGWlB48cdwqOukrcqLtEtB4IpJ2it/4jq0gO7nW33VTbNWipkH/rn+lIGZg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=CMhuoftOMIGukJGWoMU8TfIj0+8l9FWrGaqnGQ5GRzEfJtFLWTz8X2kU6dIA31hc9tUMYnRukwlESbchIwuR5qguTQTUSeqIMBuvp3krUs5glWI0lfyV9S+5qncgmodTOtGFAvmV7qbsdMu10aJ7vSBWgFqHpUn+t4LSKULbiTJB5KWACbCkfXCSyComQe6z+7XHI60zEv6b453tvQxh8MtecgbsvSll6pxXH6xZiSONYCdXcng6B19vtomgLHrBjwHwfFF/APvoPyvs/No++MGclRfQRLz5JBvoUWASYyj8Ve6+Pj23rEdNp3u3ot1EhvGN3GW+/y0pC1ZoMD+YsA==
  • Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>, <marco.solieri@xxxxxxxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Delivery-date: Fri, 13 Dec 2024 09:46:45 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Carlo, Andrea,

On 12/12/2024 19:22, Andrea Bastoni wrote:
> 
> 
> On 12/12/2024 18:48, Carlo Nonato wrote:
>> Hi,
>>
>> On Mon, Dec 9, 2024 at 8:17 PM Julien Grall <julien@xxxxxxx> wrote:
>>>
>>> Hi Michal,
>>>
>>> On 07/12/2024 15:04, Michal Orzel wrote:
>>>>
>>>>
>>>> On 06/12/2024 19:37, Julien Grall wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> Sorry for the late answer.
>>>>>
>>>>> On 05/12/2024 09:40, Michal Orzel wrote:
>>>>>>
>>>>>>
>>>>>> On 02/12/2024 17:59, Carlo Nonato wrote:
>>>>>>>
>>>>>>>
>>>>>>> Cache coloring requires Dom0 not to be direct-mapped because of its non
>>>>>>> contiguous mapping nature, so allocate_memory() is needed in this case.
>>>>>>> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate 
>>>>>>> module")
>>>>>>> moved allocate_memory() in dom0less_build.c. In order to use it
>>>>>>> in Dom0 construction bring it back to domain_build.c and declare it in
>>>>>>> domain_build.h.
>>>>>>>
>>>>>>> Take the opportunity to adapt the implementation of allocate_memory() so
>>>>>>> that it uses the host layout when called on the hwdom, via
>>>>>>> find_unallocated_memory().
>>>>>>>
>>>>>>> Signed-off-by: Carlo Nonato <carlo.nonato@xxxxxxxxxxxxxxx>
>>>>>>> ---
>>>>>>> v11:
>>>>>>> - GUEST_RAM_BANKS instead of hardcoding the number of banks in 
>>>>>>> allocate_memory()
>>>>>>> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
>>>>>>> - added a comment in allocate_memory() when skipping small banks
>>>>>>> v10:
>>>>>>> - fixed a compilation bug that happened when dom0less support was 
>>>>>>> disabled
>>>>>>> v9:
>>>>>>> - no changes
>>>>>>> v8:
>>>>>>> - patch adapted to new changes to allocate_memory()
>>>>>>> v7:
>>>>>>> - allocate_memory() now uses the host layout when called on the hwdom
>>>>>>> v6:
>>>>>>> - new patch
>>>>>>> ---
>>>>>>>    xen/arch/arm/dom0less-build.c           | 44 -----------
>>>>>>>    xen/arch/arm/domain_build.c             | 97 
>>>>>>> ++++++++++++++++++++++++-
>>>>>>>    xen/arch/arm/include/asm/domain_build.h |  1 +
>>>>>>>    3 files changed, 94 insertions(+), 48 deletions(-)
>>>>>>>
>>>>>>> diff --git a/xen/arch/arm/dom0less-build.c 
>>>>>>> b/xen/arch/arm/dom0less-build.c
>>>>>>> index d93a85434e..67b1503647 100644
>>>>>>> --- a/xen/arch/arm/dom0less-build.c
>>>>>>> +++ b/xen/arch/arm/dom0less-build.c
>>>>>>> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
>>>>>>>        return ( !dom0found && domUfound );
>>>>>>>    }
>>>>>>>
>>>>>>> -static void __init allocate_memory(struct domain *d, struct 
>>>>>>> kernel_info *kinfo)
>>>>>>> -{
>>>>>>> -    struct membanks *mem = kernel_info_get_mem(kinfo);
>>>>>>> -    unsigned int i;
>>>>>>> -    paddr_t bank_size;
>>>>>>> -
>>>>>>> -    printk(XENLOG_INFO "Allocating mappings totalling %ldMB for 
>>>>>>> %pd:\n",
>>>>>>> -           /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>>> -           (unsigned long)(kinfo->unassigned_mem >> 20), d);
>>>>>>> -
>>>>>>> -    mem->nr_banks = 0;
>>>>>>> -    bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
>>>>>>> -    if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
>>>>>>> -                               bank_size) )
>>>>>>> -        goto fail;
>>>>>>> -
>>>>>>> -    bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
>>>>>>> -    if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
>>>>>>> -                               bank_size) )
>>>>>>> -        goto fail;
>>>>>>> -
>>>>>>> -    if ( kinfo->unassigned_mem )
>>>>>>> -        goto fail;
>>>>>>> -
>>>>>>> -    for( i = 0; i < mem->nr_banks; i++ )
>>>>>>> -    {
>>>>>>> -        printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" 
>>>>>>> (%ldMB)\n",
>>>>>>> -               d,
>>>>>>> -               i,
>>>>>>> -               mem->bank[i].start,
>>>>>>> -               mem->bank[i].start + mem->bank[i].size,
>>>>>>> -               /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>>> -               (unsigned long)(mem->bank[i].size >> 20));
>>>>>>> -    }
>>>>>>> -
>>>>>>> -    return;
>>>>>>> -
>>>>>>> -fail:
>>>>>>> -    panic("Failed to allocate requested domain memory."
>>>>>>> -          /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>>> -          " %ldKB unallocated. Fix the VMs configurations.\n",
>>>>>>> -          (unsigned long)kinfo->unassigned_mem >> 10);
>>>>>>> -}
>>>>>>> -
>>>>>>>    #ifdef CONFIG_VGICV2
>>>>>>>    static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
>>>>>>>    {
>>>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>>>> index 2c30792de8..2b8cba9b2f 100644
>>>>>>> --- a/xen/arch/arm/domain_build.c
>>>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>>>> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain 
>>>>>>> *d,
>>>>>>>        }
>>>>>>>    }
>>>>>>>
>>>>>>> -#ifdef CONFIG_DOM0LESS_BOOT
>>>>>>>    bool __init allocate_domheap_memory(struct domain *d, paddr_t 
>>>>>>> tot_size,
>>>>>>>                                        alloc_domheap_mem_cb cb, void 
>>>>>>> *extra)
>>>>>>>    {
>>>>>>> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info 
>>>>>>> *kinfo, gfn_t sgfn,
>>>>>>>
>>>>>>>        return true;
>>>>>>>    }
>>>>>>> -#endif
>>>>>>>
>>>>>>>    /*
>>>>>>>     * When PCI passthrough is available we want to keep the
>>>>>>> @@ -1003,6 +1001,94 @@ out:
>>>>>>>        return res;
>>>>>>>    }
>>>>>>>
>>>>>>> +void __init allocate_memory(struct domain *d, struct kernel_info 
>>>>>>> *kinfo)
>>>>>>> +{
>>>>>>> +    struct membanks *mem = kernel_info_get_mem(kinfo);
>>>>>>> +    unsigned int i, nr_banks = GUEST_RAM_BANKS;
>>>>>>> +    paddr_t bank_start, bank_size;
>>>>>> Limit the scope
>>>>>>
>>>>>>> +    struct membanks *hwdom_free_mem = NULL;
>>>>>>> +    const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>>>>>>> +    const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
>>>>>> Limit the scope
>>>>>>
>>>>>>> +
>>>>>>> +    printk(XENLOG_INFO "Allocating mappings totalling %ldMB for 
>>>>>>> %pd:\n",
>>>>>>> +           /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>>> +           (unsigned long)(kinfo->unassigned_mem >> 20), d);
>>>>>>> +
>>>>>>> +    mem->nr_banks = 0;
>>>>>>> +    /*
>>>>>>> +     * Use host memory layout for hwdom. Only case for this is when 
>>>>>>> LLC coloring
>>>>>>> +     * is enabled.
>>>>>>> +     */
>>>>>>> +    if ( is_hardware_domain(d) )
>>>>>>> +    {
>>>>>>> +        ASSERT(llc_coloring_enabled);
>>>>>> This patch does not build because of declaration not being visible. You 
>>>>>> must include <xen/llc-coloring.h>.
>>>>>
>>>>> Piggying back on this comment. AFAICT, the code below would work also in
>>>>> the non cache coloring case. So what's the assert is for?
>>>>>
>>>>>>
>>>>>>> +
>>>>>>> +        hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
>>>>>>> +                                             NR_MEM_BANKS);
>>>>>>> +        if ( !hwdom_free_mem )
>>>>>>> +            goto fail;
>>>>>>> +
>>>>>>> +        hwdom_free_mem->max_banks = NR_MEM_BANKS;
>>>>>>> +
>>>>>>> +        if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
>>>>>> My remarks for the use of find_unallocated_memory() 1:1 have not been 
>>>>>> addressed. You did not even
>>>>>> change the comments inside the function. The problem is that the 
>>>>>> function is specifically designed
>>>>>> for finding extended regions and assumes being called at certain point 
>>>>>> i.e. dom0 RAM allocated, gnttab
>>>>>> region allocated, etc.
>>>>>
>>>>> So I agree that the function should be updated if we plan to use it for
>>>>> other purpose.
>>>>>
>>>>> My opinion is that we should attempt to make the function generic so
>>>>> that in your
>>>>>> case you can choose which regions to exclude, define even your own 
>>>>>> function to grab free regions (at the moment
>>>>>> add_ext_regions grabs banks >= 64M but you still discards banks >= 128M, 
>>>>>> so it's a bit wasteful.
>>>>>>
>>>>>> My very short attempt to make the function as generic as possible in the 
>>>>>> first iteration:
>>>>>> https://paste.debian.net/1338334/
>>>>>
>>>>> This looks better, but I wonder why we need still need to exclude the
>>>>> static regions? Wouldn't it be sufficient to exclude just reserved 
>>>>> regions?
>>>> Static shared memory banks are not part of reserved memory (i.e. 
>>>> bootinfo.reserved_mem) if that's what you're asking.
>>>> They are stored in bootinfo.shmem, hence we need to take them into account 
>>>> when searching for unused address space.
>>>
>>> Oh I missed the fact you now pass "mem_banks" as a parameter. I thought
>>> they would still get excluded for cache coloring case.
>>>
>>>>
>>>> If you and Carlo are ok with my proposed solution for making the function 
>>>> generic, I can send a patch as a prerequisite
>>>> patch for Carlo series.
>>>
>>> I am fine with the approach.
>>>
>>> Cheers,
>>>
>>> --
>>> Julien Grall
>>>
>>
>>> @@ -2152,7 +2238,10 @@ static int __init construct_dom0(struct domain *d)
>>>      /* type must be set before allocate_memory */
>>>      d->arch.type = kinfo.type;
>>>  #endif
>>> -    allocate_memory_11(d, &kinfo);
>>> +    if ( is_domain_direct_mapped(d) )
>>> +        allocate_memory_11(d, &kinfo);
>>> +    else
>>> +        allocate_memory(d, &kinfo);
>>>      find_gnttab_region(d, &kinfo);
>>
>> Since find_gnttab_region() is called after allocate_memory(), kinfo->gnttab_*
>> fields aren't initialized and the call to find_unallocated_memory() with
>> gnttab as the region to exclude, fails ending in a crash since memory for
>> dom0 can't be allocated.
>>
>> Can the solution be to call find_gnttab_region() before the above if?
> 
> The function is called find, but currently it only initializes 
> kinfo->gnttab_start
> and kinfo->gnttab_size and we tested that moving it before allocate_memory* 
> doesn't
> cause fallouts.
> 
> If moving before allocate_memory*, would it be better to rename it e.g., 
> init_gnttab_region()?
> 
> Thanks,
> Andrea
> 
>> Or should I just call it before allocate_memory() in one case, but still 
>> after
>> allocate_memory_11() in the other?
>>
>> Thanks.
> 

AFAICT there is nothing stopping us from moving find_gnttab_region() before 
allocate_*. This function initializes
gnttab region with PA of Xen. In normal case, because Xen is added as 
bootmodule, it will never be mapped in dom0 memory map
and the placement does not matter. In LLC case, it will point to relocated 
address of Xen and it needs to be known before
calling find_unallocated_memory. Don't rename it, leave as is, just move before 
allocate_*.

@Carlo:
My prerequisite patch has been merged, so you're good to respin a series 
(unless you wait for some feedback in which case do let me know).
To prevent too many respins, you're going to call find_unallocated_memory for 
LLC passing resmem and gnttab to be excluded. If you're going
to reuse add_ext_regions, you need to rename it and fix comments to make it 
more generic. As for the size, the decision is yours. One solution
would be to modify add_ext_regions to take min bank size as parameter (64MB for 
extended regions, X for LLC dom0). In your code, you write that
the first bank must contain dom0, dtb, ramdisk and you chose 128MB. However, 
looking at the code, you seem to discard banks < 128 for all the banks,
not only for the first one. This is the part that I don't have a ready 
solution. Maybe you could define your own add_free_region function and sort
the banks, so that you take the largest possible bank first for dom0. This 
could simplify things.

You can also ask others for opinion.

We are approaching Dec 20th deadline, and I want this series to be in as it's 
been on the list for too many years. I'm willing to accept a sub-optimal 
solution
(so far will be used only for LLC, and LLC as experimental feature will be the 
only victim of not optimal algorithm) for now, and we can think of a better one
after the release. But still, even the sub-optimal solution must make sense.

~Michal




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.