
Re: [PATCH v12 03/12] xen/arm: permit non direct-mapped Dom0 construction


  • To: Carlo Nonato <carlo.nonato@xxxxxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Michal Orzel <michal.orzel@xxxxxxx>
  • Date: Mon, 16 Dec 2024 13:08:28 +0100
  • Cc: <andrea.bastoni@xxxxxxxxxxxxxxx>, <marco.solieri@xxxxxxxxxxxxxxx>, "Stefano Stabellini" <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, "Bertrand Marquis" <bertrand.marquis@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Delivery-date: Mon, 16 Dec 2024 12:08:49 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


On 13/12/2024 17:28, Carlo Nonato wrote:
> 
> 
> Cache coloring requires Dom0 not to be direct-mapped because of its
> non-contiguous mapping nature, so allocate_memory() is needed in this case.
> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
> moved allocate_memory() to dom0less-build.c. In order to use it
> in Dom0 construction, bring it back to domain_build.c and declare it in
> domain_build.h.
> 
> Adapt the implementation of allocate_memory() so that it uses the host
> layout when called on the hwdom, via find_unallocated_memory(). Take the
> opportunity to keep the parameter order consistent with
> rangeset_report_ranges() and move the membanks struct after the callback
> function in find_unallocated_memory().
Why? What benefit does this change (that is irrelevant to the goal of this 
patch) provide?
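
(For context, the callback-before-data ordering referenced in the commit message
comes from the rangeset interface; from memory, the prototype in
xen/include/xen/rangeset.h is along these lines, so the exact spelling may
differ slightly:

int rangeset_report_ranges(struct rangeset *r, unsigned long s, unsigned long e,
                           int (*cb)(unsigned long s, unsigned long e, void *),
                           void *ctxt);
)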

> 
> Introduce add_hwdom_free_regions() callback to add hwdom banks in descending
> order.
> 
> Signed-off-by: Carlo Nonato <carlo.nonato@xxxxxxxxxxxxxxx>
> ---
> v12:
> - used the new generic find_unallocated_memory()
> - added add_hwdom_free_regions() callback
> v11:
> - GUEST_RAM_BANKS instead of hardcoding the number of banks in 
> allocate_memory()
> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
> - added a comment in allocate_memory() when skipping small banks
> v10:
> - fixed a compilation bug that happened when dom0less support was disabled
> v9:
> - no changes
> v8:
> - patch adapted to new changes to allocate_memory()
> v7:
> - allocate_memory() now uses the host layout when called on the hwdom
> v6:
> - new patch
> ---
>  xen/arch/arm/dom0less-build.c           |  44 -------
>  xen/arch/arm/domain_build.c             | 164 +++++++++++++++++++++++-
>  xen/arch/arm/include/asm/domain_build.h |   4 +
>  3 files changed, 161 insertions(+), 51 deletions(-)
> 
> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> index d93a85434e..67b1503647 100644
> --- a/xen/arch/arm/dom0less-build.c
> +++ b/xen/arch/arm/dom0less-build.c
> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
>      return ( !dom0found && domUfound );
>  }
> 
> -static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> -{
> -    struct membanks *mem = kernel_info_get_mem(kinfo);
> -    unsigned int i;
> -    paddr_t bank_size;
> -
> -    printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> -           /* Don't want format this as PRIpaddr (16 digit hex) */
> -           (unsigned long)(kinfo->unassigned_mem >> 20), d);
> -
> -    mem->nr_banks = 0;
> -    bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
> -    if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
> -                               bank_size) )
> -        goto fail;
> -
> -    bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
> -    if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
> -                               bank_size) )
> -        goto fail;
> -
> -    if ( kinfo->unassigned_mem )
> -        goto fail;
> -
> -    for( i = 0; i < mem->nr_banks; i++ )
> -    {
> -        printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
> -               d,
> -               i,
> -               mem->bank[i].start,
> -               mem->bank[i].start + mem->bank[i].size,
> -               /* Don't want format this as PRIpaddr (16 digit hex) */
> -               (unsigned long)(mem->bank[i].size >> 20));
> -    }
> -
> -    return;
> -
> -fail:
> -    panic("Failed to allocate requested domain memory."
> -          /* Don't want format this as PRIpaddr (16 digit hex) */
> -          " %ldKB unallocated. Fix the VMs configurations.\n",
> -          (unsigned long)kinfo->unassigned_mem >> 10);
> -}
> -
>  #ifdef CONFIG_VGICV2
>  static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
>  {
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index adf26f2778..bfcff99194 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -2,6 +2,7 @@
>  #include <xen/init.h>
>  #include <xen/compile.h>
>  #include <xen/lib.h>
> +#include <xen/llc-coloring.h>
>  #include <xen/mm.h>
>  #include <xen/param.h>
>  #include <xen/domain_page.h>
> @@ -416,7 +417,6 @@ static void __init allocate_memory_11(struct domain *d,
>      }
>  }
> 
> -#ifdef CONFIG_DOM0LESS_BOOT
>  bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
>                                      alloc_domheap_mem_cb cb, void *extra)
>  {
> @@ -508,7 +508,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
> 
>      return true;
>  }
> -#endif
> 
>  /*
>   * When PCI passthrough is available we want to keep the
> @@ -900,6 +899,52 @@ int __init add_ext_regions(unsigned long s_gfn, unsigned long e_gfn,
>      return 0;
>  }
> 
> +static int __init add_hwdom_free_regions(unsigned long s_gfn,
> +                                         unsigned long e_gfn, void *data)
> +{
> +    struct membanks *free_regions = data;
> +    paddr_t start, size;
> +    paddr_t s = pfn_to_paddr(s_gfn);
> +    paddr_t e = pfn_to_paddr(e_gfn);
> +    unsigned int i, j;
> +
> +    if ( free_regions->nr_banks >= free_regions->max_banks )
> +        return 0;
> +
> +    /*
> +     * Both start and size of the free region should be 2MB aligned to
> +     * potentially allow superpage mapping.
> +     */
> +    start = (s + SZ_2M - 1) & ~(SZ_2M - 1);
> +    if ( start > e )
> +        return 0;
> +
> +    /*
> +     * e is actually "end-1" because it is called by rangeset functions
> +     * which are inclusive of the last address.
> +     */
> +    e += 1;
> +    size = (e - start) & ~(SZ_2M - 1);
> +
> +    /* Find the insert position (descending order). */
> +    for ( i = 0; i < free_regions->nr_banks ; i++)
CODING_STYLE:
for ( i = 0; i < free_regions->nr_banks; i++ )

> +        if ( size > free_regions->bank[i].size )
> +            break;
> +
> +    /* Move the other banks to make space. */
> +    for ( j = free_regions->nr_banks; j > i ; j-- )
> +    {
> +        free_regions->bank[j].start = free_regions->bank[j - 1].start;
> +        free_regions->bank[j].size = free_regions->bank[j - 1].size;
> +    }
> +
> +    free_regions->bank[i].start = start;
> +    free_regions->bank[i].size = size;
> +    free_regions->nr_banks++;
The algorithm looks good. In my head I thought you would use sort() after adding
all the banks, but I'm not sure which solution is more efficient. Probably yours,
and it avoids implementing cmp and swap functions.
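
For the record, what I had in mind was roughly the following, assuming the
sort() helper from xen/sort.h (untested sketch; the cmp/swap prototypes and the
helper names are illustrative and may need adjusting):

static int __init cmp_bank_size(const void *a, const void *b)
{
    const struct membank *ba = a, *bb = b;

    /* Descending order: larger banks sort first. */
    if ( ba->size == bb->size )
        return 0;

    return ba->size < bb->size ? 1 : -1;
}

static void __init swap_bank(void *a, void *b, size_t size)
{
    struct membank tmp = *(struct membank *)a;

    *(struct membank *)a = *(struct membank *)b;
    *(struct membank *)b = tmp;
}

/* Then, after all free regions have been appended unsorted: */
sort(free_regions->bank, free_regions->nr_banks, sizeof(struct membank),
     cmp_bank_size, swap_bank);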

> +
> +    return 0;
> +}
> +
>  /*
>   * Find unused regions of Host address space which can be exposed to domain
>   * using the host memory layout. In order to calculate regions we exclude every
> @@ -908,10 +953,10 @@ int __init add_ext_regions(unsigned long s_gfn, unsigned long e_gfn,
>  static int __init find_unallocated_memory(const struct kernel_info *kinfo,
>                                            const struct membanks *mem_banks[],
>                                            unsigned int nr_mem_banks,
> -                                          struct membanks *free_regions,
>                                            int (*cb)(unsigned long s_gfn,
>                                                      unsigned long e_gfn,
> -                                                    void *data))
> +                                                    void *data),
> +                                          struct membanks *free_regions)
>  {
>      const struct membanks *mem = bootinfo_get_mem();
>      struct rangeset *unalloc_mem;
> @@ -977,6 +1022,108 @@ out:
>      return res;
>  }
> 
> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> +{
> +    struct membanks *mem = kernel_info_get_mem(kinfo);
> +    unsigned int i, nr_banks = GUEST_RAM_BANKS;
> +    struct membanks *hwdom_free_mem = NULL;
> +
> +    printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> +           /* Don't want format this as PRIpaddr (16 digit hex) */
> +           (unsigned long)(kinfo->unassigned_mem >> 20), d);
> +
> +    mem->nr_banks = 0;
> +    /*
> +     * Use host memory layout for hwdom. Only case for this is when LLC coloring
> +     * is enabled.
> +     */
> +    if ( is_hardware_domain(d) )
> +    {
> +        struct membanks *gnttab = xzalloc_flex_struct(struct membanks, bank, 1);
> +        /*
> +         * Exclude the following regions:
> +         * 1) Remove reserved memory
> +         * 2) Grant table assigned to Dom0
Can we not mention 'Dom0'? In the future the hwdom may not necessarily be dom0,
especially since in other places you mention hwdom.

> +         */
> +        const struct membanks *mem_banks[] = {
> +            bootinfo_get_reserved_mem(),
> +            gnttab,
> +        };
> +
> +        ASSERT(llc_coloring_enabled);
Remove this assert. There's nothing LLC-specific here, and this could be reused
in the future to provide a non-1:1 hwdom.

> +
> +        if ( !gnttab )
> +            goto fail;
> +
> +        gnttab->nr_banks = 1;
> +        gnttab->bank[0].start = kinfo->gnttab_start;
> +        gnttab->bank[0].size = kinfo->gnttab_start + kinfo->gnttab_size;
Incorrect. You assign the 'end' address to 'size'. It should simply be:
gnttab->bank[0].size = kinfo->gnttab_size;
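
i.e. the corrected initialization of the bank would then read, in context:

gnttab->nr_banks = 1;
gnttab->bank[0].start = kinfo->gnttab_start;
gnttab->bank[0].size = kinfo->gnttab_size;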

> +
> +        hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
> +                                             NR_MEM_BANKS);
> +        if ( !hwdom_free_mem )
> +            goto fail;
> +
> +        hwdom_free_mem->max_banks = NR_MEM_BANKS;
> +
> +        if ( find_unallocated_memory(kinfo, mem_banks, ARRAY_SIZE(mem_banks),
> +                                     add_hwdom_free_regions, hwdom_free_mem) )
> +            goto fail;
> +
> +        nr_banks = hwdom_free_mem->nr_banks;
> +        xfree(gnttab);
> +    }
> +
> +    for ( i = 0; kinfo->unassigned_mem > 0 && nr_banks > 0; i++, nr_banks-- )
> +    {
> +        paddr_t bank_start, bank_size;
> +
> +        if ( is_hardware_domain(d) )
> +        {
> +            bank_start = hwdom_free_mem->bank[i].start;
> +            bank_size = hwdom_free_mem->bank[i].size;
> +        }
> +        else
> +        {
> +            const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> +            const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
> +
> +            if ( i >= GUEST_RAM_BANKS )
> +                goto fail;
> +
> +            bank_start = bankbase[i];
> +            bank_size = banksize[i];
> +        }
> +
> +        bank_size = MIN(bank_size, kinfo->unassigned_mem);
> +        if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(bank_start), bank_size) )
> +            goto fail;
> +    }
> +
> +    if ( kinfo->unassigned_mem )
> +        goto fail;
> +
> +    for( i = 0; i < mem->nr_banks; i++ )
> +    {
> +        printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
> +               d,
> +               i,
> +               mem->bank[i].start,
> +               mem->bank[i].start + mem->bank[i].size,
> +               /* Don't want format this as PRIpaddr (16 digit hex) */
> +               (unsigned long)(mem->bank[i].size >> 20));
> +    }
> +
> +    xfree(hwdom_free_mem);
> +    return;
> +
> +fail:
> +    panic("Failed to allocate requested domain memory."
> +          /* Don't want format this as PRIpaddr (16 digit hex) */
> +          " %ldKB unallocated. Fix the VMs configurations.\n",
> +          (unsigned long)kinfo->unassigned_mem >> 10);
> +}
> +
>  static int __init handle_pci_range(const struct dt_device_node *dev,
>                                     uint64_t addr, uint64_t len, void *data)
>  {
> @@ -1176,7 +1323,7 @@ static int __init find_host_extended_regions(const struct kernel_info *kinfo,
>      gnttab->bank[0].size = kinfo->gnttab_size;
> 
>      res = find_unallocated_memory(kinfo, mem_banks, ARRAY_SIZE(mem_banks),
> -                                  ext_regions, add_ext_regions);
> +                                  add_ext_regions, ext_regions);
>      xfree(gnttab);
> 
>      return res;
> @@ -1235,7 +1382,7 @@ int __init make_hypervisor_node(struct domain *d,
> 
>          ext_regions->max_banks = NR_MEM_BANKS;
> 
> -        if ( is_domain_direct_mapped(d) )
> +        if ( domain_use_host_layout(d) )
>          {
>              if ( !is_iommu_enabled(d) )
>                  res = find_host_extended_regions(kinfo, ext_regions);
> @@ -2164,8 +2311,11 @@ static int __init construct_dom0(struct domain *d)
>      /* type must be set before allocate_memory */
>      d->arch.type = kinfo.type;
>  #endif
> -    allocate_memory_11(d, &kinfo);
>      find_gnttab_region(d, &kinfo);
This re-ordering should be mentioned in commit msg.

> +    if ( is_domain_direct_mapped(d) )
> +        allocate_memory_11(d, &kinfo);
> +    else
> +        allocate_memory(d, &kinfo);
> 
>      rc = process_shm_chosen(d, &kinfo);
>      if ( rc < 0 )
> diff --git a/xen/arch/arm/include/asm/domain_build.h b/xen/arch/arm/include/asm/domain_build.h
> index e712afbc7f..b0d646e173 100644
> --- a/xen/arch/arm/include/asm/domain_build.h
> +++ b/xen/arch/arm/include/asm/domain_build.h
> @@ -11,6 +11,7 @@ bool allocate_domheap_memory(struct domain *d, paddr_t tot_size,
>                               alloc_domheap_mem_cb cb, void *extra);
>  bool allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
>                            paddr_t tot_size);
> +void allocate_memory(struct domain *d, struct kernel_info *kinfo);
>  int construct_domain(struct domain *d, struct kernel_info *kinfo);
>  int domain_fdt_begin_node(void *fdt, const char *name, uint64_t unit);
>  int make_chosen_node(const struct kernel_info *kinfo);
> @@ -54,6 +55,9 @@ static inline int prepare_acpi(struct domain *d, struct kernel_info *kinfo)
>  int prepare_acpi(struct domain *d, struct kernel_info *kinfo);
>  #endif
> 
> +typedef int (*add_free_regions_fn)(unsigned long s_gfn, unsigned long e_gfn,
> +                                   void *data);
Random code? This typedef is not used anywhere in the patch, so remove it.

With the remarks addressed:
Reviewed-by: Michal Orzel <michal.orzel@xxxxxxx>

~Michal




 

