[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
Hi Michal, On Fri, Dec 13, 2024 at 10:46 AM Michal Orzel <michal.orzel@xxxxxxx> wrote: > > Hi Carlo, Andrea, > > On 12/12/2024 19:22, Andrea Bastoni wrote: > > > > > > On 12/12/2024 18:48, Carlo Nonato wrote: > >> Hi, > >> > >> On Mon, Dec 9, 2024 at 8:17 PM Julien Grall <julien@xxxxxxx> wrote: > >>> > >>> Hi Michal, > >>> > >>> On 07/12/2024 15:04, Michal Orzel wrote: > >>>> > >>>> > >>>> On 06/12/2024 19:37, Julien Grall wrote: > >>>>> > >>>>> > >>>>> Hi, > >>>>> > >>>>> Sorry for the late answer. > >>>>> > >>>>> On 05/12/2024 09:40, Michal Orzel wrote: > >>>>>> > >>>>>> > >>>>>> On 02/12/2024 17:59, Carlo Nonato wrote: > >>>>>>> > >>>>>>> > >>>>>>> Cache coloring requires Dom0 not to be direct-mapped because of its > >>>>>>> non > >>>>>>> contiguous mapping nature, so allocate_memory() is needed in this > >>>>>>> case. > >>>>>>> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate > >>>>>>> module") > >>>>>>> moved allocate_memory() in dom0less_build.c. In order to use it > >>>>>>> in Dom0 construction bring it back to domain_build.c and declare it in > >>>>>>> domain_build.h. > >>>>>>> > >>>>>>> Take the opportunity to adapt the implementation of allocate_memory() > >>>>>>> so > >>>>>>> that it uses the host layout when called on the hwdom, via > >>>>>>> find_unallocated_memory(). > >>>>>>> > >>>>>>> Signed-off-by: Carlo Nonato <carlo.nonato@xxxxxxxxxxxxxxx> > >>>>>>> --- > >>>>>>> v11: > >>>>>>> - GUEST_RAM_BANKS instead of hardcoding the number of banks in > >>>>>>> allocate_memory() > >>>>>>> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory() > >>>>>>> - added a comment in allocate_memory() when skipping small banks > >>>>>>> v10: > >>>>>>> - fixed a compilation bug that happened when dom0less support was > >>>>>>> disabled > >>>>>>> v9: > >>>>>>> - no changes > >>>>>>> v8: > >>>>>>> - patch adapted to new changes to allocate_memory() > >>>>>>> v7: > >>>>>>> - allocate_memory() now uses the host layout when called on the hwdom > >>>>>>> v6: > >>>>>>> - new patch > >>>>>>> --- > >>>>>>> xen/arch/arm/dom0less-build.c | 44 ----------- > >>>>>>> xen/arch/arm/domain_build.c | 97 > >>>>>>> ++++++++++++++++++++++++- > >>>>>>> xen/arch/arm/include/asm/domain_build.h | 1 + > >>>>>>> 3 files changed, 94 insertions(+), 48 deletions(-) > >>>>>>> > >>>>>>> diff --git a/xen/arch/arm/dom0less-build.c > >>>>>>> b/xen/arch/arm/dom0less-build.c > >>>>>>> index d93a85434e..67b1503647 100644 > >>>>>>> --- a/xen/arch/arm/dom0less-build.c > >>>>>>> +++ b/xen/arch/arm/dom0less-build.c > >>>>>>> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void) > >>>>>>> return ( !dom0found && domUfound ); > >>>>>>> } > >>>>>>> > >>>>>>> -static void __init allocate_memory(struct domain *d, struct > >>>>>>> kernel_info *kinfo) > >>>>>>> -{ > >>>>>>> - struct membanks *mem = kernel_info_get_mem(kinfo); > >>>>>>> - unsigned int i; > >>>>>>> - paddr_t bank_size; > >>>>>>> - > >>>>>>> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for > >>>>>>> %pd:\n", > >>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */ > >>>>>>> - (unsigned long)(kinfo->unassigned_mem >> 20), d); > >>>>>>> - > >>>>>>> - mem->nr_banks = 0; > >>>>>>> - bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem); > >>>>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE), > >>>>>>> - bank_size) ) > >>>>>>> - goto fail; > >>>>>>> - > >>>>>>> - bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem); > >>>>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE), > >>>>>>> - bank_size) ) > >>>>>>> - goto fail; > >>>>>>> - > >>>>>>> - if ( kinfo->unassigned_mem ) > >>>>>>> - goto fail; > >>>>>>> - > >>>>>>> - for( i = 0; i < mem->nr_banks; i++ ) > >>>>>>> - { > >>>>>>> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" > >>>>>>> (%ldMB)\n", > >>>>>>> - d, > >>>>>>> - i, > >>>>>>> - mem->bank[i].start, > >>>>>>> - mem->bank[i].start + mem->bank[i].size, > >>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) > >>>>>>> */ > >>>>>>> - (unsigned long)(mem->bank[i].size >> 20)); > >>>>>>> - } > >>>>>>> - > >>>>>>> - return; > >>>>>>> - > >>>>>>> -fail: > >>>>>>> - panic("Failed to allocate requested domain memory." > >>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */ > >>>>>>> - " %ldKB unallocated. Fix the VMs configurations.\n", > >>>>>>> - (unsigned long)kinfo->unassigned_mem >> 10); > >>>>>>> -} > >>>>>>> - > >>>>>>> #ifdef CONFIG_VGICV2 > >>>>>>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo) > >>>>>>> { > >>>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c > >>>>>>> index 2c30792de8..2b8cba9b2f 100644 > >>>>>>> --- a/xen/arch/arm/domain_build.c > >>>>>>> +++ b/xen/arch/arm/domain_build.c > >>>>>>> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct > >>>>>>> domain *d, > >>>>>>> } > >>>>>>> } > >>>>>>> > >>>>>>> -#ifdef CONFIG_DOM0LESS_BOOT > >>>>>>> bool __init allocate_domheap_memory(struct domain *d, paddr_t > >>>>>>> tot_size, > >>>>>>> alloc_domheap_mem_cb cb, void > >>>>>>> *extra) > >>>>>>> { > >>>>>>> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct > >>>>>>> kernel_info *kinfo, gfn_t sgfn, > >>>>>>> > >>>>>>> return true; > >>>>>>> } > >>>>>>> -#endif > >>>>>>> > >>>>>>> /* > >>>>>>> * When PCI passthrough is available we want to keep the > >>>>>>> @@ -1003,6 +1001,94 @@ out: > >>>>>>> return res; > >>>>>>> } > >>>>>>> > >>>>>>> +void __init allocate_memory(struct domain *d, struct kernel_info > >>>>>>> *kinfo) > >>>>>>> +{ > >>>>>>> + struct membanks *mem = kernel_info_get_mem(kinfo); > >>>>>>> + unsigned int i, nr_banks = GUEST_RAM_BANKS; > >>>>>>> + paddr_t bank_start, bank_size; > >>>>>> Limit the scope > >>>>>> > >>>>>>> + struct membanks *hwdom_free_mem = NULL; > >>>>>>> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES; > >>>>>>> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES; > >>>>>> Limit the scope > >>>>>> > >>>>>>> + > >>>>>>> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for > >>>>>>> %pd:\n", > >>>>>>> + /* Don't want format this as PRIpaddr (16 digit hex) */ > >>>>>>> + (unsigned long)(kinfo->unassigned_mem >> 20), d); > >>>>>>> + > >>>>>>> + mem->nr_banks = 0; > >>>>>>> + /* > >>>>>>> + * Use host memory layout for hwdom. Only case for this is when > >>>>>>> LLC coloring > >>>>>>> + * is enabled. > >>>>>>> + */ > >>>>>>> + if ( is_hardware_domain(d) ) > >>>>>>> + { > >>>>>>> + ASSERT(llc_coloring_enabled); > >>>>>> This patch does not build because of declaration not being visible. > >>>>>> You must include <xen/llc-coloring.h>. > >>>>> > >>>>> Piggying back on this comment. AFAICT, the code below would work also in > >>>>> the non cache coloring case. So what's the assert is for? > >>>>> > >>>>>> > >>>>>>> + > >>>>>>> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank, > >>>>>>> + NR_MEM_BANKS); > >>>>>>> + if ( !hwdom_free_mem ) > >>>>>>> + goto fail; > >>>>>>> + > >>>>>>> + hwdom_free_mem->max_banks = NR_MEM_BANKS; > >>>>>>> + > >>>>>>> + if ( find_unallocated_memory(kinfo, hwdom_free_mem) ) > >>>>>> My remarks for the use of find_unallocated_memory() 1:1 have not been > >>>>>> addressed. You did not even > >>>>>> change the comments inside the function. The problem is that the > >>>>>> function is specifically designed > >>>>>> for finding extended regions and assumes being called at certain point > >>>>>> i.e. dom0 RAM allocated, gnttab > >>>>>> region allocated, etc. > >>>>> > >>>>> So I agree that the function should be updated if we plan to use it for > >>>>> other purpose. > >>>>> > >>>>> My opinion is that we should attempt to make the function generic so > >>>>> that in your > >>>>>> case you can choose which regions to exclude, define even your own > >>>>>> function to grab free regions (at the moment > >>>>>> add_ext_regions grabs banks >= 64M but you still discards banks >= > >>>>>> 128M, so it's a bit wasteful. > >>>>>> > >>>>>> My very short attempt to make the function as generic as possible in > >>>>>> the first iteration: > >>>>>> https://paste.debian.net/1338334/ > >>>>> > >>>>> This looks better, but I wonder why we need still need to exclude the > >>>>> static regions? Wouldn't it be sufficient to exclude just reserved > >>>>> regions? > >>>> Static shared memory banks are not part of reserved memory (i.e. > >>>> bootinfo.reserved_mem) if that's what you're asking. > >>>> They are stored in bootinfo.shmem, hence we need to take them into > >>>> account when searching for unused address space. > >>> > >>> Oh I missed the fact you now pass "mem_banks" as a parameter. I thought > >>> they would still get excluded for cache coloring case. > >>> > >>>> > >>>> If you and Carlo are ok with my proposed solution for making the > >>>> function generic, I can send a patch as a prerequisite > >>>> patch for Carlo series. > >>> > >>> I am fine with the approach. > >>> > >>> Cheers, > >>> > >>> -- > >>> Julien Grall > >>> > >> > >>> @@ -2152,7 +2238,10 @@ static int __init construct_dom0(struct domain *d) > >>> /* type must be set before allocate_memory */ > >>> d->arch.type = kinfo.type; > >>> #endif > >>> - allocate_memory_11(d, &kinfo); > >>> + if ( is_domain_direct_mapped(d) ) > >>> + allocate_memory_11(d, &kinfo); > >>> + else > >>> + allocate_memory(d, &kinfo); > >>> find_gnttab_region(d, &kinfo); > >> > >> Since find_gnttab_region() is called after allocate_memory(), > >> kinfo->gnttab_* > >> fields aren't initialized and the call to find_unallocated_memory() with > >> gnttab as the region to exclude, fails ending in a crash since memory for > >> dom0 can't be allocated. > >> > >> Can the solution be to call find_gnttab_region() before the above if? > > > > The function is called find, but currently it only initializes > > kinfo->gnttab_start > > and kinfo->gnttab_size and we tested that moving it before allocate_memory* > > doesn't > > cause fallouts. > > > > If moving before allocate_memory*, would it be better to rename it e.g., > > init_gnttab_region()? > > > > Thanks, > > Andrea > > > >> Or should I just call it before allocate_memory() in one case, but still > >> after > >> allocate_memory_11() in the other? > >> > >> Thanks. > > > > AFAICT there is nothing stopping us from moving find_gnttab_region() before > allocate_*. This function initializes > gnttab region with PA of Xen. In normal case, because Xen is added as > bootmodule, it will never be mapped in dom0 memory map > and the placement does not matter. In LLC case, it will point to relocated > address of Xen and it needs to be known before > calling find_unallocated_memory. Don't rename it, leave as is, just move > before allocate_*. > > @Carlo: > My prerequisite patch has been merged, so you're good to respin a series > (unless you wait for some feedback in which case do let me know). > To prevent too many respins, you're going to call find_unallocated_memory for > LLC passing resmem and gnttab to be excluded. If you're going > to reuse add_ext_regions, you need to rename it and fix comments to make it > more generic. As for the size, the decision is yours. One solution > would be to modify add_ext_regions to take min bank size as parameter (64MB > for extended regions, X for LLC dom0). In your code, you write that > the first bank must contain dom0, dtb, ramdisk and you chose 128MB. However, > looking at the code, you seem to discard banks < 128 for all the banks, > not only for the first one. This is the part that I don't have a ready > solution. Maybe you could define your own add_free_region function and sort > the banks, so that you take the largest possible bank first for dom0. This > could simplify things. For the moment I added a __add_ext_regions() helper that also takes a skip_size parameter. This is called by add_ext_regions() and by a new add_hwdom_free_regions() callback used in allocate_memory(). I still use 128MB for all the banks. Do you think this is acceptable, maybe with a FIXME comment cause we should skip only the first bank, or not? > You can also ask others for opinion. > > We are approaching Dec 20th deadline, and I want this series to be in as it's > been on the list for too many years. I'm willing to accept a sub-optimal > solution > (so far will be used only for LLC, and LLC as experimental feature will be > the only victim of not optimal algorithm) for now, and we can think of a > better one > after the release. But still, even the sub-optimal solution must make sense. > > ~Michal > Thanks.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |