[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/arm: skip holes in physical address space when setting up frametable


  • To: Julien Grall <julien@xxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: "Orzel, Michal" <michal.orzel@xxxxxxx>
  • Date: Thu, 30 Apr 2026 10:42:57 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=xen.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0)
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=+JHC7x+39OR/WKJ+HCK0KVv7pkbu7w3ghxIhbpjgLIM=; b=ECos6BCeQ2KwmWZ76bEE/scwyZZWxYE7vHe1O8dcVmqbYasuy5fIrHGdyoDBu0R1LRZCsfr3KPVi6g87Bm3THfrrIlwk8CriiWf89zmnC3ohh440KYQzw8SVNIXM7/VcK607U7G28K3JBFkTsUuSXAReuhJqOqHAqedPV1kso8piQ0cNxkgqj/4tK/MRY4aQGa22RCoo4caIxxJO0YPV4VJilZaM9N50TaFyM9OIRHwBbuAD1AdVNUghM7lpzSJBlXy64/pIc3ZKpio4HY+fTsi+tdnAY1QGjntAmC8ChtWzDnjg8Q15HdjPjiHLkGgUf8GfD7VWFtVf/j144EFjRQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=VOZkYm9qeKj6mBNutGeywT3/8qWLOVq/33wiIN9oCdKKBf6FO0MKNCx2w3NVYRd/gxk4jHKZWJynMIpS6NELsdrhD4SpPtJc/X8vkDZoYHXOa7hJiChgLzqloCRqdpFYeKqJkgA42qCvM17WqimMdDhiFhajca3VAKP8w67ugXVw6C1j5hcs0LTYB0wNmt6TVfG7+7eqv5Hf+h9NdOskBTOnZgEWMosUXA2KzuU7NvL6vEmxiNIqzx0oS2bFLaWIeSsvPQV3cEue3qSBxUreA4NWoMCeXq9NEBxR4t84E3mfrwuZe9l8uF50wSljNkmwMmWGNiJuj36LiYRdkNhHTA==
  • Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=selector1 header.d=amd.com header.i="@amd.com" header.h="From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck"
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Luca Fancellu <Luca.Fancellu@xxxxxxx>
  • Delivery-date: Thu, 30 Apr 2026 08:43:15 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


On 29-Apr-26 16:42, Julien Grall wrote:
> Hi Michal,
> 
> On 22/04/2026 09:25, Orzel, Michal wrote:
>>
>>
>> On 22/04/2026 09:01, Julien Grall wrote:
>>> Hi Michal,
>>>
>>> On 17/04/2026 10:11, Michal Orzel wrote:
>>>> Refactor setup_frametable_mappings() into init_frametable(), modeled
>>>> after x86's implementation. Instead of mapping one contiguous frametable
>>>> covering ram_start to ram_end (including holes), iterate the
>>>> pdx_group_valid bitmap to allocate and map frametable memory only for
>>>> valid PDX groups, skipping gaps in the physical address space. At the
>>>> moment we don't really take into account pdx_group_valid bitmap.
>>>>
>>>> This reduces memory consumption on systems with sparse RAM layouts by
>>>> not allocating frametable entries for non-existent memory regions.
>>>>
>>>> A file-local pdx_to_page() override is needed because the generic macro
>>>> in xen/include/xen/pdx.h does not account for ARM's non-zero
>>>> frametable_base_pdx.
>>>
>>> Can you provide a bit more details? I am a bit concerned that this could
>>> result to subttle bug in the future if code within mm.c is expecting the
>>> original behavior. It would be preferable if the change is either for
>>> everyone on Arm or the function is renamed to avoid any clash.
>> The generic pdx_to_page macro does not account for offset which is something 
>> I
>> mentioned in the footer and I'm willing to work on in the future.
> 
> Sorry I missed the comment in the footer. But if the function is broken, 
> then why can't we implement pdx_to_page() correctly now? I understand 
> that ...
I wanted to do this in the future but ok, will do in v2.

> 
>   As of today,
>> this macro is *unused* on Arm. It's only used by x86 in some special big mem
>> related scenario. Using generic pdx_to_page on Arm would be wrong, so a 
>> future
>> patch doing that would be wrong (the fact that this patch adds a local 
>> redefine
>> does not change anything). Do we need a rename for a local redefine in a file
>> that is only related to frametable? Maybe a comment and a TODO would be ok?
> 
> ... this is not meant to be used by Arm today. But given this is used in 
> the page list, it is definitely not obvious that it is broken.
> 
> The alternative is to protect/move pdx_to_page() in x86. But I don't 
> know much churn this would involve.
> 
>>
>>>
>>> [...]
>>>
>>>> +void __init init_frametable(paddr_t ram_start)
>>>> +{
>>>> +    unsigned int sidx, nidx, max_idx;
>>>>    
>>>>        /*
>>>>         * The size of paddr_t should be sufficient for the complete range 
>>>> of
>>>> @@ -26,24 +47,34 @@ void __init setup_frametable_mappings(paddr_t ps, 
>>>> paddr_t pe)
>>>>        BUILD_BUG_ON((sizeof(paddr_t) * BITS_PER_BYTE) < PADDR_BITS);
>>>>        BUILD_BUG_ON(sizeof(struct page_info) != PAGE_INFO_SIZE);
>>>>    
>>>> -    if ( frametable_size > FRAMETABLE_SIZE )
>>>> -        panic("The frametable cannot cover the physical region 
>>>> %#"PRIpaddr" - %#"PRIpaddr"\n",
>>>> -              ps, pe);
>>>> +    max_idx = DIV_ROUND_UP(max_pdx, PDX_GROUP_COUNT);
>>>> +    frametable_base_pdx = mfn_to_pdx(maddr_to_mfn(ram_start));
>>>>    
>>>> -    frametable_base_pdx = mfn_to_pdx(maddr_to_mfn(ps));
>>>> -    /* Round up to 2M or 32M boundary, as appropriate. */
>>>> -    frametable_size = ROUNDUP(frametable_size, mapping_size);
>>>> -    base_mfn = alloc_boot_pages(frametable_size >> PAGE_SHIFT, 
>>>> 32<<(20-12));
>>>> +    /*
>>>> +     * pdx_to_page(pdx_s) in init_frametable_chunk must be page-aligned
>>>> +     * for map_pages_to_xen(). Aligning to PDX_GROUP_COUNT guarantees this
>>>> +     * because PDX_GROUP_COUNT * sizeof(page_info) is always a multiple of
>>>> +     * PAGE_SIZE by construction.
>>>> +     */
>>>> +    frametable_base_pdx = ROUNDDOWN(frametable_base_pdx, PDX_GROUP_COUNT);
>>>>    
>>>> -    rc = map_pages_to_xen(FRAMETABLE_VIRT_START, base_mfn,
>>>> -                          frametable_size >> PAGE_SHIFT,
>>>> -                          PAGE_HYPERVISOR_RW | _PAGE_BLOCK);
>>>> -    if ( rc )
>>>> -        panic("Unable to setup the frametable mappings.\n");
>>>> +    if ( (max_pdx - frametable_base_pdx) > FRAMETABLE_NR )
>>>> +        panic("Frametable too small\n");
>>>> +
>>>> +    for ( sidx = (frametable_base_pdx / PDX_GROUP_COUNT); ; sidx = nidx )
>>>> +    {
>>>> +        unsigned int eidx;
>>>> +
>>>> +        eidx = find_next_zero_bit(pdx_group_valid, max_idx, sidx);
>>>> +        nidx = find_next_bit(pdx_group_valid, max_idx, eidx);
>>>> +
>>>> +        if ( nidx >= max_idx )
>>>> +            break;
>>>> +
>>>> +        init_frametable_chunk(sidx * PDX_GROUP_COUNT, eidx * 
>>>> PDX_GROUP_COUNT);
>>>
>>> The function will do a round-up the mapping to either a 2MiB or 32MiB
>>> aligned size. This means we could potentially cover the previous mapped
>>> region or the next one. I can't seem to find any code to cover this
>>> use-case. What did I miss?
>> Hmm, I think I calculated something wrong here. Anyway, how about using 2MB
>> mapping size all the time? PDX group size is 2MB,
> 
> Looking at the code, it seems to be based on SECOND_SHIFT which 
> technically depends on the page granularity. Even though Xen supports 
> only 4KiB, we are trying to avoid making such assumption or add least 
> adding a BUILD_BUG_ON() (in this case, I would consider that 
> PDX_GROUP_COUNT is always 2MiB or SECOND_SHIFT).
> 
>> in-loop chunks are multiple of
>> 2MB, there is no roundup needed - zero overshoot. The last chunk may have 
>> ~2MB
>> overshoot but it does not matter as there is nothing after it to conflict 
>> with.
>> The downside is more TLB pressure.
> 
> I am a bit warry to modify the frametable allocation method because it 
> is used fairly often in Xen. Would it be possible to hence the loop to 
> detect contiguous chunk and decide the size allocation based on the chunk?
To eliminate overshoot I'll round up to page size. In-loop chunks are already
page aligned so the round up is a no op except for the last chunk.

What we care about is to reduce TLB pressure by setting wherever possible
contiguous bit. It's not lost because map_pages_to_xen with block flag handles
contiguous internally via xen_pt_check_contig. Will do in v2.

~Michal




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.