[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Minios-devel] [PATCH 29/40] arm64: init the memory system before console/xenbus/gic



On Wed, Nov 08, 2017 at 10:38:09AM +0000, Julien Grall wrote:
> Hi Shijie,
> 
> On 08/11/17 08:54, Huang Shijie wrote:
> > On Mon, Nov 06, 2017 at 05:06:58PM +0000, Julien Grall wrote:
> >> Hi Shijie,
> >>
> >> On 03/11/17 03:12, Huang Shijie wrote:
> >>> The mappings for console/xenbus/gic need several pages to setup
> >>> the page tables.
> >>>
> >>> So we init the memory system before initiating the console/xenbus/gic,
> >>> and then we can allocate the pages which are needed by
> >>> the console/xenbus/gic mapping.
> >>>
> >>> we use the 2M memory block for the memory.
> >>
> >> Why 2MB and not 1GB mapping when possible? Also, what if the guest is using
> > 2MB is a proper size for most of the memory. For example, we can use 2M
> > for 512MB, it is not proper to use 1GB mapping.
> 
> And you can be clever and mixing 2MB/1GB mapping depending on what you need 
> to do.
Yes, we can. but the code looks a little ugly.

> 
> >> less than 2MB of memory?
> > Please increase the guest memory, even the mini-os itself is nearly 2MB.
The mini-os is 169K, we can use one page entry (which is 2M) to mapping
it.

> 
> Really? I can't build mini-os arm64 due to compilation issue:
> 
> In file included from dtc/libfdt/fdt.c:54:0:
> mini-os/include/libfdt.h:54:24: fatal error: libfdt_env.h: No such file or 
> directory
>  #include <libfdt_env.h>
The DTC is updated recently, so I also meet this issue today.

>                         ^
> But building x86, the resulting image is only 320K. How come the arm64 image
> is nearly 10 times bigger?
My memory is wrong..

Thanks
Huang Shijie
> >>> +
> >>> +static void build_pud(lpae_t *pgd, unsigned long vaddr, unsigned long 
> >>> vend,
> >>
> >> Is vend included or excluded from the range mapped?
> > excluded.
> >>
> >>> +                      paddr_t phys, long mem_type,
> >>
> >> Why the mem_type is long?
> > I like the long type. :)
> > Why use other type?
> 
> Whether you like or not mem_type, as the name says, is a type. Type are not 
> signed.
> The way you use it all sort of problem arise when you use shift or the value
> needs to be converted to a bigger type.
> 
> Now, you or it in the page-table. This means you expect that value to contain 
> some
> information for the LPAE entry. An entry is 64-bit. So it should be uint64_t.
okay, I can change it to uint64_t..

> 
> > 
> >>
> >>> +                      paddr_t (*new_page)(void), int level)
> >>> +{
> >>> +    lpae_t *pud;
> >>> +
> >>> +    pud = (lpae_t *)to_virt((*pgd) & ~ATTR_MASK_L) + l2_pgt_idx(vaddr);
> >>> +    do {
> >>> +        if (level == 2)
> >>> +             set_pgt_entry(pud, (phys & L2_MASK) | mem_type | L2_BLOCK);
> >>> +        vaddr += L2_SIZE;
> >>> +        phys += L2_SIZE;
> >>> +        pud++;
> >>
> >> I am not sure to understand why you handle the third level of page-table in
> > The #30 is for page-size mapping, this one is for block mapping.
> > 
> > The two patches are used for different purpose.
> 
> While the purpose of using 4KB and 2MB mappings may be different in the 
> Mini-OS
> context. Page-table are just page-table. You have helpers to build them and 
> they
> will not changed. If they need, then you clearly did something wrong.
> 
> Furthermore, as I already said one patch to handle all levels of page-table is
> just easier to review.
okay, I can merge the page table operations into one patch..

> 
> [...]
> 
> >>
> >>> +                     unsigned long max_pfn, long mem_type,
> >>> +                     paddr_t (*new_page)(void), int level)
> >>> +{
> >>> +    paddr_t p_start;
> >>> +    unsigned long v_end, next;
> >>> +    lpae_t *pgd;
> >>> +
> >>> +    v_end = vaddr + max_pfn * PAGE_SIZE;
> >>> +    p_start = PFN_PHYS(start_pfn);
> >>> +
> >>> +    /* The boot_l1_pgtable can span 512G address space */
> >>> +    pgd = &boot_l1_pgtable[l1_pgt_idx(vaddr)];
> >>> +
> >>> +    do {
> >>> +        next = (vaddr + L1_SIZE);
> >>> +        if (next > v_end)
> >>> +            next = v_end;
> >>> +
> >>> +        if ((*pgd) == L1_INVAL) {
> >>> +            set_pgt_entry(pgd, (new_page()) | PT_PT);
> >>
> >> What if new_page() fails?
> > It will call the BUG() when it fails, please see alloc_new_page()..
> 
> I looked at the only caller in this patch and there are no BUG()...
> 
> So your interface is already wrong (see below). But a BUG() in 
> alloc_new_page()
> seem pretty odd. You want to let the caller decides what to do if the 
> allocation
> fails? In fact even the x86 do return an error on allocation failure. So why 
> not
> Arm?
okay, I will add the return check for it...
> 
> >>
> >>> +        }
> >>> +
> >>> +        build_pud(pgd, vaddr, next, p_start, mem_type, new_page, level);
> >>> +        p_start += next - vaddr;
> >>> +        vaddr = next;
> >>> +        pgd++;
> >>> +    } while (vaddr != v_end);
> >>> +}
> >>
> >> The code you implemented for page-table looks far too simple. For instance,
> >> I am surprised there are no TLB maintenance anywhere.
> > We can add the TLB maintenance in future in the places it needed.
> 
> When is that future? As I said previously, page-table are going to be used
> in multiple difference places. Implementing the page-table correctly from
> the beginning does not hurt and will save time later on for all of us later
> on.
 1.) we flush the TLB in the _start.
 2.) We setup the table after 1.)
 3.) we will not change the table in future.

 Why do we need the TLB maintainable?

 Thanks

> 
> >>
> >>> +
> >>> +static unsigned long first_free_pfn;
> >>
> >> Likely this variable and get_new_page needs more explanation in the code.
> > okay, I stole this code from the x86 in mini-os.
> 
> Sorry but I can't find this code in x86... The variable is here but seem to
> be used differently. Would you mind giving a pointer?
in arch/x86/mm.c, search first_free_pfn.
> 
> > 
> >>
> >>> +static paddr_t get_new_page(void)
> >>
> >> I think it would be better if you had early (or something similar) in the
> >> name. To make clear that is only used for early boot.
> > What's about get_new_page_early or get_new_page_boot?
> 
> I would prefer early_alloc_page()
okay.
> 
> > 
> >>
> >>> +{
> >>> +    paddr_t new_page;
> >>> +
> >>> +    memset(pfn_to_virt(first_free_pfn), 0, PAGE_SIZE);
> >>
> >> This is quite confusing, get_new_page() allocates page-table that
> >> will be used to map the RAM. So how do you bootstrap it?
> > sorry, I do not quite understand your meaning..
> > 
> > Do you worry about the virtual address of first_free_pfn?
> 
> Yes. Where does the mapping between first_free_pfn and its virtual address
> is done?
In the __setup_initial_pgtable.
> 
> > 
> >>
> >>> +    dsb(ishst);
> >>> +
> >>> +    new_page = PFN_PHYS(first_free_pfn);
> >>> +    first_free_pfn++;
> >>
> >> What if you exhaust in memory? AFAICT, the user will see a failure like a
> > It will never happen. Each page will used for 1G. If we have 4G memory,
> > we will only use 4 page.
> 
> The practice is often different from the theory... You make your assumption
> based on the current use and now how it could potentially be used.
> 
> There is strictly no harm to add an ASSERT() checking this assumption. This
> will save people a lot of debugging later on.
Okay, I will add ASSERT here...

Thanks
Huang Shijie

_______________________________________________
Minios-devel mailing list
Minios-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/cgi-bin/mailman/listinfo/minios-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.