
[Xen-devel] [PATCH v2] xen: arm: improve handling of systems with non-contiguous RAM regions



arm32 currently only makes use of memory which is contiguous with the first
bank. On the Midway platform this means that we only use 4GB of the 8GB
available.

Change things to make use of non-contiguous memory regions, with the
restriction that at least half of the total span of the RAM addresses must
contain RAM. The frametable is not currently sparse, so this restriction
avoids allocating enormous amounts of memory for the frametable to cover
holes in the address space and exhausting the actual RAM.
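
As an illustration of that rule, here is a small standalone sketch (not part
of the patch; the bank layout and the harness around it are made up for the
example) of the "at least half of the span is RAM" check which the code below
applies as each additional bank is considered:

    #include <stdint.h>
    #include <stdio.h>

    typedef uint64_t paddr_t;

    struct bank { paddr_t start, size; };

    int main(void)
    {
        /* Hypothetical Midway-like layout: 4GB at 0 and 4GB at 8GB. */
        struct bank banks[] = {
            { 0x000000000ULL, 0x100000000ULL },
            { 0x200000000ULL, 0x100000000ULL },
        };
        unsigned int nr_banks = sizeof(banks) / sizeof(banks[0]);
        paddr_t ram_start = banks[0].start;
        paddr_t ram_end   = banks[0].start + banks[0].size;
        paddr_t ram_size  = banks[0].size;
        unsigned int i;

        for ( i = 1; i < nr_banks; i++ )
        {
            paddr_t bank_end  = banks[i].start + banks[i].size;
            paddr_t new_size  = ram_size + banks[i].size;
            paddr_t new_start = banks[i].start < ram_start ? banks[i].start
                                                           : ram_start;
            paddr_t new_end   = bank_end > ram_end ? bank_end : ram_end;

            /*
             * Reject the bank if accepting it would leave less than
             * half of the [new_start, new_end) span backed by RAM.
             */
            if ( 2 * new_size < new_end - new_start )
                break;

            ram_size  = new_size;
            ram_start = new_start;
            ram_end   = new_end;
        }

        printf("using %u of %u banks: span %#llx-%#llx, %llu MB of RAM\n",
               i, nr_banks,
               (unsigned long long)ram_start, (unsigned long long)ram_end,
               (unsigned long long)(ram_size >> 20));
        return 0;
    }

For the layout above the second bank is accepted (8GB of RAM in a 12GB span),
whereas a second 4GB bank starting at, say, 16GB would be rejected.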

50% is arguably too restrictive. 4GB of RAM requires 32MB of frametable on
arm32 and 56MB on arm64, so we could probably cope with a lower ratio of
actual RAM. However, half is nice and conservative.
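
For reference, the arithmetic behind those figures (4K pages assumed; the 32
and 56 byte per-page structure sizes are inferred from the numbers above
rather than taken from the tree) works out as:

    #include <stdio.h>

    int main(void)
    {
        unsigned long long ram   = 4ULL << 30; /* 4GB of RAM */
        unsigned long long pages = ram >> 12;  /* 4K pages -> 1M pages */

        /* Assumed per-page frametable entry sizes, consistent with the
         * 32MB/56MB figures quoted above. */
        printf("arm32 frametable: %llu MB\n", (pages * 32) >> 20); /* 32 */
        printf("arm64 frametable: %llu MB\n", (pages * 56) >> 20); /* 56 */
        return 0;
    }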

arm64 currently uses all banks without regard for the size of the frametable,
which I have observed causing problems on models. Implement the same
restriction there as on arm32.

Long term we should look at moving to a pfn compression based scheme similar
to x86, which removes the holes from the frametable.

There were some bogus/outdated comments scattered around this code, which I
have removed.

Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
Tested-by: Julien Grall <julien.grall@xxxxxxxxxx>
Acked-by: Julien Grall <julien.grall@xxxxxxxxxx>
Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
---
v2: Rebase over "avoid truncation in mfn to paddr conversions"

Freeze:

The benefit of this series is that we can use the full 8GB of RAM on the
Midway systems, rather than being limited to just the first 4GB (less I/O
holes). I expect it to be common for server-class 32-bit ARM systems to have
a hole in memory (between a bank below 4GB and one above).

The risk is that we regress on some other supported platform, but AFAIK the
vexpress and sunxi platforms only have a single memory bank and the arndale
has contiguous memory regions.

The ARM64 portion of this is a bug fix.
---
 xen/arch/arm/setup.c |  185 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 122 insertions(+), 63 deletions(-)

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 0795eb9..e640d81 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -360,35 +360,80 @@ static paddr_t __init get_xen_paddr(void)
 #ifdef CONFIG_ARM_32
 static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
 {
-    paddr_t ram_start;
-    paddr_t ram_end;
-    paddr_t ram_size;
+    paddr_t ram_start, ram_end, ram_size;
+    paddr_t contig_start, contig_end;
     paddr_t s, e;
     unsigned long ram_pages;
     unsigned long heap_pages, xenheap_pages, domheap_pages;
     unsigned long dtb_pages;
     unsigned long boot_mfn_start, boot_mfn_end;
-    int i = 0;
+    int i;
     void *fdt;
 
-    /* TODO: Handle non-contiguous memory bank */
     if ( !early_info.mem.nr_banks )
         early_panic("No memory bank\n");
-    ram_start = early_info.mem.bank[0].start;
+
+    /*
+     * We are going to accumulate two regions here.
+     *
+     * The first is the bounds of the initial memory region which is
+     * contiguous with the first bank. For simplicity the xenheap is
+     * always allocated from this region.
+     *
+     * The second is the complete bounds of the regions containing RAM
+     * (ie. from the lowest RAM address to the highest), which
+     * includes any holes.
+     *
+     * We also track the number of actual RAM pages (i.e. not counting
+     * the holes).
+     */
     ram_size  = early_info.mem.bank[0].size;
-    ram_end = ram_start + ram_size;
+
+    contig_start = ram_start = early_info.mem.bank[0].start;
+    contig_end   = ram_end = ram_start + ram_size;
 
     for ( i = 1; i < early_info.mem.nr_banks; i++ )
     {
-        if ( ram_end != early_info.mem.bank[i].start )
+        paddr_t bank_start = early_info.mem.bank[i].start;
+        paddr_t bank_size = early_info.mem.bank[i].size;
+        paddr_t bank_end = bank_start + bank_size;
+
+        paddr_t new_ram_size = ram_size + bank_size;
+        paddr_t new_ram_start = min(ram_start,bank_start);
+        paddr_t new_ram_end = max(ram_end,bank_end);
+
+        /*
+         * If the new bank is contiguous with the initial contiguous
+         * region then incorporate it into the contiguous region.
+         *
+         * Otherwise we allow non-contiguous regions so long as at
+         * least half of the total RAM region actually contains
+         * RAM. We actually fudge this slightly and require that
+         * adding the current bank does not cause us to violate this
+         * restriction.
+         *
+         * This restriction ensures that the frametable (which is not
+         * currently sparse) does not consume all available RAM.
+         */
+        if ( bank_start == contig_end )
+            contig_end = bank_end;
+        else if ( bank_end == contig_start )
+            contig_start = bank_start;
+        else if ( 2 * new_ram_size < new_ram_end - new_ram_start )
+            /* Would create a memory map which is too sparse, so stop here. */
             break;
 
-        ram_size += early_info.mem.bank[i].size;
-        ram_end += early_info.mem.bank[i].size;
+        ram_size = new_ram_size;
+        ram_start = new_ram_start;
+        ram_end = new_ram_end;
     }
 
     if ( i != early_info.mem.nr_banks )
-        early_printk("WARNING: some memory banks are not used\n");
+    {
+        early_printk("WARNING: only using %d out of %d memory banks\n",
+                     i, early_info.mem.nr_banks);
+        early_info.mem.nr_banks = i;
+    }
 
     total_pages = ram_pages = ram_size >> PAGE_SHIFT;
 
@@ -403,13 +448,14 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
      * We try to allocate the largest xenheap possible within these
      * constraints.
      */
-    heap_pages = (ram_size >> PAGE_SHIFT);
+    heap_pages = ram_pages;
     xenheap_pages = (heap_pages/8 + 0x1fffUL) & ~0x1fffUL;
     xenheap_pages = max(xenheap_pages, 128UL<<(20-PAGE_SHIFT));
 
     do
     {
-        e = consider_modules(ram_start, ram_end,
+        /* xenheap is always in the initial contiguous region */
+        e = consider_modules(contig_start, contig_end,
                              pfn_to_paddr(xenheap_pages),
                              32<<20, 0);
         if ( e )
@@ -433,9 +479,6 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
     /*
      * Need a single mapped page for populating bootmem_region_list
      * and enough mapped pages for copying the DTB.
-     *
-     * TODO: The DTB (and other payloads) are assumed to be towards
-     * the start of RAM.
      */
     dtb_pages = (dtb_size + PAGE_SIZE-1) >> PAGE_SHIFT;
     boot_mfn_start = xenheap_mfn_end - dtb_pages - 1;
@@ -443,47 +486,51 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
 
     init_boot_pages(pfn_to_paddr(boot_mfn_start), pfn_to_paddr(boot_mfn_end));
 
-    /*
-     * Copy the DTB.
-     *
-     * TODO: handle other payloads too.
-     */
+    /* Copy the DTB. */
     fdt = mfn_to_virt(alloc_boot_pages(dtb_pages, 1));
     copy_from_paddr(fdt, dtb_paddr, dtb_size, BUFFERABLE);
     device_tree_flattened = fdt;
 
     /* Add non-xenheap memory */
-    s = ram_start;
-    while ( s < ram_end )
+    for ( i = 0; i < early_info.mem.nr_banks; i++ )
     {
-        paddr_t n = ram_end;
+        paddr_t bank_start = early_info.mem.bank[i].start;
+        paddr_t bank_end = bank_start + early_info.mem.bank[i].size;
 
-        e = next_module(s, &n);
-
-        if ( e == ~(paddr_t)0 )
+        s = bank_start;
+        while ( s < bank_end )
         {
-            e = n = ram_end;
-        }
+            paddr_t n = bank_end;
 
-        /* Module in RAM which we cannot see here, due to not handling
-         * non-contiguous memory regions yet
-         */
-        if ( e > ram_end )
-            e = ram_end;
+            e = next_module(s, &n);
 
-        /* Avoid the xenheap */
-        if ( s < pfn_to_paddr(xenheap_mfn_start+xenheap_pages)
-             && pfn_to_paddr(xenheap_mfn_start) < e )
-        {
-            e = pfn_to_paddr(xenheap_mfn_start);
-            n = pfn_to_paddr(xenheap_mfn_start+xenheap_pages);
-        }
+            if ( e == ~(paddr_t)0 )
+            {
+                e = n = ram_end;
+            }
+
+            /*
+             * Module extends into a RAM bank other than the one we
+             * are dealing with here; clamp it to this bank.
+             */
+            if ( e > bank_end )
+                e = bank_end;
 
-        dt_unreserved_regions(s, e, init_boot_pages, 0);
+            /* Avoid the xenheap */
+            if ( s < pfn_to_paddr(xenheap_mfn_start+xenheap_pages)
+                 && pfn_to_paddr(xenheap_mfn_start) < e )
+            {
+                e = pfn_to_paddr(xenheap_mfn_start);
+                n = pfn_to_paddr(xenheap_mfn_start+xenheap_pages);
+            }
+
+            dt_unreserved_regions(s, e, init_boot_pages, 0);
 
-        s = n;
+            s = n;
+        }
     }
 
+    /* Frametable covers all of the RAM region, including holes */
     setup_frametable_mappings(ram_start, ram_end);
     max_page = PFN_DOWN(ram_end);
 
@@ -499,8 +546,8 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
 {
     paddr_t ram_start = ~0;
     paddr_t ram_end = 0;
+    paddr_t ram_size = 0;
     int bank;
-    unsigned long  xenheap_pages = 0;
     unsigned long dtb_pages;
     void *fdt;
 
@@ -508,23 +555,33 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
     for ( bank = 0 ; bank < early_info.mem.nr_banks; bank++ )
     {
         paddr_t bank_start = early_info.mem.bank[bank].start;
-        paddr_t bank_size  = early_info.mem.bank[bank].size;
-        paddr_t  bank_end = bank_start + bank_size;
-        unsigned long bank_pages = bank_size >> PAGE_SHIFT;
+        paddr_t bank_size = early_info.mem.bank[bank].size;
+        paddr_t bank_end = bank_start + bank_size;
         paddr_t s, e;
 
-        total_pages += bank_pages;
-
-        if ( bank_start < ram_start )
-            ram_start = bank_start;
-        if ( bank_end > ram_end )
-            ram_end = bank_end;
+        paddr_t new_ram_size = ram_size + bank_size;
+        paddr_t new_ram_start = min(ram_start,bank_start);
+        paddr_t new_ram_end = max(ram_end,bank_end);
+
+        /*
+         * We allow non-contiguous regions so long as at least half of
+         * the total RAM region actually contains RAM. We actually
+         * fudge this slightly and require that adding the current
+         * bank does not cause us to violate this restriction.
+         *
+         * This restriction ensures that the frametable (which is not
+         * currently sparse) does not consume all available RAM.
+         */
+        if ( bank > 0 && 2 * new_ram_size < new_ram_end - new_ram_start )
+            /* Would create a memory map which is too sparse, so stop here. */
+            break;
 
-        xenheap_pages += (bank_size >> PAGE_SHIFT);
+        ram_start = new_ram_start;
+        ram_end = new_ram_end;
+        ram_size = new_ram_size;
 
         setup_xenheap_mappings(bank_start>>PAGE_SHIFT, bank_size>>PAGE_SHIFT);
 
-        /* XXX we assume that the ram regions are ordered */
         s = bank_start;
         while ( s < bank_end )
         {
@@ -547,6 +604,15 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
         }
     }
 
+    if ( bank != early_info.mem.nr_banks )
+    {
+        early_printk("WARNING: only using %d out of %d memory banks\n",
+                     bank, early_info.mem.nr_banks);
+        early_info.mem.nr_banks = bank;
+    }
+
+    total_pages += ram_size >> PAGE_SHIFT;
+
     xenheap_virt_end = XENHEAP_VIRT_START + ram_end - ram_start;
     xenheap_mfn_start = ram_start >> PAGE_SHIFT;
     xenheap_mfn_end = ram_end >> PAGE_SHIFT;
@@ -554,17 +620,10 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
 
     /*
      * Need enough mapped pages for copying the DTB.
-     *
-     * TODO: The DTB (and other payloads) are assumed to be towards
-     * the start of RAM.
      */
     dtb_pages = (dtb_size + PAGE_SIZE-1) >> PAGE_SHIFT;
 
-    /*
-     * Copy the DTB.
-     *
-     * TODO: handle other payloads too.
-     */
+    /* Copy the DTB. */
     fdt = mfn_to_virt(alloc_boot_pages(dtb_pages, 1));
     copy_from_paddr(fdt, dtb_paddr, dtb_size, BUFFERABLE);
     device_tree_flattened = fdt;
-- 
1.7.10.4

