
Re: [Xen-devel] [PATCH] x86/setup: do not relocate below the end of current Xen image placement



On Mon, Nov 27, 2017 at 04:58:52PM +0000, Andrew Cooper wrote:
> On 27/11/17 15:41, Daniel Kiper wrote:
> > If possible, we would like to place the Xen image higher than where the
> > bootloader put it, and we must certainly not overwrite the Xen code and
> > data during copy/relocation. Otherwise Xen may crash silently at boot.
> >
> > Signed-off-by: Daniel Kiper <daniel.kiper@xxxxxxxxxx>
> 
> Actually, there is a second related bug which I've only just got to the
> bottom of, and haven't had time to fix yet.
> 
> (XEN) Bootloader: GRUB 2.02
> (XEN) Command line: hvm_fep com1=115200,8n1 console=com1,vga
> dom0_mem=2048M,max:2048M watchdog ucode=scan dom0_max_vcpus=8
> crashkernel=192M,below=4G
> (XEN) Xen image load base address: 0xaba00000
> (XEN) Video information:
> (XEN)  VGA is text mode 80x25, font 8x16
> (XEN)  VBE/DDC methods: none; EDID transfer time: 0 seconds
> (XEN)  EDID info not retrieved because no DDC retrieval method detected
> (XEN) Disc information:
> (XEN)  Found 1 MBR signatures
> (XEN)  Found 1 EDD information structures
> (XEN) Xen-e820 RAM map:
> (XEN)  0000000000000000 - 000000000009bc00 (usable)
> (XEN)  000000000009bc00 - 00000000000a0000 (reserved)
> (XEN)  00000000000e0000 - 0000000000100000 (reserved)
> (XEN)  0000000000100000 - 00000000ac209000 (usable)
> (XEN)  00000000ac209000 - 00000000aeb99000 (reserved)
> (XEN)  00000000aeb99000 - 00000000aeb9d000 (ACPI NVS)
> (XEN)  00000000aeb9d000 - 00000000aecd1000 (reserved)
> (XEN)  00000000aecd1000 - 00000000aecd4000 (ACPI NVS)
> (XEN)  00000000aecd4000 - 00000000aecf5000 (reserved)
> (XEN)  00000000aecf5000 - 00000000aecf6000 (ACPI NVS)
> (XEN)  00000000aecf6000 - 00000000aed24000 (reserved)
> (XEN)  00000000aed24000 - 00000000aef2f000 (ACPI NVS)
> (XEN)  00000000aef2f000 - 00000000aefed000 (ACPI data)
> (XEN)  00000000aefed000 - 00000000af000000 (usable)
> (XEN)  00000000af000000 - 00000000b0000000 (reserved)
> (XEN)  00000000f8000000 - 00000000fc000000 (reserved)
> (XEN)  00000000fec00000 - 00000000fec01000 (reserved)
> (XEN)  00000000fed19000 - 00000000fed1a000 (reserved)
> (XEN)  00000000fed1c000 - 00000000fed20000 (reserved)
> (XEN)  00000000fee00000 - 00000000fee01000 (reserved)
> (XEN)  00000000ff400000 - 0000000100000000 (reserved)
> (XEN)  0000000100000000 - 0000000850000000 (usable)
> (XEN) Kdump: DISABLED (failed to reserve 192MB (196608kB) at 0xa0200000)
> (XEN) ACPI: RSDP 000F0410, 0024 (r2 INTEL )
> 
> When booting with Grub2 capable of locating Xen at its preferred
> location, the calculation for the kexec region fails, because the
> current location of Xen isn't taken into account.
> 
> The end of the chosen kexec region (0xa0200000 + 0x0c000000) overlaps
> with the bottom of the Xen image.
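
The arithmetic behind that overlap, using only the numbers in the log above: 0xa0200000 + 0x0c000000 = 0xac200000, which is above the reported load base 0xaba00000. A minimal sketch in C follows. The macro and helper names are made up for illustration and are not from the Xen tree, and since the log does not give the end of the image, only the image base is tested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the boot log quoted above. */
#define KDUMP_START   UINT64_C(0xa0200000)  /* from the "Kdump: DISABLED" line */
#define KDUMP_SIZE    UINT64_C(0x0c000000)  /* crashkernel=192M */
#define XEN_LOAD_BASE UINT64_C(0xaba00000)  /* "Xen image load base address" */

/* Hypothetical helper: end (exclusive) of the requested crash area. */
static uint64_t kdump_end(void)
{
    return KDUMP_START + KDUMP_SIZE;
}

/*
 * Hypothetical helper: true if addr lies in the half-open range
 * [start, start + size).  Written with a subtraction so that
 * start + size cannot overflow.
 */
static bool range_contains(uint64_t start, uint64_t size, uint64_t addr)
{
    assert(size != 0);
    return addr >= start && addr - start < size;
}
```

Calling `range_contains(KDUMP_START, KDUMP_SIZE, XEN_LOAD_BASE)` returns true for these values, i.e. the image base sits inside the requested crash area.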

<blushes> I never got around to upstreaming this, nor to actually doing
the RFC thing I mentioned.


commit 0350412917e7465fe5aaa3ba7616cf9bc6daa533
Author: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date:   Wed Dec 16 16:05:31 2015 -0500

    kexec/relocate: Change kexec location if relocation is in the way.
    
    The issue at hand is that GRUB2 puts us at the end of the
    E820_RAM that is under 4GB. It is aligned as well.
    For example:
    
    (XEN) Xen image base address: 0xeec00000
    ..
    (XEN) Xen-e820 RAM map:
    (XEN)  0000000000000000 - 000000000009fc00 (usable)
    (XEN)  000000000009fc00 - 00000000000a0000 (reserved)
    (XEN)  00000000000f0000 - 0000000000100000 (reserved)
    (XEN)  0000000000100000 - 00000000effff000 (usable)
    (XEN)  00000000effff000 - 00000000f0000000 (reserved)
    
    If the user decides to put the kexec crashkernel in the same
    area (so at the end of the E820_RAM) the relocation routines
    go haywire. For example, with "crashkernel=512M@3327M"
    we would be usurping the end of the E820_RAM.
    
    This code doesn't actually fix the underlying issue
    with the relocation routines (see below for explanation).
    Instead it just recomputes the location of where the kexec
    image should reside. With this patch we will have:
    
    (XEN) Kdump: 3327MB-3839MB overlaps with 0xeee00000(3822MB). Adjusting.
    (XEN) Kdump: 512MB (524288kB) at 0xcee00000->0xeee00000
    (XEN) New Xen image base address: 0xef800000
    
    The code assumes that the "new" relocation physical address is always
    going to be _after_ where GRUB has put the initial code.
    
    In other words - we always move it upwards in memory. But in this case
    there is no space (because kexec has grabbed it all) so we must move it
    downward (below where GRUB put us).
    
    That means subtracting the delta. But since the value is
    an unsigned int that negative bit becomes 0xfffffff instead of
    -<some value>.
    Then the addition in the pagetables becomes quite large.
    
    However an RFC patch that fixes this didn't work right.
    
    OraBug: 22371625 - Xen hypervisor reallocation fails to allocate L3 pagetables
    Acked-by: Adnan Misherfi <adnan.misherfi@xxxxxxxxxx>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
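
To make the unsigned wrap-around described in the commit message concrete, here is a small sketch in C. It is not the Xen relocation code: the function names are hypothetical and the addresses used in the note below are made-up values chosen only so that the new base is lower than the old one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: relocation delta computed in 32-bit unsigned
 * arithmetic.  When new_base < old_base the result wraps modulo 2^32
 * instead of going negative.
 */
static uint32_t reloc_delta_u32(uint32_t new_base, uint32_t old_base)
{
    return new_base - old_base;
}

/*
 * Same delta in a signed 64-bit type, which keeps the sign.
 * Assumes both addresses are below 2^63.
 */
static int64_t reloc_delta_s64(uint64_t new_base, uint64_t old_base)
{
    assert(new_base <= INT64_MAX && old_base <= INT64_MAX);
    return (int64_t)new_base - (int64_t)old_base;
}

/* Adding a wrapped 32-bit delta to a 64-bit address yields a large value. */
static uint64_t apply_delta(uint64_t addr, uint32_t delta)
{
    return addr + delta;
}
```

With an old base of 0xeec00000 and an assumed new base of 0xee000000, the unsigned delta is 0xff400000 rather than -0xc00000, and adding it to the old base gives 0x1ee000000 instead of 0xee000000.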

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 21cbbce603..a5b4d6e427 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -473,12 +473,16 @@ static void __init parse_video_info(void)
         vga_console_info.u.vesa_lfb.mode_attrs = bvi->vesa_attrib;
     }
 }
-
+#define RELOCATE_BANDAID 1
 static void __init kexec_reserve_area(struct e820map *e820)
 {
     unsigned long kdump_start = kexec_crash_area.start;
     unsigned long kdump_size  = kexec_crash_area.size;
     static bool_t __initdata is_reserved = 0;
+#if RELOCATE_BANDAID
+    unsigned int i = 0;
+#endif
+    int rc = 0;
 
     kdump_size = (kdump_size + PAGE_SIZE - 1) & PAGE_MASK;
 
@@ -486,8 +490,37 @@ static void __init kexec_reserve_area(struct e820map *e820)
         return;
 
     is_reserved = 1;
+#if RELOCATE_BANDAID
+    while ( xen_img_base_phys_addr && i < e820->nr_map )
+    {
+        unsigned  long s = e820->map[i].addr;
+        unsigned  long e = (e820->map[i].addr + e820->map[i].size);
+
+        if ( e820->map[i++].type != E820_RAM )
+            continue;
+
+        if ( s <= xen_img_base_phys_addr && xen_img_base_phys_addr <= e )
+        {
+            unsigned long delta = xen_img_base_phys_addr - kdump_size;
+
+            if ( delta > s )
+            {
+                printk("Kdump: %luMB-%luMB overlaps with 0x%x(%uMB). Adjusting.\n",
+                       kdump_start >> 20 , (kdump_start + kdump_size) >> 20,
+                       xen_img_base_phys_addr, xen_img_base_phys_addr >> 20);
+                kexec_crash_area.start = delta;
+                kdump_start = delta;
+                break;
+            }
+            else
+                rc = -EINVAL;
+        }
+    }
+#endif
+    if ( rc == 0 )
+        rc = !reserve_e820_ram(e820, kdump_start, kdump_start + kdump_size);
 
-    if ( !reserve_e820_ram(e820, kdump_start, kdump_start + kdump_size) )
+    if ( rc )
     {
         printk("Kdump: DISABLED (failed to reserve %luMB (%lukB) at %#lx)"
                "\n", kdump_size >> 20, kdump_size >> 10, kdump_start);
@@ -495,8 +528,9 @@ static void __init kexec_reserve_area(struct e820map *e820)
     }
     else
     {
-        printk("Kdump: %luMB (%lukB) at %#lx\n",
-               kdump_size >> 20, kdump_size >> 10, kdump_start);
+        printk("Kdump: %luMB (%lukB) at %#lx->%#lx\n",
+               kdump_size >> 20, kdump_size >> 10, kdump_start,
+              kdump_start + kdump_size);
     }
 }
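
For readers who want to try the placement rule outside the hypervisor, the following is a rough standalone sketch of what the loop in the patch does: find the RAM region containing the Xen image base and, if there is room, put the crash area directly below it. The `struct region` type, `RAM_TYPE` and `place_kdump_below()` are simplified stand-ins invented for this sketch, not Xen's `struct e820map` API. Unlike the patch, the sketch compares sizes before subtracting, so a crash area larger than the image base cannot underflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RAM_TYPE 1u  /* stand-in for E820_RAM */

/* Simplified, hypothetical stand-in for one e820 map entry. */
struct region {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/*
 * If xen_base lies inside a RAM region that has more than kdump_size
 * bytes between the region start and xen_base, store in *out the start
 * of a crash area ending exactly at xen_base and return true.
 */
static bool place_kdump_below(const struct region *map, size_t nr,
                              uint64_t xen_base, uint64_t kdump_size,
                              uint64_t *out)
{
    assert(map != NULL && out != NULL);

    for ( size_t i = 0; i < nr; i++ )
    {
        uint64_t s = map[i].addr;
        uint64_t e = s + map[i].size;

        if ( map[i].type != RAM_TYPE || xen_base < s || xen_base > e )
            continue;

        /* Same condition as "xen_base - kdump_size > s", without underflow. */
        if ( xen_base - s <= kdump_size )
            return false;

        *out = xen_base - kdump_size;
        return true;
    }

    return false;
}
```

Fed the e820 map from the commit message, an image base of 0xeee00000 and a 512MB crash area, this yields 0xcee00000, matching the "Kdump: 512MB (524288kB) at 0xcee00000->0xeee00000" line shown there.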
 
> 
> ~Andrew
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxxx
> https://lists.xenproject.org/mailman/listinfo/xen-devel
