[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/3] x86/boot: Explain discard_initial_images() and untangle PV initrd handling


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 29 Apr 2024 13:41:56 +0200
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Daniel Smith <dpsmith@xxxxxxxxxxxxxxxxxxxx>, Christopher Clark <christopher.w.clark@xxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 29 Apr 2024 11:42:03 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 26.04.2024 16:01, Andrew Cooper wrote:
> discard_initial_images() frees the memory backing the boot modules.  It is
> called by dom0_construct_pv{,h}(), but obtains the module list by global
> pointer which is why there is no apparent link with the initrd pointer.
> 
> Generally, module contents are copied into dom0 as it's being built, but the
> initrd for PV dom0 might be directly mapped instead of copied.
> 
> dom0_construct_pv() does it's own ad-hoc freeing of the module in the copy
> case, and points the global reference at the new copy, then sets the size to
> 0.  This only functions because init_domheap_pages(x, x) happens to be a nop.
> 
> Delete the ad-hoc freeing, and leave it to discard_initial_images().  This
> requires (not) adjusting initd->mod_start in the copy case, and only setting
> the size to 0 in the mapping case.
> 
> Alter discard_initial_images() to explicitly check for an ignored module, and
> explain what's going on.  This is more robust and will allow for fixing other
> problems with module handling.
> 
> The later logic in dom0_construct_pv() now cannot use initrd->mod_start, but
> that's fine because initrd_mfn is already a duplicate of the information
> wanted, and is more consistent with initrd_{pfn,len} used elsewhere.
> 
> Invalidate the initrd pointer with LIST_POISON1 to make it clearer that it
> shouldn't be used.
> 
> No practical change in behaviour, but a substantial reduction in the
> complexity of how this works.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> ---
> CC: Jan Beulich <JBeulich@xxxxxxxx>
> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> CC: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> CC: Daniel Smith <dpsmith@xxxxxxxxxxxxxxxxxxxx>
> CC: Christopher Clark <christopher.w.clark@xxxxxxxxx>
> 
> In other akward questions, why does initial_images_nrpages() account for all
> modules when only 1 or 2 are relevant for how we construct dom0 ?
> ---
>  xen/arch/x86/pv/dom0_build.c | 22 +++++++++++-----------
>  xen/arch/x86/setup.c         |  9 ++++++++-
>  2 files changed, 19 insertions(+), 12 deletions(-)
> 
> diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
> index d8043fa58a27..64d9984b8308 100644
> --- a/xen/arch/x86/pv/dom0_build.c
> +++ b/xen/arch/x86/pv/dom0_build.c
> @@ -630,18 +630,20 @@ int __init dom0_construct_pv(struct domain *d,
>                  }
>              memcpy(page_to_virt(page), mfn_to_virt(initrd->mod_start),
>                     initrd_len);
> -            mpt_alloc = (paddr_t)initrd->mod_start << PAGE_SHIFT;
> -            init_domheap_pages(mpt_alloc,
> -                               mpt_alloc + PAGE_ALIGN(initrd_len));
> -            initrd->mod_start = initrd_mfn = mfn_x(page_to_mfn(page));
> +            initrd_mfn = mfn_x(page_to_mfn(page));
>          }
>          else
>          {
>              while ( count-- )
>                  if ( assign_pages(mfn_to_page(_mfn(mfn++)), 1, d, 0) )
>                      BUG();
> +            /*
> +             * Mapped rather than copied.  Tell discard_initial_images() to
> +             * ignore it.
> +             */
> +            initrd->mod_end = 0;
>          }
> -        initrd->mod_end = 0;
> +        initrd = LIST_POISON1; /* No longer valid to use. */
>  
>          iommu_memory_setup(d, "initrd", mfn_to_page(_mfn(initrd_mfn)),
>                             PFN_UP(initrd_len), &flush_flags);
> @@ -653,12 +655,10 @@ int __init dom0_construct_pv(struct domain *d,
>      if ( domain_tot_pages(d) < nr_pages )
>          printk(" (%lu pages to be allocated)",
>                 nr_pages - domain_tot_pages(d));
> -    if ( initrd )
> -    {
> -        mpt_alloc = (paddr_t)initrd->mod_start << PAGE_SHIFT;
> +    if ( initrd_len )
>          printk("\n Init. ramdisk: %"PRIpaddr"->%"PRIpaddr,
> -               mpt_alloc, mpt_alloc + initrd_len);
> -    }
> +               pfn_to_paddr(initrd_mfn),
> +               pfn_to_paddr(initrd_mfn) + initrd_len);
>  
>      printk("\nVIRTUAL MEMORY ARRANGEMENT:\n");
>      printk(" Loaded kernel: %p->%p\n", _p(vkern_start), _p(vkern_end));

Between what this and the following hunk touch there is

        if ( count < initrd_pfn || count >= initrd_pfn + PFN_UP(initrd_len) )
            mfn = pfn++;
        else
            mfn = initrd_mfn++;

I can't help thinking that this invalidates ...

> @@ -881,7 +881,7 @@ int __init dom0_construct_pv(struct domain *d,
>          if ( pfn >= initrd_pfn )
>          {
>              if ( pfn < initrd_pfn + PFN_UP(initrd_len) )
> -                mfn = initrd->mod_start + (pfn - initrd_pfn);
> +                mfn = initrd_mfn + (pfn - initrd_pfn);
>              else
>                  mfn -= PFN_UP(initrd_len);
>          }

... the use of the variable here.

> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -294,7 +294,7 @@ unsigned long __init initial_images_nrpages(nodeid_t node)
>      return nr;
>  }
>  
> -void __init discard_initial_images(void)
> +void __init discard_initial_images(void) /* a.k.a. free multiboot modules */
>  {
>      unsigned int i;
>  
> @@ -302,6 +302,13 @@ void __init discard_initial_images(void)
>      {
>          uint64_t start = (uint64_t)initial_images[i].mod_start << PAGE_SHIFT;
>  
> +        /*
> +         * Sometimes the initrd is mapped, rather than copied, into dom0.
> +         * end=0 signifies that we should leave it alone.
> +         */
> +        if ( initial_images[i].mod_end == 0 )
> +            continue;
> +
>          init_domheap_pages(start,
>                             start + PAGE_ALIGN(initial_images[i].mod_end));
>      }

While I don't strictly mind the addition, it isn't really needed, as calling
init_domheap_pages() with twice the same address is simply a no-op (and
.mod_end being 0 had to work correctly already before anyway).

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.