[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v4 07/11] xen/riscv: add Linux kernel loading support
- To: Jan Beulich <jbeulich@xxxxxxxx>
- From: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
- Date: Wed, 6 May 2026 13:57:17 +0200
- Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=20251104 header.d=gmail.com header.i="@gmail.com" header.h="Content-Transfer-Encoding:In-Reply-To:From:Content-Language:References:Cc:To:Subject:User-Agent:MIME-Version:Date:Message-ID"
- Cc: Romain Caritey <Romain.Caritey@xxxxxxxxxxxxx>, Alistair Francis <alistair.francis@xxxxxxx>, Connor Davis <connojdavis@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
- Delivery-date: Wed, 06 May 2026 11:57:22 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
On 5/4/26 4:05 PM, Jan Beulich wrote:
On 28.04.2026 16:33, Oleksii Kurochko wrote:
--- /dev/null
+++ b/xen/arch/riscv/kernel.c
@@ -0,0 +1,242 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/bug.h>
+#include <xen/compiler.h>
+#include <xen/errno.h>
+#include <xen/fdt-kernel.h>
+#include <xen/guest_access.h>
+#include <xen/init.h>
+#include <xen/libfdt/libfdt.h>
+#include <xen/mm.h>
+#include <xen/types.h>
+#include <xen/vmap.h>
+
+#include <asm/setup.h>
+
+#define IMAGE64_MAGIC_V2 0x05435352 /* Magic number 2, le, "RSC\x05" */
+
+static void __init place_modules(struct kernel_info *info, paddr_t kernbase,
+ paddr_t kernend)
+{
+ const struct boot_module *mod = info->bd.initrd;
+ const struct membanks *banks = kernel_info_get_mem_const(info);
+ const paddr_t initrd_len = ROUNDUP(mod ? mod->size : 0,
+ KERNEL_LOAD_ADDR_ALIGNMENT);
+ const paddr_t dtb_len = ROUNDUP(fdt_totalsize(info->fdt),
+ KERNEL_LOAD_ADDR_ALIGNMENT);
Why would modules need to be this strongly aligned?
No specific reason except to be aligned with similar alignment below, it
could be lesser (PAGE_SIZE or even just unsigned long aligned) or even
dropped, I think. It was just easier then to calculate aligned
addresses. But I don't see any big issue to have such alignments except
maybe that it will waste some memory.
+ const paddr_t modsize = initrd_len + dtb_len;
+ unsigned int bi = banks->nr_banks;
+
+ BUG_ON(modsize < initrd_len);
Where's the earlier check that allows this to be BUG_ON()?
There isn't one. I will replace that with:
if ( modsize < initrd_len )
panic("Module size overflow: initrd + dtb size wraps paddr_t\n");
+ /*
+ * Place modules as high in RAM as possible, scanning banks from
+ * last to first so that the end of the last bank is preferred.
+ */
+ while ( bi-- > 0 )
+ {
+ const struct membank *bank = &banks->bank[bi];
+ const paddr_t bank_end = bank->start + bank->size;
+ paddr_t modbase;
+
+ if ( modsize > bank->size )
+ continue;
+
+ modbase = ROUNDDOWN(bank_end - modsize, KERNEL_LOAD_ADDR_ALIGNMENT);
Same question here.
I used KERNEL_LOAD_ADDR_ALIGNMENT to be sure that big page tables be
potentially used in page table.
+ if ( modbase < bank->start )
+ continue;
+
+ /*
+ * If modules would overlap the kernel, try placing them below it.
+ */
With how kernel_image_place() works, and with the heavy alignment applied
above, is this even possible to succeed? Oh, wait, yes - for the not-
position-independent case.
I don't understand what is wrong with putting initrd and dtb to
basically any place except where kernel is expected to be placed (as in
case of position dependent case I assume it is pretty crucial).
+ if ( (modbase < ROUNDUP(kernend, KERNEL_LOAD_ADDR_ALIGNMENT)) &&
+ (modbase + modsize > kernbase) )
+ {
+ modbase = ROUNDDOWN(kernbase - modsize,
KERNEL_LOAD_ADDR_ALIGNMENT);
What prevents this subtraction from underflowing?
I will put the following check at the start of the place_modules() function:
if ( kernbase < modsize )
panic("Underflow could happen between kernbase and modsize\n");
+static paddr_t __init kernel_image_place(struct kernel_info *info)
+{
+ paddr_t load_addr = INVALID_PADDR;
+ uint64_t image_size = info->image.image_size ?: info->image.len;
+ const struct membanks *banks = kernel_info_get_mem_const(info);
+ unsigned int nr_banks = banks->nr_banks;
+ unsigned int bi;
+
+ dprintk(XENLOG_DEBUG, "nr_banks(%u)\n", nr_banks);
Did you mean to drop this before submitting?
Sure, it should be dropped.
+ /*
+ * At the moment, RISC-V's Linux kernel should be always position
+ * independent based on "Per-MMU execution" of boot.rst:
+ * https://docs.kernel.org/arch/riscv/boot.html#pre-mmu-execution
+ *
+ * But just for the case when RISC-V's Linux kernel isn't position
+ * independent it is needed to take load address from
+ * info->image.start.
+ *
+ * If `start` is zero, the Image is position independent.
+ */
+ if ( likely(!info->image.start) )
+ {
+ for ( bi = 0; bi != nr_banks; bi++ )
+ {
+ const struct membank *bank = &banks->bank[bi];
+ paddr_t bank_start = bank->start;
+ /*
+ * According to boot.rst kernel load address should be properly
+ * aligned:
+ * https://docs.kernel.org/arch/riscv/boot.html#kernel-location
+ *
+ * As Image in this case is PIC we can ignore
+ * info->image.text_offset.
+ */
+ paddr_t aligned_start = ROUNDUP(bank_start,
KERNEL_LOAD_ADDR_ALIGNMENT);
+ paddr_t bank_end = bank_start + bank->size;
+ paddr_t bank_size;
+
+ if ( aligned_start > bank_end )
+ continue;
+
+ bank_size = bank_end - aligned_start;
+
+ dprintk(XENLOG_DEBUG, "bank[%u].start=%"PRIpaddr"\n", bi,
bank->start);
And this one? (I also find it puzzling that ->start would be of (primary)
interest
here, when ...
+ if ( image_size <= bank_size )
... bank_size is what's relevant.
This one should be dropped too.
+/* Check if the image is a 64-bit Image */
+static int __init kernel_image64_probe(struct kernel_info *info,
+ paddr_t addr, paddr_t size)
+{
+ /* https://www.kernel.org/doc/Documentation/riscv/boot-image-header.rst */
+ struct {
+ uint32_t code0; /* Executable code */
+ uint32_t code1; /* Executable code */
+ uint64_t text_offset; /* Image load offset, little endian */
+ uint64_t image_size; /* Effective Image size, little endian */
+ uint64_t flags; /* kernel flags, little endian */
+ uint32_t version; /* Version of this header */
+ uint32_t res1; /* Reserved */
+ uint64_t res2; /* Reserved */
+ uint64_t magic; /* Deprecated: Magic number, little endian,
"RISCV" */
+ uint32_t magic2; /* Magic number 2, little endian, "RSC\x05" */
+ uint32_t res3; /* Reserved for PE COFF offset */
+ } image;
+ uint64_t effective_size;
+
+ if ( size < sizeof(image) )
+ return -EINVAL;
+
+ copy_from_paddr(&image, addr, sizeof(image));
+
+ /* Magic v1 is deprecated and may be removed. Only use v2 */
+ if ( le32_to_cpu(image.magic2) != IMAGE64_MAGIC_V2 )
+ return -EINVAL;
+
+ effective_size = le64_to_cpu(image.image_size);
+
+ if ( effective_size && size > effective_size )
+ return -EINVAL;
Is the rhs of the && the wrong way round? If effective_size > size,
aren't you in trouble? Question of course is what "effective" really
means. Yet in any event it seems dubious to me that effective_size <
size would really be a problem. IOW this will want commenting upon
if the check is to stay.
Actually ...
+ info->image.kernel_addr = addr;
+ /* Actual size in the binary file */
+ info->image.len = size;
+ /* Total memory the kernel occupies at runtime */
+ info->image.image_size = effective_size;
... this looks to suggest something .bss-like.
Yes, effective_size it is size which included .bss.
size it of LK after decompression of Image.gz and it doesn't include
.bss so it should be lesser then effective_size.
I don't think that I am in trouble that effective_size is bigger then
size if we allocate enough space in memory effective_size is fine to be
bigger.
It is a good question if effective_size < size is a problem. I think it
isn't but could it be really happen?
I think that I am okay to drop that part of if().
--- a/xen/include/xen/fdt-kernel.h
+++ b/xen/include/xen/fdt-kernel.h
@@ -59,8 +59,15 @@ struct kernel_info {
struct {
paddr_t kernel_addr;
paddr_t len;
-#if defined(CONFIG_ARM_64) || defined(CONFIG_RISCV_64)
- paddr_t text_offset; /* 64-bit Image only */
+#if defined(CONFIG_ARM_64) || defined(CONFIG_RISCV)
+ /*
+ * ARM: 64-bit Image only.
+ * RISC-V: both 32-bit and 64-bit Images.
+ */
+ paddr_t text_offset;
+#endif
+#if defined(CONFIG_RISCV)
+ uint64_t image_size; /* Effective size of Image */
As this (apparently) is for both RV64 and RV32 - can the latter really have
wider than 32-bit image sizes? If not - use size_t or unsigned long here?
Agree, unsigned long should be enough.
Thanks.
~ Oleksii
|