
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation




On 6/2/25 12:21 PM, Jan Beulich wrote:
On 26.05.2025 20:44, Oleksii Kurochko wrote:
On 5/22/25 4:46 PM, Jan Beulich wrote:
On 21.05.2025 18:03, Oleksii Kurochko wrote:
+    /* Allocate MMIO resource array */
+    imsic_cfg.mmios = xzalloc_array(struct imsic_mmios, nr_mmios);
How large can this and ...

+    if ( !imsic_cfg.mmios )
+    {
+        rc = -ENOMEM;
+        goto imsic_init_err;
+    }
+
+    imsic_cfg.msi = xzalloc_array(struct imsic_msi, nr_parent_irqs);
... this array grow (in principle)?
Roughly speaking, this is the number of processors. The highest number of processors
I have seen on the market is 32, but it was over a year ago when I last checked.
Unless there's an architectural limit, I don't think it's a good idea to
take as reference what's available at present. But yes, ...
This (32) is not an architectural limit.
Since the mhartid register accepts a range from 0 to 2^64-1 for RV64, I assume the
*theoretical* limit for the number of CPUs is 2^64-1. I can't find anything in the
RISC-V spec saying whether that is the actual limit.
But if you look into the AIA (interrupt controller) specification, it explicitly
states that the limit is 16,384:
  1.2 Limits
  In its current version, the RISC-V Advanced Interrupt Architecture can support RISC-V symmetric
  multiprocessing (SMP) systems with up to 16,384 harts. If the harts are 64-bit (RV64) and implement
  the hypervisor extension, and if all features of the Advanced Interrupt Architecture are fully
  implemented as well, then for each physical hart there may be up to 63 active virtual harts and
  potentially thousands of additional idle (swapped-out) virtual harts, where each virtual hart has
  direct control of one or more physical devices.
Also, 16,384 is used as the maximum for nr_parent_irqs from the DTS point of view.
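
For illustration, bounding the parsed value against that architectural limit could
look like the sketch below (IMSIC_MAX_HARTS and the check itself are assumptions
for illustration, not part of the posted patch):

    /* AIA spec, section 1.2: at most 16,384 harts in an SMP system. */
    #define IMSIC_MAX_HARTS 16384U

    if ( nr_parent_irqs > IMSIC_MAX_HARTS )
    {
        printk(XENLOG_ERR "%s: nr_parent_irqs %u exceeds the AIA hart limit\n",
               node->name, nr_parent_irqs);
        return -EINVAL;
    }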


I think you're aware that in principle
new code is expected to use xvmalloc() and friends unless there are specific
reasons speaking against that.
Oh, missed 'v'...
... adding the missing 'v' will take care of my concern. Provided of
course this isn't running too early for vmalloc() to be usable just yet.
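
For reference, switching allocator families would be a one-line change per array,
e.g. (a sketch, assuming the xvzalloc_array() flavour is usable this early during
boot):

    imsic_cfg.mmios = xvzalloc_array(struct imsic_mmios, nr_mmios);
    ...
    imsic_cfg.msi = xvzalloc_array(struct imsic_msi, nr_parent_irqs);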

+    if ( !imsic_cfg.msi )
+    {
+        rc = -ENOMEM;
+        goto imsic_init_err;
+    }
+
+    /* Check MMIO register sets */
+    for ( unsigned int i = 0; i < nr_mmios; i++ )
+    {
+        if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
+        {
+            rc = -ENOMEM;
+            goto imsic_init_err;
+        }
+
+        rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
+                                   &imsic_cfg.mmios[i].size);
+        if ( rc )
+        {
+            printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
+                   node->name, i);
+            goto imsic_init_err;
+        }
+
+        base_addr = imsic_cfg.mmios[i].base_addr;
+        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
+                           imsic_cfg.hart_index_bits +
+                           IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
+        base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
+                       imsic_cfg.group_index_shift);
+        if ( base_addr != imsic_cfg.base_addr )
+        {
+            rc = -EINVAL;
+            printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
+                   node->name, i);
+            goto imsic_init_err;
+        }
Maybe just for my own understanding: There's no similar check for the
sizes to match / be consistent wanted / needed?
If you are speaking about imsic_cfg.mmios[i].size, then it fully depends on what
the h/w provides, IMO.
So I don't know what the possible range for imsic_cfg.mmios[i].size is.
Well, all I can say is that it feels odd that you sanity check base_addr
but permit effectively any size.
Okay, I think I have two ideas for how to check the size:
1. Based on the guest bits from the IMSIC DT node. QEMU calculates the size as:
    for (socket = 0; socket < socket_count; socket++) {
        imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
        imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
                     s->soc[socket].num_harts;
    ...
   where:
     #define IMSIC_MMIO_PAGE_SHIFT          12
     #define IMSIC_MMIO_PAGE_SZ             (1UL << IMSIC_MMIO_PAGE_SHIFT)
     
     #define IMSIC_HART_NUM_GUESTS(__guest_bits)           \
             (1U << (__guest_bits))
     #define IMSIC_HART_SIZE(__guest_bits)                 \
             (IMSIC_HART_NUM_GUESTS(__guest_bits) * IMSIC_MMIO_PAGE_SZ)

2. Just take the theoretical maximum for an S-mode IMSIC node:
    16,384 * 64 (1 S-mode interrupt file + 63 guest interrupt files) * 4 KiB
   Where:
     16,384 - the maximum possible number of harts according to the AIA spec
     64 - the maximum number of interrupt files per hart for an S-mode IMSIC node:
          1 S-mode interrupt file + 63 guest interrupt files.
     4 KiB - the maximum size of one interrupt file.

Which option is preferred?
The specification doesn't seem to mention (or I couldn't find) that all platforms
must calculate the MMIO size in the same way QEMU does. Therefore, it's probably
better to use the approach described in option 2.
On the other hand, option 2 would accept a platform that provides slightly more
space than needed but still less than the theoretical maximum, and I don't think
such a platform should necessarily be considered correct.
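
For concreteness, a minimal sketch of such a size check, placed next to the
existing base_addr check (mmio_harts, the number of harts served by regset i, is
a hypothetical variable here, and IMSIC_HART_SIZE() mirrors the QEMU macro quoted
above):

    /* One 4 KiB interrupt-file page per guest index, per hart. */
    #define IMSIC_HART_SIZE(guest_bits) \
        BIT((guest_bits) + IMSIC_MMIO_PAGE_SHIFT, UL)

    if ( imsic_cfg.mmios[i].size <
         IMSIC_HART_SIZE(imsic_cfg.guest_index_bits) * mmio_harts )
    {
        rc = -EINVAL;
        printk(XENLOG_ERR "%s: MMIO regset %u is too small\n", node->name, i);
        goto imsic_init_err;
    }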


@@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
      return pcpu_info[cpuid].hart_id;
  }
  
+static inline unsigned long hartid_to_cpuid(unsigned long hartid)
+{
+    for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
+    {
+        if ( hartid == cpuid_to_hartid(cpuid) )
+            return cpuid;
+    }
+
+    /* hartid isn't valid for some reason */
+    return NR_CPUS;
+}
Considering the values being returned, why's the function's return type
"unsigned long"?
The mhartid register is MXLEN bits wide, so theoretically hart IDs range from 0 to
2^MXLEN-1, and we could have the same number of Xen CPUIDs. MXLEN is 32 for RV32
and 64 for RV64.
Yet the return value here is always constrained by NR_CPUS, isn't it?
I am not 100% sure that I get your point, but I put NR_CPUS here because of:
  /*
   * tp points to one of these per cpu.
   *
   * hart_id would be valid (no matter which value) if its
   * processor_id field is valid (less than NR_CPUS).
   */
  struct pcpu_info pcpu_info[NR_CPUS] = { [0 ... NR_CPUS - 1] = {
      .processor_id = NR_CPUS,
  }};
As an option, we could use ULONG_MAX instead. Would that be better?
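
For what it's worth, since the result is always bounded by NR_CPUS, a variant with
the narrower return type Jan hints at could look like this sketch:

    static inline unsigned int hartid_to_cpuid(unsigned long hartid)
    {
        for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
            if ( hartid == cpuid_to_hartid(cpuid) )
                return cpuid;

        /* hartid isn't valid for some reason. */
        return NR_CPUS;
    }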

~ Oleksii