
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation




On 6/2/25 12:21 PM, Jan Beulich wrote:
On 26.05.2025 20:44, Oleksii Kurochko wrote:
On 5/22/25 4:46 PM, Jan Beulich wrote:
On 21.05.2025 18:03, Oleksii Kurochko wrote:
+    /* Allocate MMIO resource array */
+    imsic_cfg.mmios = xzalloc_array(struct imsic_mmios, nr_mmios);
How large can this and ...

+    if ( !imsic_cfg.mmios )
+    {
+        rc = -ENOMEM;
+        goto imsic_init_err;
+    }
+
+    imsic_cfg.msi = xzalloc_array(struct imsic_msi, nr_parent_irqs);
... this array grow (in principle)?
Roughly speaking, this is the number of processors. The highest number of processors
I have seen on the market is 32, but it was over a year ago when I last checked.
Unless there's an architectural limit, I don't think it's a good idea to
take as reference what's available at present. But yes, ...
This (32) is not an architectural limit.
Since the mhartid register accepts a range from 0 to 2^64-1 for RV64, I assume the
*theoretical* limit for the number of CPUs is 2^64-1. I can't find anything in the
RISC-V spec saying whether that is the actual limit.
But if you look into the AIA (interrupt controller) specification, it explicitly
states that the limit is 16,384:
  1.2 Limits
  In its current version, the RISC-V Advanced Interrupt Architecture can support RISC-V symmetric
  multiprocessing (SMP) systems with up to 16,384 harts. If the harts are 64-bit (RV64) and implement
  the hypervisor extension, and if all features of the Advanced Interrupt Architecture are fully
  implemented as well, then for each physical hart there may be up to 63 active virtual harts and
  potentially thousands of additional idle (swapped-out) virtual harts, where each virtual hart has
  direct control of one or more physical devices.
Also, 16,384 is used as the maximum for nr_parent_irqs from the DTS point of view.
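
For illustration, bounding the parsed value against that architectural limit could
look like the sketch below (IMSIC_MAX_HARTS and the check itself are assumptions
for illustration, not part of the posted patch):

    /* AIA spec, section 1.2: at most 16,384 harts in an SMP system. */
    #define IMSIC_MAX_HARTS 16384U

    if ( nr_parent_irqs > IMSIC_MAX_HARTS )
    {
        printk(XENLOG_ERR "%s: nr_parent_irqs %u exceeds the AIA hart limit\n",
               node->name, nr_parent_irqs);
        return -EINVAL;
    }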


I think you're aware that in principle
new code is expected to use xvmalloc() and friends unless there are specific
reasons speaking against that.
Oh, missed 'v'...
... adding the missing 'v' will take care of my concern. Provided of
course this isn't running too early for vmalloc() to be usable just yet.
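
For reference, switching allocator families would be a one-line change per array,
e.g. (a sketch, assuming the xvzalloc_array() flavour is usable this early during
boot):

    imsic_cfg.mmios = xvzalloc_array(struct imsic_mmios, nr_mmios);
    ...
    imsic_cfg.msi = xvzalloc_array(struct imsic_msi, nr_parent_irqs);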

+    if ( !imsic_cfg.msi )
+    {
+        rc = -ENOMEM;
+        goto imsic_init_err;
+    }
+
+    /* Check MMIO register sets */
+    for ( unsigned int i = 0; i < nr_mmios; i++ )
+    {
+        if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
+        {
+            rc = -ENOMEM;
+            goto imsic_init_err;
+        }
+
+        rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
+                                   &imsic_cfg.mmios[i].size);
+        if ( rc )
+        {
+            printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
+                   node->name, i);
+            goto imsic_init_err;
+        }
+
+        base_addr = imsic_cfg.mmios[i].base_addr;
+        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
+                           imsic_cfg.hart_index_bits +
+                           IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
+        base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
+                       imsic_cfg.group_index_shift);
+        if ( base_addr != imsic_cfg.base_addr )
+        {
+            rc = -EINVAL;
+            printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
+                   node->name, i);
+            goto imsic_init_err;
+        }
Maybe just for my own understanding: There's no similar check for the
sizes to match / be consistent wanted / needed?
If you are speaking about imsic_cfg.mmios[i].size, then it fully depends on what
the h/w provides, IMO.
So I don't know what the possible range for imsic_cfg.mmios[i].size is.
Well, all I can say is that it feels odd that you sanity check base_addr
but permit effectively any size.
Okay, I think I have two ideas for how to check the size:
1. Based on the guest bits from the IMSIC DT node. QEMU calculates the size as:
    for (socket = 0; socket < socket_count; socket++) {
        imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
        imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
                     s->soc[socket].num_harts;
    ...
   where:
     #define IMSIC_MMIO_PAGE_SHIFT          12
     #define IMSIC_MMIO_PAGE_SZ             (1UL << IMSIC_MMIO_PAGE_SHIFT)
     
     #define IMSIC_HART_NUM_GUESTS(__guest_bits)           \
             (1U << (__guest_bits))
     #define IMSIC_HART_SIZE(__guest_bits)                 \
             (IMSIC_HART_NUM_GUESTS(__guest_bits) * IMSIC_MMIO_PAGE_SZ)

2. Just take the theoretical maximum for an S-mode IMSIC node:
    16,384 * 64 (1 S-mode interrupt file + 63 guest interrupt files) * 4 KiB
   Where:
     16,384 - the maximum possible number of harts according to the AIA spec
     64 - the maximum number of interrupt files per hart for an S-mode IMSIC node:
          1 S-mode interrupt file + 63 guest interrupt files.
     4 KiB - the maximum size of one interrupt file.

Which option is preferred?
The specification doesn't seem to mention (or I couldn't find) that all platforms
must calculate the MMIO size in the same way QEMU does. Therefore, it's probably
better to use the approach described in option 2.
On the other hand, option 2 would accept a platform that provides slightly more
space than needed but still less than the theoretical maximum, and I don't think
such a platform should necessarily be considered correct.
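
For concreteness, a minimal sketch of such a size check, placed next to the
existing base_addr check (mmio_harts, the number of harts served by regset i, is
a hypothetical variable here, and IMSIC_HART_SIZE() mirrors the QEMU macro quoted
above):

    /* One 4 KiB interrupt-file page per guest index, per hart. */
    #define IMSIC_HART_SIZE(guest_bits) \
        BIT((guest_bits) + IMSIC_MMIO_PAGE_SHIFT, UL)

    if ( imsic_cfg.mmios[i].size <
         IMSIC_HART_SIZE(imsic_cfg.guest_index_bits) * mmio_harts )
    {
        rc = -EINVAL;
        printk(XENLOG_ERR "%s: MMIO regset %u is too small\n", node->name, i);
        goto imsic_init_err;
    }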


@@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
      return pcpu_info[cpuid].hart_id;
  }
  
+static inline unsigned long hartid_to_cpuid(unsigned long hartid)
+{
+    for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
+    {
+        if ( hartid == cpuid_to_hartid(cpuid) )
+            return cpuid;
+    }
+
+    /* hartid isn't valid for some reason */
+    return NR_CPUS;
+}
Considering the values being returned, why's the function's return type
"unsigned long"?
The mhartid register is MXLEN bits wide, so theoretically hart IDs range from 0 to
2^MXLEN-1, and we could have the same number of Xen CPUIDs. MXLEN is 32 for RV32
and 64 for RV64.
Yet the return value here is always constrained by NR_CPUS, isn't it?
I am not 100% sure that I get your point, but I put NR_CPUS here because of:
  /*
   * tp points to one of these per cpu.
   *
   * hart_id would be valid (no matter which value) if its
   * processor_id field is valid (less than NR_CPUS).
   */
  struct pcpu_info pcpu_info[NR_CPUS] = { [0 ... NR_CPUS - 1] = {
      .processor_id = NR_CPUS,
  }};
As an option, we could use ULONG_MAX instead. Would that be better?
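
For what it's worth, since the result is always bounded by NR_CPUS, a variant with
the narrower return type Jan hints at could look like this sketch:

    static inline unsigned int hartid_to_cpuid(unsigned long hartid)
    {
        for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
            if ( hartid == cpuid_to_hartid(cpuid) )
                return cpuid;

        /* hartid isn't valid for some reason. */
        return NR_CPUS;
    }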

~ Oleksii