
Re: [PATCH] xen/dom0less: Increase guest DTB size for high-vCPU guests


  • To: Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Grygorii Strashko <grygorii_strashko@xxxxxxxx>
  • Date: Wed, 3 Dec 2025 16:32:44 +0200
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>
  • Delivery-date: Wed, 03 Dec 2025 14:33:06 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Oleksandr,

On 03.12.25 13:03, Oleksandr Tyshchenko wrote:


On 02.12.25 23:33, Grygorii Strashko wrote:


Hello Grygorii



On 02.12.25 21:32, Oleksandr Tyshchenko wrote:
Creating a guest with a high vCPU count (e.g., >32) fails because
the guest's device tree buffer (DOMU_DTB_SIZE) overflows during creation.
The FDT nodes for each vCPU quickly exhaust the 4KiB buffer,
causing a guest creation failure.

Increase the buffer size to 16KiB to support guests up to
the MAX_VIRT_CPUS limit (128).

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
---
Noticed when testing the boundary conditions for dom0less guest
creation on Arm64.

Domain configuration:
fdt mknod /chosen domU0
fdt set /chosen/domU0 compatible "xen,domain"
fdt set /chosen/domU0 \#address-cells <0x2>
fdt set /chosen/domU0 \#size-cells <0x2>
fdt set /chosen/domU0 memory <0x0 0x10000 >
fdt set /chosen/domU0 cpus <33>
fdt set /chosen/domU0 vpl011
fdt mknod /chosen/domU0 module@40400000
fdt set /chosen/domU0/module@40400000 compatible "multiboot,kernel" "multiboot,module"
fdt set /chosen/domU0/module@40400000 reg <0x0 0x40400000 0x0 0x16000 >
fdt set /chosen/domU0/module@40400000 bootargs "console=ttyAMA0"

Failure log:
(XEN) Xen dom0less mode detected
(XEN) *** LOADING DOMU cpus=33 memory=0x10000KB ***
(XEN) Loading d1 kernel from boot module @ 0000000040400000
(XEN) Allocating mappings totalling 64MB for d1:
(XEN) d1 BANK[0] 0x00000040000000-0x00000044000000 (64MB)
(XEN) Device tree generation failed (-22).
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Could not set up domain domU0 (rc = -22)
(XEN) ****************************************
---
---
   xen/common/device-tree/dom0less-build.c | 8 +++++---
   1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/xen/common/device-tree/dom0less-build.c b/xen/common/device-tree/dom0less-build.c
index 3f5b987ed8..d7d0a47b97 100644
--- a/xen/common/device-tree/dom0less-build.c
+++ b/xen/common/device-tree/dom0less-build.c
@@ -461,10 +461,12 @@ static int __init domain_handle_dtb_boot_module(struct domain *d,
   /*
    * The max size for DT is 2MB. However, the generated DT is small (not including
- * domU passthrough DT nodes whose size we account separately), 4KB are enough
- * for now, but we might have to increase it in the future.
+ * domU passthrough DT nodes whose size we account separately). The size is
+ * primarily driven by the number of vCPU nodes. The previous 4KiB buffer was
+ * insufficient for guests with high vCPU counts, so it has been increased
+ * to support up to the MAX_VIRT_CPUS limit (128).
    */
-#define DOMU_DTB_SIZE 4096
+#define DOMU_DTB_SIZE (4096 * 4)

Maybe it wants a Kconfig option?
Or some formula which accounts for MAX_VIRT_CPUS?


I agree that using a formula that accounts for MAX_VIRT_CPUS is the most
robust approach.

Here is the empirical data, obtained by testing with the maximum number of
device tree nodes (e.g., hypervisor and reserved-memory nodes) and with all
optional CPU properties enabled (e.g., clock-frequency):

cpus=1
(XEN) Final compacted FDT size is: 1586 bytes

cpus=2
(XEN) Final compacted FDT size is: 1698 bytes

cpus=32
(XEN) Final compacted FDT size is: 5058 bytes

cpus=128
(XEN) Final compacted FDT size is: 15810 bytes


static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
   {
       int addrcells, sizecells;
@@ -569,6 +569,8 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
       if ( ret < 0 )
           goto err;

+    printk("Final compacted FDT size is: %d bytes\n", fdt_totalsize(kinfo->fdt));
+
       return 0;

     err:

This data shows (assuming my testing/calculations are correct):

- A marginal cost of 112 bytes per vCPU in the final, compacted device tree.
- A fixed base size of 1474 bytes for all non-vCPU content.
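
As a cross-check of the two figures above, here is a minimal sketch that
evaluates the linear model (base + per-vCPU cost) against the four measured
sizes. The helper and constant names are hypothetical and exist only for this
illustration; they are not part of the Xen tree.

```c
#include <stddef.h>

/* Assumed values, read off the measurements quoted above. */
#define FDT_BASE_BYTES     1474u  /* all non-vCPU content */
#define FDT_BYTES_PER_VCPU 112u   /* marginal cost of one vCPU node */

/* Hypothetical helper: predicted compacted FDT size for nr_vcpus vCPUs. */
static unsigned int predicted_fdt_size(unsigned int nr_vcpus)
{
    return FDT_BASE_BYTES + nr_vcpus * FDT_BYTES_PER_VCPU;
}

/* Returns 1 if the model reproduces every measured size, 0 otherwise. */
static int model_matches_measurements(void)
{
    static const struct {
        unsigned int cpus;
        unsigned int bytes;
    } measured[] = {
        { 1, 1586 }, { 2, 1698 }, { 32, 5058 }, { 128, 15810 },
    };
    size_t i;

    for ( i = 0; i < sizeof(measured) / sizeof(measured[0]); i++ )
        if ( predicted_fdt_size(measured[i].cpus) != measured[i].bytes )
            return 0;

    return 1;
}
```

All four data points fit the model exactly (1474 + 128 * 112 = 15810), so
within this test configuration the growth is linear in the vCPU count.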

Thanks for the detailed analysis and info.


Based on that, I would propose the following formula, with its justification:

/*
   * The size is calculated from a fixed baseline plus a scalable
   * portion for each potential vCPU node up to the system limit
   * (MAX_VIRT_CPUS), as the vCPU nodes are the primary consumer
   * of space.
   *
   * The baseline of 2KiB is a safe buffer for all non-vCPU FDT
   * content. The 128 bytes per vCPU is derived from a worst-case
   * analysis of the FDT construction-time size for a single
   * vCPU node.
   */
#define DOMU_DTB_SIZE (2048 + (MAX_VIRT_CPUS * 128))

**********************************************

Please tell me whether you would be happy with that.

It looks OK. One thing I worry about: should it be Xen page-aligned?
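
To make the alignment question concrete, here is a sketch only, assuming a
4KiB page size and MAX_VIRT_CPUS = 128. With those values the proposed formula
gives 18432 bytes, which is 4.5 pages, so it is not page-aligned as written.
The ASSUMED_PAGE_SIZE and DOMU_DTB_SIZE_RAW names and the local ROUNDUP
definition are stand-ins for this example; a real patch would use whatever the
Xen headers already provide.

```c
/* Stand-in values for illustration; in Xen these come from its own headers. */
#define MAX_VIRT_CPUS     128u
#define ASSUMED_PAGE_SIZE 4096u

/* Round x up to a multiple of a; a must be a power of two. */
#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* The formula proposed above, before any alignment. */
#define DOMU_DTB_SIZE_RAW (2048u + (MAX_VIRT_CPUS * 128u))

/* Page-aligned variant of the same formula. */
#define DOMU_DTB_SIZE     ROUNDUP(DOMU_DTB_SIZE_RAW, ASSUMED_PAGE_SIZE)

_Static_assert(DOMU_DTB_SIZE_RAW == 18432u, "2KiB baseline + 128 * 128 bytes");
_Static_assert(DOMU_DTB_SIZE % ASSUMED_PAGE_SIZE == 0, "size is page-aligned");
_Static_assert(DOMU_DTB_SIZE >= 15810u, "covers the measured 128-vCPU FDT");
```

Under these assumptions the rounded size is 20480 bytes (five pages), which
still leaves headroom above the largest measured compacted size of 15810 bytes.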

--
Best regards,
-grygorii




 

