
Xen panic when shutting down ARINC653 cpupool


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: "Choi, Anderson" <Anderson.Choi@xxxxxxxxxx>
  • Date: Mon, 17 Mar 2025 05:07:38 +0000
  • Accept-language: en-US
  • Cc: "nathan.studer@xxxxxxxxxxxxxxx" <nathan.studer@xxxxxxxxxxxxxxx>, "stewart@xxxxxxx" <stewart@xxxxxxx>, "Weber (US), Matthew L" <matthew.l.weber3@xxxxxxxxxx>, "Whitehead (US), Joshua C" <joshua.c.whitehead@xxxxxxxxxx>
  • Delivery-date: Mon, 17 Mar 2025 05:08:21 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: Xen panic when shutting down ARINC653 cpupool

I'd like to report a Xen panic that occurs when shutting down an ARINC653 domain 
with the following setup.
Note that this is only observed when CONFIG_DEBUG is enabled.

[Test environment]
Yocto release : 5.05
Xen release : 4.19 (hash = 026c9fa29716b0ff0f8b7c687908e71ba29cf239)
Target machine : QEMU ARM64
Number of physical CPUs : 4

[Xen config]
CONFIG_DEBUG = y

[CPU pool configuration files]
cpupool_arinc0.cfg
- name= "Pool-arinc0"
- sched="arinc653"
- cpus=["2"]

[Domain configuration file]
dom1.cfg
- vcpus = 1
- pool = "Pool-arinc0"

$ xl cpupool-cpu-remove Pool-0 2
$ xl cpupool-create -f cpupool_arinc0.cfg
$ xl create dom1.cfg
$ a653_sched -P Pool-arinc0 dom1:100

** Wait for DOM1 to complete boot.**

$ xl shutdown dom1

[xen log]
root@boeing-linux-ref:~# xl shutdown dom1
Shutting down domain 1
root@boeing-linux-ref:~# (XEN) Assertion '!in_irq() && (local_irq_is_enabled() 
|| num_online_cpus() <= 1)' failed at common/xmalloc_tlsf.c:714
(XEN) ----[ Xen-4.19.1-pre  arm64  debug=y  Tainted: I      ]----
(XEN) CPU:    2
(XEN) PC:     00000a000022d2b0 xfree+0x130/0x1a4
(XEN) LR:     00000a000022d2a4
(XEN) SP:     00008000fff77b50
(XEN) CPSR:   00000000200002c9 MODE:64-bit EL2h (Hypervisor, handler)
...
(XEN) Xen call trace:
(XEN)    [<00000a000022d2b0>] xfree+0x130/0x1a4 (PC)
(XEN)    [<00000a000022d2a4>] xfree+0x124/0x1a4 (LR)
(XEN)    [<00000a00002321f0>] arinc653.c#a653sched_free_udata+0x50/0xc4
(XEN)    [<00000a0000241bc0>] core.c#sched_move_domain_cleanup+0x5c/0x80
(XEN)    [<00000a0000245328>] sched_move_domain+0x69c/0x70c
(XEN)    [<00000a000022f840>] cpupool.c#cpupool_move_domain_locked+0x38/0x70
(XEN)    [<00000a0000230f20>] cpupool_move_domain+0x34/0x54
(XEN)    [<00000a0000206c40>] domain_kill+0xc0/0x15c
(XEN)    [<00000a000022e0d4>] do_domctl+0x904/0x12ec
(XEN)    [<00000a0000277a1c>] traps.c#do_trap_hypercall+0x1f4/0x288
(XEN)    [<00000a0000279018>] do_trap_guest_sync+0x448/0x63c
(XEN)    [<00000a0000262c80>] entry.o#guest_sync_slowpath+0xa8/0xd8
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 2:
(XEN) Assertion '!in_irq() && (local_irq_is_enabled() || num_online_cpus() <= 
1)' failed at common/xmalloc_tlsf.c:714
(XEN) ****************************************

In commit 19049f8d ("sched: fix locking in a653sched_free_vdata()"), locking was 
introduced to prevent a race against the list manipulation, but it leads to this 
assertion failure when an ARINC 653 domain is shut down, since xfree() is now 
called while the scheduler lock is held with interrupts disabled.

I think this can be fixed by deferring the xfree() call until after 
spin_unlock_irqrestore(), as shown below.

xen/common/sched/arinc653.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
index 7bf288264c..1615f1bc46 100644
--- a/xen/common/sched/arinc653.c
+++ b/xen/common/sched/arinc653.c
@@ -463,10 +463,11 @@ a653sched_free_udata(const struct scheduler *ops, void 
*priv)
     if ( !is_idle_unit(av->unit) )
         list_del(&av->list);

-    xfree(av);
     update_schedule_units(ops);

     spin_unlock_irqrestore(&sched_priv->lock, flags);
+
+    xfree(av);
 }

Could I hear your opinions on this?

Regards,
Anderson



 

