[PATCH v1] xen/sched/null: avoid crash after failed domU creation


  • To: <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>
  • Date: Mon, 1 May 2023 16:30:46 -0400
  • Cc: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, "Juergen Gross" <jgross@xxxxxxxx>
  • Delivery-date: Mon, 01 May 2023 20:31:13 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

When domU creation fails, there is a corner case that may lead to a crash in
the null scheduler when running a debug build of Xen.

(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'npc->unit == unit' failed at common/sched/null.c:379
(XEN) ****************************************

The events leading to the crash are:

* null_unit_insert() was invoked with the unit offline. Since the unit was
  offline, unit_assign() was not called, and null_unit_insert() returned.
* Later during domain creation, the unit was onlined.
* Eventually, domain creation failed due to bad configuration.
* null_unit_remove() was invoked with the unit still online. Since the unit was
  online, it called unit_deassign() and triggered an ASSERT.

To fix this, only call unit_deassign() when npc->unit is non-NULL in
null_unit_remove().
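
The sequence above can be replayed in a small standalone model. This is only a
sketch for illustration, not Xen code: the toy_* types and functions are
hypothetical stand-ins, and the failed ASSERT is modeled as a false return
value rather than a panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the scheduler state; NOT the real Xen structures. */
struct toy_unit {
    bool online;
};

struct toy_pcpu {
    struct toy_unit *unit;   /* unit assigned to this pCPU, or NULL */
};

/* Insert path as described above: an offline unit is not assigned. */
static void toy_unit_insert(struct toy_pcpu *npc, struct toy_unit *unit)
{
    assert(npc != NULL && unit != NULL);

    if ( !unit->online )
        return;

    npc->unit = unit;
}

/* Returns false where the real code would hit ASSERT(npc->unit == unit). */
static bool toy_unit_deassign(struct toy_pcpu *npc, struct toy_unit *unit)
{
    if ( npc->unit != unit )
        return false;

    npc->unit = NULL;
    return true;
}

/*
 * Remove path. With guarded == true, deassign is skipped when the pCPU has
 * no unit assigned, which models the check added by this patch.
 */
static bool toy_unit_remove(struct toy_pcpu *npc, struct toy_unit *unit,
                            bool guarded)
{
    if ( !unit->online )
        return true;

    if ( guarded && npc->unit == NULL )
        return true;

    return toy_unit_deassign(npc, unit);
}

/* Replays the four events; returns true if no assertion would have fired. */
static bool toy_replay_failed_creation(bool guarded)
{
    struct toy_pcpu npc = { .unit = NULL };
    struct toy_unit unit = { .online = false };

    toy_unit_insert(&npc, &unit);   /* offline: nothing is assigned */
    unit.online = true;             /* unit is onlined later */
    /* domain creation fails here, so the unit is removed while online */
    return toy_unit_remove(&npc, &unit, guarded);
}
```

In this model, toy_replay_failed_creation(false) returns false (the unguarded
deassign finds npc->unit != unit), while toy_replay_failed_creation(true)
returns true because the deassign is skipped.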

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>
---
RFC->v1
* Follow Juergen's suggested fix

Link to RFC [1]

[1] https://lists.xenproject.org/archives/html/xen-devel/2023-04/msg01387.html
---
 xen/common/sched/null.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/xen/common/sched/null.c b/xen/common/sched/null.c
index 65a0a6c5312d..2091337fcd06 100644
--- a/xen/common/sched/null.c
+++ b/xen/common/sched/null.c
@@ -522,6 +522,8 @@ static void cf_check null_unit_remove(
 {
     struct null_private *prv = null_priv(ops);
     struct null_unit *nvc = null_unit(unit);
+    struct null_pcpu *npc;
+    unsigned int cpu;
     spinlock_t *lock;
 
     ASSERT(!is_idle_unit(unit));
@@ -531,8 +533,6 @@ static void cf_check null_unit_remove(
     /* If offline, the unit shouldn't be assigned, nor in the waitqueue */
     if ( unlikely(!is_unit_online(unit)) )
     {
-        struct null_pcpu *npc;
-
         npc = unit->res->sched_priv;
         ASSERT(npc->unit != unit);
         ASSERT(list_empty(&nvc->waitq_elem));
@@ -549,7 +549,10 @@ static void cf_check null_unit_remove(
         goto out;
     }
 
-    unit_deassign(prv, unit);
+    cpu = sched_unit_master(unit);
+    npc = get_sched_res(cpu)->sched_priv;
+    if ( npc->unit )
+        unit_deassign(prv, unit);
 
  out:
     unit_schedule_unlock_irq(lock, unit);
-- 
2.40.1




 

