[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] xen/sched: rtds: re-arm repl_timer after timer re-initialization


  • To: Jürgen Groß <jgross@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Oleksii Moisieiev <oleksii_moisieiev@xxxxxxxx>
  • Date: Mon, 4 May 2026 16:27:02 +0300
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=0pO07kbOLrBNf+2yh/pHl7wgNCy7tM4/oLXNW8eAzfo=; b=HnrktoUAKy6Ri1HhdJA1JwkkgqMwd+5iSps456MVgXJEU96s0WIQqGZ+4JA0dlK070CqbnZKV7yTZD4j5QrfTnVb7702mJejK2f+aCzvxi/xygpqYQ0G+Xn2fmt5a32xgoFcHxJ218rlTULyZUa7CU30g9juoJnb/by3czsb+C4Z8faKfEX/QSouAt4Zoxzx2rXJ+YFaL81CvtkWccDw99NAbbnYIRV6AbxALPQczRKMZqdCjYJ5SZdwMBXk9vnj6EF9x8yTQIeR0kGcjh0ZnJTFuo0Wfb76TWscHHxCERJ/HCg7qElbStew4nr+ESMf1HIUrIPD2RO06vIK6mi3dg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=eSdQmT42pIeuHixXFF/RQ9AdONHNB/TMmsww45CThuduq5CS9+lCSf/31ZmNjs2T1LISVgiQktBvS6E6UfW2EKcTZL0Tdlp0ZZTULWg6D+1atPOBFusYmf1FPPQJcBwiU3taheLbYzDqu7ZwARaCboFkiraUvLhBRlXsGpRfMooMAstI140XXUpGP1AYOS18MaIhyOx9fyEUQ3fDonH+k+asdjp+qM4c+vyfcdc3zyqfb2d3k/yvxTs5TrJfJARKyJFx2MoNYliPKryD4Das04Gu0IEQPirxUjvljKOHdwf5eZp8aSnYkOXWaQkrnDZ/NWEMIlFbS+F1ZuKMavVWTQ==
  • Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=selector1 header.d=epam.com header.i="@epam.com" header.h="From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck"
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=epam.com;
  • Cc: Dario Faggioli <dfaggioli@xxxxxxxx>, George Dunlap <gwd@xxxxxxxxxxxxxx>, Meng Xu <mengxu@xxxxxxxxxxxxx>
  • Delivery-date: Mon, 04 May 2026 13:27:33 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


On 10/04/2026 18:19, Jürgen Groß wrote:
On 10.04.26 17:12, Oleksii Moisieiev wrote:

On 10/04/2026 15:16, Juergen Gross wrote:
On 10.04.26 14:13, Juergen Gross wrote:
On 10.04.26 14:04, Oleksii Moisieiev wrote:
Hi Juergen,

During our safety certification analysis work, we identified this as a potential issue. While we haven't encountered this problem in practice yet, it could occur
in the future, so I believe it should be addressed proactively.

For being able to occur in future, the handling of removing a cpu from a cpupool would need to be changed. Considering the refusal to remove the
last cpu from a populated cpupool is on purpose (this avoids leaving a
domain without any cpu to run on), adding the code as you suggest would
just be an addition without any benefit.

It isn't doing any harm (other than adding code without purpose), so I
won't explicitly NAK the patch, but I won't Ack it either.

One further remark: I would ack the addition of an ASSERT(list_empty(replq))
instead of the conditional set_timer() call.

You're right: with the current cpupool semantics, when the timer is re- initialized in this path, replq is expected to be empty. In that case there is nothing to re-arm, and the timer can be programmed later when a new replenishment event is queued.

Now I see that it would probably be better to update the cpupool logic to prohibit removing the last pCPU from a cpupool. In that case, this fix — even with the ASSERT — seems to be no longer relevant.

I think I'd rather post an update for the cpupool semantics and drop this patch. Or I can send a v3 with the ASSERT if you think that is still reasonable.

The cpupool semantics are already existing. I have written it this way when I
introduced cpupools.


Juergen

Hi Juergen,

You're right, thanks for the pointer. I went back and re-checked
cpupool_unassign_cpu_start() in xen/common/sched/cpupool.c and the guard
is indeed already there: when n_dom > 0 and the cpu being removed is the
last one in c->cpu_valid, all domains are moved to cpupool0 first, and
the call returns -EBUSY if any domain is still alive while the system is
active. So by the time the last RTDS pCPU is actually unassigned, no
units remain in the pool and replq is guaranteed empty when
rt_switch_sched() later re-initializes repl_timer.

That makes the conditional set_timer() in v2 unreachable under the
current (and intended) cpupool semantics, so there is no need to touch
cpupool either.

I'll send a v3 that replaces the conditional set_timer() with
ASSERT(list_empty(replq))

--

Oleksii




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.