Re: [PATCH for-4.21 01/10] x86/HPET: limit channel changes
On Thu, Oct 16, 2025 at 01:47:38PM +0200, Jan Beulich wrote:
> On 16.10.2025 12:24, Roger Pau Monné wrote:
> > On Thu, Oct 16, 2025 at 09:31:21AM +0200, Jan Beulich wrote:
> >> Despite 1db7829e5657 ("x86/hpet: do local APIC EOI after interrupt
> >> processing") we can still observe nested invocations of
> >> hpet_interrupt_handler(). This is, afaict, a result of previously used
> >> channels retaining their IRQ affinity until some other CPU re-uses them.
> >
> > But the underlying problem here is not so much the affinity itself,
> > but the fact that the channel is not stopped after firing?
>
> (when being detached, that is) That's the main problem here, yes. A minor
> benefit is to avoid the MMIO write in hpet_msi_set_affinity(). See also
> below.
>
> Further, even when masking while detaching, the issue would re-surface
> after unmasking; it's just that the window then is smaller.

Yeah, it could trigger after unmasking, but the window is smaller there,
as after enabling, the comparator will get updated to the new deadline.

> >> @@ -454,9 +456,21 @@ static struct hpet_event_channel *hpet_g
> >>      if ( num_hpets_used >= nr_cpu_ids )
> >>          return &hpet_events[cpu];
> >>
> >> +    /*
> >> +     * Try the least recently used channel first. It may still have its IRQ's
> >> +     * affinity set to the desired CPU. This way we also limit having multiple
> >> +     * of our IRQs raised on the same CPU, in possibly a nested manner.
> >> +     */
> >> +    ch = per_cpu(lru_channel, cpu);
> >> +    if ( ch && !test_and_set_bit(HPET_EVT_USED_BIT, &ch->flags) )
> >> +    {
> >> +        ch->cpu = cpu;
> >> +        return ch;
> >> +    }
> >> +
> >> +    /* Then look for an unused channel. */
> >>      next = arch_fetch_and_add(&next_channel, 1) % num_hpets_used;
> >>
> >> -    /* try unused channel first */
> >>      for ( i = next; i < next + num_hpets_used; i++ )
> >>      {
> >>          ch = &hpet_events[i % num_hpets_used];
> >> @@ -479,6 +493,8 @@ static void set_channel_irq_affinity(str
> >>  {
> >>      struct irq_desc *desc = irq_to_desc(ch->msi.irq);
> >>
> >> +    per_cpu(lru_channel, ch->cpu) = ch;
> >> +
> >>      ASSERT(!local_irq_is_enabled());
> >>      spin_lock(&desc->lock);
> >>      hpet_msi_mask(desc);
> >
> > Maybe I'm missing the point here, but you are resetting the MSI
> > affinity anyway here, so there isn't much point in attempting to
> > re-use the same channel when Xen still unconditionally goes through the
> > process of setting the affinity anyway?
>
> While still using normal IRQs, there's still a benefit: We can re-use the
> same vector (as staying on the same CPU), and hence we save an IRQ
> migration (being the main source of nested IRQs according to my
> observations).

Hm, I see.  You short-circuit all the logic in _assign_irq_vector().

> We could actually do even better, by avoiding the mask/unmask pair there,
> which would avoid triggering the "immediate" IRQ that I (for now) see as
> the only explanation of the large amount of "early" IRQs that I observe
> on (at least) Intel hardware. That would require doing the msg.dest32
> check earlier, but otherwise looks feasible. (Actually, the unmask would
> still be necessary, in case we're called with the channel already masked.)

Checking with .dest32 seems a bit crude; I would possibly prefer to
slightly modify hpet_attach_channel() to notice when ch->cpu == cpu and
avoid the call to set_channel_irq_affinity()?

Thanks, Roger.
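
For illustration, the alternative floated at the end could look roughly
like the standalone sketch below: remember which CPU the channel's MSI
was last programmed for and skip set_channel_irq_affinity() when it is
unchanged. The struct layout, the last_cpu field, the stub functions and
main() are invented for this example and do not correspond to Xen's
actual hpet.c.

/*
 * Standalone, compilable sketch only; names and structure are
 * illustrative and do not match Xen's implementation.
 */
#include <stdio.h>

struct hpet_channel {
    unsigned int cpu;       /* CPU the channel is attached to */
    unsigned int last_cpu;  /* CPU the MSI destination was last set for */
};

/* Stand-in for set_channel_irq_affinity(): re-programs the MSI. */
static void set_channel_irq_affinity(struct hpet_channel *ch)
{
    printf("re-programming MSI affinity to CPU%u\n", ch->cpu);
    ch->last_cpu = ch->cpu;
}

/* Stand-in for hpet_attach_channel() with the suggested short-circuit. */
static void attach_channel(unsigned int cpu, struct hpet_channel *ch)
{
    ch->cpu = cpu;

    /*
     * If the MSI already targets this CPU (e.g. the channel was re-used
     * via the per-CPU LRU pointer), skip the affinity update and with it
     * the mask/unmask pair.
     */
    if ( ch->last_cpu == cpu )
    {
        printf("CPU%u: affinity unchanged, nothing to do\n", cpu);
        return;
    }

    set_channel_irq_affinity(ch);
}

int main(void)
{
    struct hpet_channel ch = { .cpu = 0, .last_cpu = 0 };

    attach_channel(1, &ch);  /* affinity moves: MSI gets re-programmed */
    attach_channel(1, &ch);  /* same CPU again: update skipped */
    return 0;
}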