
Re: [PATCH for-4.21 01/10] x86/HPET: limit channel changes


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 16 Oct 2025 12:24:35 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
  • Delivery-date: Thu, 16 Oct 2025 10:24:48 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Oct 16, 2025 at 09:31:21AM +0200, Jan Beulich wrote:
> Despite 1db7829e5657 ("x86/hpet: do local APIC EOI after interrupt
> processing") we can still observe nested invocations of
> hpet_interrupt_handler(). This is, afaict, a result of previously used
> channels retaining their IRQ affinity until some other CPU re-uses them.

But the underlying problem here is not so much the affinity itself,
but the fact that the channel is not stopped after firing?

> Such nesting is increasingly problematic with higher CPU counts, as both
> handle_hpet_broadcast() and cpumask_raise_softirq() have a cpumask_t local
> variable. IOW already a single level of nesting may require more stack
> space (2 times above 4k) than we have available (8k), when NR_CPUS=16383
> (the maximum value presently possible).
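
For concreteness, the numbers work out as follows (a standalone back-of-the-envelope sketch; the constants are taken from the paragraph above rather than from the Xen headers, and sizeof(unsigned long) == 8 is assumed for x86-64):

/* One cpumask_t is essentially a bitmap of NR_CPUS bits, rounded up to
 * unsigned longs. */
#include <assert.h>

#define NR_CPUS        16383U
#define BITS_PER_LONG  64U
#define CPUMASK_BYTES  (((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG) * 8U)

static_assert(CPUMASK_BYTES == 2048, "one cpumask_t is 2k");
/* handle_hpet_broadcast() and cpumask_raise_softirq() each keep one
 * cpumask_t on the stack, so the masks alone cost 4k per invocation ... */
static_assert(2 * CPUMASK_BYTES == 4096, "two cpumask_t locals per invocation");
/* ... and a single level of nesting doubles that, filling the 8k stack
 * before any ordinary frame overhead is even counted. */
static_assert(2 * 2 * CPUMASK_BYTES == 8192, "nested once: masks alone fill 8k");

int main(void) { return 0; }
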
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> Whether this is still worthwhile with "x86/HPET: use single, global, low-
> priority vector for broadcast IRQ" isn't quite clear to me.
> 
> --- a/xen/arch/x86/hpet.c
> +++ b/xen/arch/x86/hpet.c
> @@ -442,6 +442,8 @@ static void __init hpet_fsb_cap_lookup(v
>             num_hpets_used, num_chs);
>  }
>  
> +static DEFINE_PER_CPU(struct hpet_event_channel *, lru_channel);
> +
>  static struct hpet_event_channel *hpet_get_channel(unsigned int cpu)
>  {
>      static unsigned int next_channel;
> @@ -454,9 +456,21 @@ static struct hpet_event_channel *hpet_g
>      if ( num_hpets_used >= nr_cpu_ids )
>          return &hpet_events[cpu];
>  
> +    /*
> +     * Try the least recently used channel first.  It may still have its IRQ's
> +     * affinity set to the desired CPU.  This way we also limit having multiple
> +     * of our IRQs raised on the same CPU, in possibly a nested manner.
> +     */
> +    ch = per_cpu(lru_channel, cpu);
> +    if ( ch && !test_and_set_bit(HPET_EVT_USED_BIT, &ch->flags) )
> +    {
> +        ch->cpu = cpu;
> +        return ch;
> +    }
> +
> +    /* Then look for an unused channel. */
>      next = arch_fetch_and_add(&next_channel, 1) % num_hpets_used;
>  
> -    /* try unused channel first */
>      for ( i = next; i < next + num_hpets_used; i++ )
>      {
>          ch = &hpet_events[i % num_hpets_used];
> @@ -479,6 +493,8 @@ static void set_channel_irq_affinity(str
>  {
>      struct irq_desc *desc = irq_to_desc(ch->msi.irq);
>  
> +    per_cpu(lru_channel, ch->cpu) = ch;
> +
>      ASSERT(!local_irq_is_enabled());
>      spin_lock(&desc->lock);
>      hpet_msi_mask(desc);

Maybe I'm missing the point here, but the MSI affinity is reset here
unconditionally, so there isn't much point in attempting to re-use the
same channel when Xen still goes through the whole process of setting
the affinity anyway?
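
For anyone reading along without the full file, below is a simplified standalone sketch of the allocation scheme the hunks above introduce (plain C; per-CPU variables and HPET_EVT_USED_BIT are mocked with plain arrays and C11 atomics, and recording the hint inside get_channel() rather than in set_channel_irq_affinity() is a deliberate shortcut):

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NUM_CHANNELS 4
#define NR_CPUS      8

struct channel {
    atomic_bool in_use;        /* stands in for HPET_EVT_USED_BIT in ch->flags */
    unsigned int cpu;
};

static struct channel channels[NUM_CHANNELS];
static struct channel *lru_channel[NR_CPUS];   /* per-CPU "least recently used" hint */
static atomic_uint next_channel;

static struct channel *get_channel(unsigned int cpu)
{
    /* 1. Try the channel this CPU used last: its MSI affinity may still point
     *    at this CPU, and re-using it limits piling several of these IRQs,
     *    possibly nested, onto one CPU. */
    struct channel *ch = lru_channel[cpu];

    if ( ch && !atomic_exchange(&ch->in_use, true) )
    {
        ch->cpu = cpu;
        return ch;
    }

    /* 2. Otherwise scan for any unused channel, starting at a rotating
     *    position so all CPUs don't contend on channel 0. */
    unsigned int next = atomic_fetch_add(&next_channel, 1) % NUM_CHANNELS;

    for ( unsigned int i = next; i < next + NUM_CHANNELS; i++ )
    {
        ch = &channels[i % NUM_CHANNELS];
        if ( !atomic_exchange(&ch->in_use, true) )
        {
            ch->cpu = cpu;
            lru_channel[cpu] = ch;    /* remember it for next time */
            return ch;
        }
    }

    return NULL;                      /* every channel is busy */
}

static void put_channel(struct channel *ch)
{
    atomic_store(&ch->in_use, false); /* channel is free, but the hint survives */
}

int main(void)
{
    struct channel *first = get_channel(3);
    put_channel(first);
    struct channel *again = get_channel(3);
    printf("%s\n", first == again ? "re-used LRU channel" : "got a different channel");
    return 0;
}

The question above is about step 1: whether the hint buys anything as long as the affinity is still reprogrammed unconditionally afterwards.
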

Thanks, Roger.