
Re: [PATCH RFC DO-NOT-APPLY] x86/IRQ: don't pass offline CPU(s) to on_selected_cpus()



On Tue, Feb 11, 2025 at 10:38:41AM +0100, Jan Beulich wrote:
> on_selected_cpus() asserts that it's only passed online CPUs. Between
> __cpu_disable() removing a CPU from the online map and fixup_eoi()
> cleaning up for the CPU being offlined there's a window, though, where
> the CPU in question can still appear in an action's cpu_eoi_map. Remove
> offline CPUs, as they're (supposed to be) taken care of by fixup_eoi().
> 
> Move the entire tail of desc_guest_eoi() into a new helper, thus
> streamlining processinf also for other call sites when the sole
               ^ processing

> remaining CPU is the local one. Move and split the assertion, to be a
> functionally equivalent replacement for the earlier BUG_ON() in
> __pirq_guest_unbind(). Note that we can't use scratch_cpumask there, in
> particular in the context of a timer handler and with data held there
> needing to stay intact across process_pending_softirqs().

I would probably add the crash backtrace to the commit message.

> 
> Reported-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> While this builds okay, for now I didn't even consider testing it: Both
> aspects below (plus the ugly locking arrangement) speak against dealing
> with the issue this way, imo. Cc-ing REST rather than just x86 for this
> reason.

Indeed, the locking is especially problematic here, both for being
ugly, and because I assume EOI is a hot path.

> RFC: Don't we need {get,put}_cpu_maps() around quite a few more uses of
>      cpu_online_map (e.g. _clear_irq_vector() when called from
>      destroy_irq(), or about every caller of smp_call_function())?

Indeed, that's my understanding also.  Quite a lot of users of
cpu_online_map likely require wrapping in {get,put}_cpu_maps()
calls, otherwise they very likely run into TOCTOU races.
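
To make that concrete, a minimal sketch of what such wrapping could
look like (the caller and handler here are hypothetical; the
get_cpu_maps() loop mirrors what the patch below does):

static void cf_check some_handler(void *data)  /* hypothetical */
{
}

/*
 * Hypothetical example: snapshot the CPU mask only while the CPU maps
 * are held, so no CPU can be (un)plugged between the online check and
 * the cross-CPU call.  get_cpu_maps() is a trylock, hence the loop.
 */
static void cross_cpu_call(struct irq_desc *desc)  /* hypothetical */
{
    cpumask_t mask;

    while ( !get_cpu_maps() )
        process_pending_softirqs();

    cpumask_and(&mask, desc->affinity, &cpu_online_map);
    if ( !cpumask_empty(&mask) )
        on_selected_cpus(&mask, some_handler, desc, 0);

    put_cpu_maps();
}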

>      Roger
>      suggests using stop-machine context for CPU {on,off}lining, but I
>      seem to vaguely recall that there was a reason not to do so at
>      least for the offlining part.

I would have to explore this more thoroughly; it does seem feasible at
first sight, but the devil (and the bugs) is in the details.
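
For reference, the building block is already there
(xen/include/xen/stop_machine.h), if I'm remembering the interface
right:

/*
 * Rendezvous all online CPUs in softirq context and run fn(data) on
 * the requested CPU; nothing else runs concurrently, so nobody can
 * observe the online map mid-update.
 */
int stop_machine_run(int (*fn)(void *data), void *data, unsigned int cpu);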

I don't think we want to go to the lengths of what you do below for
quite a lot of users of cpu_online_map.

> 
> RFC: I doubt process_pending_softirqs() is okay to use from a timer
>      handler. (Callers of, in particular, {desc,pirq}_guest_eoi() would
>      also need checking for process_pending_softirqs() to be okay to use
>      in their contexts.)

Yeah, I would be wary of doing softirq processing from this context,
all the more so because (even if not the common case) I would be
afraid of the unbounded latency that process_pending_softirqs() could
introduce.

> 
> --- unstable.orig/xen/arch/x86/irq.c  2021-04-08 11:10:47.000000000 +0200
> +++ unstable/xen/arch/x86/irq.c       2025-02-11 09:54:38.532567287 +0100
> @@ -6,6 +6,7 @@
>   */
>  
>  #include <xen/init.h>
> +#include <xen/cpu.h>
>  #include <xen/delay.h>
>  #include <xen/errno.h>
>  #include <xen/event.h>
> @@ -1183,7 +1184,7 @@ static inline void clear_pirq_eoi(struct
>      }
>  }
>  
> -static void cf_check set_eoi_ready(void *data);
> +static bool invoke_set_eoi_ready(struct irq_desc *desc, bool wait);
>  
>  static void cf_check irq_guest_eoi_timer_fn(void *data)
>  {
> @@ -1224,18 +1225,13 @@ static void cf_check irq_guest_eoi_timer
>  
>      switch ( action->ack_type )
>      {
> -        cpumask_t *cpu_eoi_map;
> -
>      case ACKTYPE_UNMASK:
>          if ( desc->handler->end )
>              desc->handler->end(desc, 0);
>          break;
>  
>      case ACKTYPE_EOI:
> -        cpu_eoi_map = this_cpu(scratch_cpumask);
> -        cpumask_copy(cpu_eoi_map, action->cpu_eoi_map);
> -        spin_unlock_irq(&desc->lock);
> -        on_selected_cpus(cpu_eoi_map, set_eoi_ready, desc, 0);
> +        invoke_set_eoi_ready(desc, false);
>          return;
>      }
>  
> @@ -1468,6 +1464,49 @@ static void cf_check set_eoi_ready(void
>      flush_ready_eoi();
>  }
>  
> +/*
> + * Locking is somewhat special here: In all cases the function is invoked
> + * with desc's lock held.  When @wait is true, the function tries to avoid
> + * unlocking.  It always returns whether the lock is still held.
> + */
> +static bool invoke_set_eoi_ready(struct irq_desc *desc, bool wait)
> +{
> +    const irq_guest_action_t *action = guest_action(desc);
> +    cpumask_t cpu_eoi_map;
> +
> +    cpumask_copy(&cpu_eoi_map, action->cpu_eoi_map);
> +
> +    if ( __cpumask_test_and_clear_cpu(smp_processor_id(), &cpu_eoi_map) )
> +    {
> +        ASSERT(action->ack_type == ACKTYPE_EOI);
> +        __set_eoi_ready(desc);
> +        spin_unlock(&desc->lock);
> +        flush_ready_eoi();
> +        local_irq_enable();
> +    }
> +    else if ( wait && cpumask_empty(&cpu_eoi_map) )
> +        return true;
> +    else
> +    {
> +        ASSERT(action->ack_type == ACKTYPE_EOI);
> +        spin_unlock_irq(&desc->lock);
> +    }
> +
> +    if ( cpumask_empty(&cpu_eoi_map) )
> +        return false;
> +
> +    while ( !get_cpu_maps() )
> +        process_pending_softirqs();
> +
> +    cpumask_and(&cpu_eoi_map, &cpu_eoi_map, &cpu_online_map);
> +    if ( !cpumask_empty(&cpu_eoi_map) )
> +        on_selected_cpus(&cpu_eoi_map, set_eoi_ready, desc, wait);
> +
> +    put_cpu_maps();
> +
> +    return false;
> +}
> +
>  void pirq_guest_eoi(struct pirq *pirq)

Possibly for pirq_guest_eoi(), when called from the PHYSDEVOP_eoi
hypercall, we should propagate whether the EOI has been performed, and
otherwise use a hypercall continuation to retry?
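
Something along these lines, as a rough sketch only (today
pirq_guest_eoi() returns void; the bool return used here is
hypothetical):

/*
 * Hypothetical: have pirq_guest_eoi() report whether the EOI was
 * actually performed, and replay the hypercall via a continuation
 * when it wasn't.
 */
static long physdev_eoi_done(struct pirq *pirq,
                             XEN_GUEST_HANDLE_PARAM(void) arg)
{
    if ( pirq_guest_eoi(pirq) )  /* hypothetical bool return */
        return 0;

    return hypercall_create_continuation(__HYPERVISOR_physdev_op, "ih",
                                         PHYSDEVOP_eoi, arg);
}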

As said above however, this approach is so ugly, and would require
similar adjustments in so many other places that also use
cpu_online_map, that I would only consider going this route as a last
resort.

Thanks, Roger.



 

