[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 3/3] x86/irq: forward pending interrupts to new destination in fixup_irqs()



On Mon, Jun 17, 2024 at 03:41:12PM +0200, Jan Beulich wrote:
> On 13.06.2024 18:56, Roger Pau Monne wrote:
> > fixup_irqs() is used to evacuate interrupts from to be offlined CPUs.  Given
> > the CPU is to become offline, the normal migration logic used by Xen where 
> > the
> > vector in the previous target(s) is left configured until the interrupt is
> > received on the new destination is not suitable.
> > 
> > Instead attempt to do as much as possible in order to prevent loosing
> > interrupts.  If fixup_irqs() is called from the CPU to be offlined (as is
> > currently the case)
> 
> Except (again) for smp_send_stop().

I guess I haven't worded this properly, the point I was trying to make
is that in the context of a CPU unplug fixup_irqs() is always called
from the CPU that's going offline.

What about:

"If fixup_irqs() is called from the CPU to be offlined (as is
currently the case for CPU hot unplug) ..."

> > attempt to forward pending vectors when interrupts that
> > target the current CPU are migrated to a different destination.
> > 
> > Additionally, for interrupts that have already been moved from the current 
> > CPU
> > prior to the call to fixup_irqs() but that haven't been delivered to the new
> > destination (iow: interrupts with move_in_progress set and the current CPU 
> > set
> > in ->arch.old_cpu_mask) also check whether the previous vector is pending 
> > and
> > forward it to the new destination.
> > 
> > This allows us to remove the window with interrupts enabled at the bottom of
> > fixup_irqs().  Such window wasn't safe anyway: references to the CPU to 
> > become
> > offline are removed from interrupts masks, but the per-CPU vector_irq[] 
> > array
> > is not updated to reflect those changes (as the CPU is going offline 
> > anyway).
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >[...]
> > @@ -2686,11 +2705,27 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
> >          if ( desc->handler->disable )
> >              desc->handler->disable(desc);
> >  
> > +        /*
> > +         * If the current CPU is going offline and is (one of) the 
> > target(s) of
> > +         * the interrupt, signal to check whether there are any pending 
> > vectors
> > +         * to be handled in the local APIC after the interrupt has been 
> > moved.
> > +         */
> > +        if ( !cpu_online(cpu) && cpumask_test_cpu(cpu, 
> > desc->arch.cpu_mask) )
> > +            check_irr = true;
> > +
> >          if ( desc->handler->set_affinity )
> >              desc->handler->set_affinity(desc, affinity);
> >          else if ( !(warned++) )
> >              set_affinity = false;
> >  
> > +        if ( check_irr && apic_irr_read(vector) )
> > +            /*
> > +             * Forward pending interrupt to the new destination, this CPU 
> > is
> > +             * going offline and otherwise the interrupt would be lost.
> > +             */
> > +            send_IPI_mask(cpumask_of(cpumask_any(desc->arch.cpu_mask)),
> > +                          desc->arch.vector);
> 
> Hmm, IRR may become set right after the IRR read (unlike in the other cases,
> where new IRQs ought to be surfacing only at the new destination). Doesn't
> this want moving ...
> 
> >          if ( desc->handler->enable )
> >              desc->handler->enable(desc);
> 
> ... past the actual affinity change?

Hm, but the ->enable() hook is just unmasking the interrupt, the
actual affinity change is done in ->set_affinity(), and hence after
the call to ->set_affinity() no further interrupts should be delivered
to the CPU regardless of whether the source is masked?

Or is it possible for the device/interrupt controller to not switch to
use the new destination until the interrupt is unmasked, and hence
could have pending masked interrupts still using the old destination?
IIRC For MSI-X it's required that the device updates the destination
target once the entry is unmasked.

I'm happy to move it after the ->enable() hook, but would like to
better understand why this is required.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.