
Re: [PATCH v3 for-4.19 2/3] x86/irq: handle moving interrupts in _assign_irq_vector()



On Mon, Jun 17, 2024 at 03:31:13PM +0200, Jan Beulich wrote:
> On 13.06.2024 18:56, Roger Pau Monne wrote:
> > Currently there's logic in fixup_irqs() that attempts to prevent
> > _assign_irq_vector() from failing, as fixup_irqs() is required to evacuate all
> > interrupts from the CPUs not present in the input mask.  The current logic in
> > fixup_irqs() is incomplete, as it doesn't deal with interrupts that have
> > move_cleanup_count > 0 and a non-empty ->arch.old_cpu_mask field.
> >
> > Instead of attempting to fixup the interrupt descriptor in fixup_irqs() so that
> > _assign_irq_vector() cannot fail, introduce logic in _assign_irq_vector()
> > to deal with interrupts that have either move_{in_progress,cleanup_count} set
> > and no remaining online CPUs in ->arch.cpu_mask.
> >
> > If _assign_irq_vector() is requested to move an interrupt in the state
> > described above, first attempt to see if ->arch.old_cpu_mask contains any valid
> > CPUs that could be used as fallback, and if that's the case do move the
> > interrupt back to the previous destination.  Note this is easier because the
> > vector hasn't been released yet, so there's no need to allocate and setup a new
> > vector on the destination.
> >
> > Due to the logic in fixup_irqs() that clears offline CPUs from
> > ->arch.old_cpu_mask (and releases the old vector if the mask becomes empty) it
> > shouldn't be possible to get into _assign_irq_vector() with
> > ->arch.move_{in_progress,cleanup_count} set but no online CPUs in
> > ->arch.old_cpu_mask.
> >
> > However if ->arch.move_{in_progress,cleanup_count} is set and the interrupt has
> > also changed affinity, it's possible the members of ->arch.old_cpu_mask are no
> > longer part of the affinity set, move the interrupt to a different CPU part of
> > the provided mask and keep the current ->arch.old_{cpu_mask,vector} for the
> > pending interrupt movement to be completed.
> >
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
> 
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -544,7 +544,58 @@ static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
> >      }
> >  
> >      if ( desc->arch.move_in_progress || desc->arch.move_cleanup_count )
> > -        return -EAGAIN;
> > +    {
> > +        /*
> > +         * If the current destination is online refuse to shuffle.  Retry after
> > +         * the in-progress movement has finished.
> > +         */
> > +        if ( cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map) )
> > +            return -EAGAIN;
> > +
> > +        /*
> > +         * Due to the logic in fixup_irqs() that clears offlined CPUs from
> > +         * ->arch.old_cpu_mask it shouldn't be possible to get here with
> > +         * ->arch.move_{in_progress,cleanup_count} set and no online CPUs in
> > +         * ->arch.old_cpu_mask.
> > +         */
> > +        ASSERT(valid_irq_vector(desc->arch.old_vector));
> > +        ASSERT(cpumask_intersects(desc->arch.old_cpu_mask, &cpu_online_map));
> > +
> > +        if ( cpumask_intersects(desc->arch.old_cpu_mask, mask) )
> > +        {
> > +            /*
> > +             * Fallback to the old destination if moving is in progress and the
> > +             * current destination is to be offlined.  This is only possible if
> > +             * the CPUs in old_cpu_mask intersect with the affinity mask passed
> > +             * in the 'mask' parameter.
> > +             */
> > +            desc->arch.vector = desc->arch.old_vector;
> 
> I'm a little puzzled that you use desc->arch.old_vector here, but ...

old_vector can't be used here, as old_vector == desc->arch.vector at
this point.

The name of the variable is IMO a bit misleading, as its value only
becomes the old vector once a new vector is assigned.  It would be
more appropriate to name it current_vector.

> > +            cpumask_and(desc->arch.cpu_mask, desc->arch.old_cpu_mask, mask);
> > +
> > +            /* Undo any possibly done cleanup. */
> > +            for_each_cpu(cpu, desc->arch.cpu_mask)
> > +                per_cpu(vector_irq, cpu)[desc->arch.vector] = irq;
> > +
> > +            /* Cancel the pending move and release the current vector. */
> > +            desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> > +            cpumask_clear(desc->arch.old_cpu_mask);
> > +            desc->arch.move_in_progress = 0;
> > +            desc->arch.move_cleanup_count = 0;
> > +            if ( desc->arch.used_vectors )
> > +            {
> > +                ASSERT(test_bit(old_vector, desc->arch.used_vectors));
> > +                clear_bit(old_vector, desc->arch.used_vectors);
> 
> ... old_vector here. Since we have the latter, uniformly using it might
> be more consistent.

Keep in mind that old_vector is a cache of the value of desc->arch.vector
at the start of the function.
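
To illustrate the relationship (a paraphrased sketch of the flow, not the
exact code in irq.c):

    int old_vector = irq_to_vector(irq);   /* == desc->arch.vector on entry */

    /* ... */

    /* Fallback: revert to the previous destination of the pending move. */
    desc->arch.vector = desc->arch.old_vector;

    /* ... */

    /* Release the vector that was desc->arch.vector on entry. */
    if ( desc->arch.used_vectors )
    {
        ASSERT(test_bit(old_vector, desc->arch.used_vectors));
        clear_bit(old_vector, desc->arch.used_vectors);
    }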

> I realize though that irq_to_vector() has cases where
> it wouldn't return desc->arch.old_vector; I think, however, that in those
> cases we can't make it here. Still I'm not going to insist on making the
> adjustment. Happy to make it though while committing, should you agree.
> 
> Also I'm not happy to see another instance of this pattern appear. In
> x86-specific code this is inefficient, as {set,clear}_bit resolve to the
> same insn as test_and_{set,clear}_bit(). Therefore imo more efficient
> would be
> 
>                 if (!test_and_clear_bit(old_vector, desc->arch.used_vectors))
>                     ASSERT_UNREACHABLE();
> 
> (and then the two if()s folded).
> 
> I've been meaning to propose a patch to the other similar sites ...

Oh, indeed.  IIRC I've copied this pattern from somewhere else without
noticing.  I'm happy for you to fold the test_and_clear_bit() call
into the parent if condition.
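
For reference, the folded form would presumably end up as something like
(untested sketch):

    /* Release the vector and assert that it was marked as in use. */
    if ( desc->arch.used_vectors &&
         !test_and_clear_bit(old_vector, desc->arch.used_vectors) )
        ASSERT_UNREACHABLE();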

Thanks, Roger.