[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 6/7] x86/irq: handle moving interrupts in _assign_irq_vector()



On Thu, Jun 13, 2024 at 10:38:35AM +0200, Jan Beulich wrote:
> On 12.06.2024 17:36, Roger Pau Monné wrote:
> > On Wed, Jun 12, 2024 at 03:42:58PM +0200, Jan Beulich wrote:
> >> On 12.06.2024 12:39, Roger Pau Monné wrote:
> >>> On Tue, Jun 11, 2024 at 03:18:32PM +0200, Jan Beulich wrote:
> >>>> On 10.06.2024 16:20, Roger Pau Monne wrote:
> >>>>> Currently there's logic in fixup_irqs() that attempts to prevent
> >>>>> _assign_irq_vector() from failing, as fixup_irqs() is required to 
> >>>>> evacuate all
> >>>>> interrupts from the CPUs not present in the input mask.  The current 
> >>>>> logic in
> >>>>> fixup_irqs() is incomplete, as it doesn't deal with interrupts that have
> >>>>> move_cleanup_count > 0 and a non-empty ->arch.old_cpu_mask field.
> >>>>>
> >>>>> Instead of attempting to fixup the interrupt descriptor in fixup_irqs() 
> >>>>> so that
> >>>>> _assign_irq_vector() cannot fail, introduce logic in 
> >>>>> _assign_irq_vector()
> >>>>> to deal with interrupts that have either 
> >>>>> move_{in_progress,cleanup_count} set
> >>>>> and no remaining online CPUs in ->arch.cpu_mask.
> >>>>>
> >>>>> If _assign_irq_vector() is requested to move an interrupt in the state
> >>>>> described above, first attempt to see if ->arch.old_cpu_mask contains 
> >>>>> any valid
> >>>>> CPUs that could be used as fallback, and if that's the case do move the
> >>>>> interrupt back to the previous destination.  Note this is easier 
> >>>>> because the
> >>>>> vector hasn't been released yet, so there's no need to allocate and 
> >>>>> setup a new
> >>>>> vector on the destination.
> >>>>>
> >>>>> Due to the logic in fixup_irqs() that clears offline CPUs from
> >>>>> ->arch.old_cpu_mask (and releases the old vector if the mask becomes 
> >>>>> empty) it
> >>>>> shouldn't be possible to get into _assign_irq_vector() with
> >>>>> ->arch.move_{in_progress,cleanup_count} set but no online CPUs in
> >>>>> ->arch.old_cpu_mask.
> >>>>>
> >>>>> However if ->arch.move_{in_progress,cleanup_count} is set and the 
> >>>>> interrupt has
> >>>>> also changed affinity, it's possible the members of ->arch.old_cpu_mask 
> >>>>> are no
> >>>>> longer part of the affinity set,
> >>>>
> >>>> I'm having trouble relating this (->arch.old_cpu_mask related) to ...
> >>>>
> >>>>> move the interrupt to a different CPU part of
> >>>>> the provided mask
> >>>>
> >>>> ... this (->arch.cpu_mask related).
> >>>
> >>> No, the "provided mask" here is the "mask" parameter, not
> >>> ->arch.cpu_mask.
> >>
> >> Oh, so this describes the case of "hitting" the comment at the very bottom 
> >> of
> >> the first hunk then? (I probably was misreading this because I was 
> >> expecting
> >> it to describe a code change, rather than the case where original behavior
> >> needs retaining. IOW - all fine here then.)
> >>
> >>>>> and keep the current ->arch.old_{cpu_mask,vector} for the
> >>>>> pending interrupt movement to be completed.
> >>>>
> >>>> Right, that's to clean up state from before the initial move. What isn't
> >>>> clear to me is what's to happen with the state of the intermediate
> >>>> placement. Description and code changes leave me with the impression that
> >>>> it's okay to simply abandon, without any cleanup, yet I can't quite 
> >>>> figure
> >>>> why that would be an okay thing to do.
> >>>
> >>> There isn't much we can do with the intermediate placement, as the CPU
> >>> is going offline.  However we can drain any pending interrupts from
> >>> IRR after the new destination has been set, since setting the
> >>> destination is done from the CPU that's the current target of the
> >>> interrupts.  So we can ensure the draining is done strictly after the
> >>> target has been switched, hence ensuring no further interrupts from
> >>> this source will be delivered to the current CPU.
> >>
> >> Hmm, I'm afraid I still don't follow: I'm specifically in trouble with
> >> the ...
> >>
> >>>>> --- a/xen/arch/x86/irq.c
> >>>>> +++ b/xen/arch/x86/irq.c
> >>>>> @@ -544,7 +544,53 @@ static int _assign_irq_vector(struct irq_desc 
> >>>>> *desc, const cpumask_t *mask)
> >>>>>      }
> >>>>>  
> >>>>>      if ( desc->arch.move_in_progress || desc->arch.move_cleanup_count )
> >>>>> -        return -EAGAIN;
> >>>>> +    {
> >>>>> +        /*
> >>>>> +         * If the current destination is online refuse to shuffle.  
> >>>>> Retry after
> >>>>> +         * the in-progress movement has finished.
> >>>>> +         */
> >>>>> +        if ( cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map) )
> >>>>> +            return -EAGAIN;
> >>>>> +
> >>>>> +        /*
> >>>>> +         * Due to the logic in fixup_irqs() that clears offlined CPUs 
> >>>>> from
> >>>>> +         * ->arch.old_cpu_mask it shouldn't be possible to get here 
> >>>>> with
> >>>>> +         * ->arch.move_{in_progress,cleanup_count} set and no online 
> >>>>> CPUs in
> >>>>> +         * ->arch.old_cpu_mask.
> >>>>> +         */
> >>>>> +        ASSERT(valid_irq_vector(desc->arch.old_vector));
> >>>>> +        ASSERT(cpumask_intersects(desc->arch.old_cpu_mask, 
> >>>>> &cpu_online_map));
> >>>>> +
> >>>>> +        if ( cpumask_intersects(desc->arch.old_cpu_mask, mask) )
> >>>>> +        {
> >>>>> +            /*
> >>>>> +             * Fallback to the old destination if moving is in 
> >>>>> progress and the
> >>>>> +             * current destination is to be offlined.  This is only 
> >>>>> possible if
> >>>>> +             * the CPUs in old_cpu_mask intersect with the affinity 
> >>>>> mask passed
> >>>>> +             * in the 'mask' parameter.
> >>>>> +             */
> >>>>> +            desc->arch.vector = desc->arch.old_vector;
> >>>>> +            cpumask_and(desc->arch.cpu_mask, desc->arch.old_cpu_mask, 
> >>>>> mask);
> >>
> >> ... replacing of vector (and associated mask), without any further 
> >> accounting.
> > 
> > It's quite likely I'm missing something here, but what further
> > accounting you would like to do?
> > 
> > The current target of the interrupt (->arch.cpu_mask previous to
> > cpumask_and()) is all going offline, so any attempt to set it in
> > ->arch.old_cpu_mask would just result in a stale (offline) CPU getting
> > set in ->arch.old_cpu_mask, which previous patches attempted to
> > solve.
> > 
> > Maybe by "further accounting" you meant something else not related to
> > ->arch.old_{cpu_mask,vector}?
> 
> Indeed. What I'm thinking of is what normally release_old_vec() would
> do (of which only desc->arch.used_vectors updating would appear to be
> relevant, seeing the CPU's going offline). The other one I was thinking
> of, updating vector_irq[], likely is also unnecessary, again because
> that's per-CPU data of a CPU going down.

I think updating vector_irq[] should be explicitly avoided, as doing
so would prevent us from correctly draining any pending interrupts
because the vector -> irq mapping would be broken when the interrupt
enable window at the bottom of fixup_irqs() is reached.

For used_vectors: we might clean it, I'm a bit worried however that at
some point we insert a check in do_IRQ() path that ensures the
vector_irq[] is inline with desc->arch.used_vectors, which would fail
for interrupts drained at the bottom of fixup_irqs().  Let me attempt
to clean the currently used vector from ->arch.used_vectors.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.