[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 6/7] x86/irq: handle moving interrupts in _assign_irq_vector()
On 12.06.2024 12:39, Roger Pau Monné wrote: > On Tue, Jun 11, 2024 at 03:18:32PM +0200, Jan Beulich wrote: >> On 10.06.2024 16:20, Roger Pau Monne wrote: >>> Currently there's logic in fixup_irqs() that attempts to prevent >>> _assign_irq_vector() from failing, as fixup_irqs() is required to evacuate >>> all >>> interrupts from the CPUs not present in the input mask. The current logic >>> in >>> fixup_irqs() is incomplete, as it doesn't deal with interrupts that have >>> move_cleanup_count > 0 and a non-empty ->arch.old_cpu_mask field. >>> >>> Instead of attempting to fixup the interrupt descriptor in fixup_irqs() so >>> that >>> _assign_irq_vector() cannot fail, introduce logic in _assign_irq_vector() >>> to deal with interrupts that have either move_{in_progress,cleanup_count} >>> set >>> and no remaining online CPUs in ->arch.cpu_mask. >>> >>> If _assign_irq_vector() is requested to move an interrupt in the state >>> described above, first attempt to see if ->arch.old_cpu_mask contains any >>> valid >>> CPUs that could be used as fallback, and if that's the case do move the >>> interrupt back to the previous destination. Note this is easier because the >>> vector hasn't been released yet, so there's no need to allocate and setup a >>> new >>> vector on the destination. >>> >>> Due to the logic in fixup_irqs() that clears offline CPUs from >>> ->arch.old_cpu_mask (and releases the old vector if the mask becomes empty) >>> it >>> shouldn't be possible to get into _assign_irq_vector() with >>> ->arch.move_{in_progress,cleanup_count} set but no online CPUs in >>> ->arch.old_cpu_mask. >>> >>> However if ->arch.move_{in_progress,cleanup_count} is set and the interrupt >>> has >>> also changed affinity, it's possible the members of ->arch.old_cpu_mask are >>> no >>> longer part of the affinity set, >> >> I'm having trouble relating this (->arch.old_cpu_mask related) to ... >> >>> move the interrupt to a different CPU part of >>> the provided mask >> >> ... this (->arch.cpu_mask related). > > No, the "provided mask" here is the "mask" parameter, not > ->arch.cpu_mask. Oh, so this describes the case of "hitting" the comment at the very bottom of the first hunk then? (I probably was misreading this because I was expecting it to describe a code change, rather than the case where original behavior needs retaining. IOW - all fine here then.) >>> and keep the current ->arch.old_{cpu_mask,vector} for the >>> pending interrupt movement to be completed. >> >> Right, that's to clean up state from before the initial move. What isn't >> clear to me is what's to happen with the state of the intermediate >> placement. Description and code changes leave me with the impression that >> it's okay to simply abandon, without any cleanup, yet I can't quite figure >> why that would be an okay thing to do. > > There isn't much we can do with the intermediate placement, as the CPU > is going offline. However we can drain any pending interrupts from > IRR after the new destination has been set, since setting the > destination is done from the CPU that's the current target of the > interrupts. So we can ensure the draining is done strictly after the > target has been switched, hence ensuring no further interrupts from > this source will be delivered to the current CPU. Hmm, I'm afraid I still don't follow: I'm specifically in trouble with the ... >>> --- a/xen/arch/x86/irq.c >>> +++ b/xen/arch/x86/irq.c >>> @@ -544,7 +544,53 @@ static int _assign_irq_vector(struct irq_desc *desc, >>> const cpumask_t *mask) >>> } >>> >>> if ( desc->arch.move_in_progress || desc->arch.move_cleanup_count ) >>> - return -EAGAIN; >>> + { >>> + /* >>> + * If the current destination is online refuse to shuffle. Retry >>> after >>> + * the in-progress movement has finished. >>> + */ >>> + if ( cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map) ) >>> + return -EAGAIN; >>> + >>> + /* >>> + * Due to the logic in fixup_irqs() that clears offlined CPUs from >>> + * ->arch.old_cpu_mask it shouldn't be possible to get here with >>> + * ->arch.move_{in_progress,cleanup_count} set and no online CPUs >>> in >>> + * ->arch.old_cpu_mask. >>> + */ >>> + ASSERT(valid_irq_vector(desc->arch.old_vector)); >>> + ASSERT(cpumask_intersects(desc->arch.old_cpu_mask, >>> &cpu_online_map)); >>> + >>> + if ( cpumask_intersects(desc->arch.old_cpu_mask, mask) ) >>> + { >>> + /* >>> + * Fallback to the old destination if moving is in progress >>> and the >>> + * current destination is to be offlined. This is only >>> possible if >>> + * the CPUs in old_cpu_mask intersect with the affinity mask >>> passed >>> + * in the 'mask' parameter. >>> + */ >>> + desc->arch.vector = desc->arch.old_vector; >>> + cpumask_and(desc->arch.cpu_mask, desc->arch.old_cpu_mask, >>> mask); ... replacing of vector (and associated mask), without any further accounting. >>> + /* Undo any possibly done cleanup. */ >>> + for_each_cpu(cpu, desc->arch.cpu_mask) >>> + per_cpu(vector_irq, cpu)[desc->arch.vector] = irq; >>> + >>> + /* Cancel the pending move. */ >>> + desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED; >>> + cpumask_clear(desc->arch.old_cpu_mask); >>> + desc->arch.move_in_progress = 0; >>> + desc->arch.move_cleanup_count = 0; >>> + >>> + return 0; >>> + } >> >> In how far is this guaranteed to respect the (new) affinity that was set, >> presumably having led to the movement in the first place? > > The 'mask' parameter should account for the new affinity, hence the > cpumask_intersects() check guarantees we are moving to a CPU still in > the affinity mask. Ah, right, I must have been confused. >>> @@ -600,7 +646,17 @@ next: >>> current_vector = vector; >>> current_offset = offset; >>> >>> - if ( valid_irq_vector(old_vector) ) >>> + if ( desc->arch.move_in_progress || desc->arch.move_cleanup_count ) >>> + { >>> + ASSERT(!cpumask_intersects(desc->arch.cpu_mask, >>> &cpu_online_map)); >>> + /* >>> + * Special case when evacuating an interrupt from a CPU to be >>> + * offlined and the interrupt was already in the process of >>> being >>> + * moved. Leave ->arch.old_{vector,cpu_mask} as-is and just >>> + * replace ->arch.{cpu_mask,vector} with the new destination. >>> + */ >> >> And where's the cleaning up of ->arch.old_* going to be taken care of then? > > Such cleaning will be handled normally by the interrupt still having > ->arch.move_{in_progress,cleanup_count} set. The CPUs in > ->arch.old_cpu_mask must not all be offline, otherwise the logic in > fixup_irqs() would have already released the old vector. Maybe add "Cleanup will be done normally" to the comment? Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |