|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [RFC] Xen crashes on ASSERT on suspend/resume, suggested fix
On Thu, 25 May 2023, Jan Beulich wrote:
> On 25.05.2023 01:51, Stefano Stabellini wrote:
> > xen/irq: fix races between send_cleanup_vector and _clear_irq_vector
>
> This title is, I'm afraid, already misleading. No such race can occur
> afaict, as both callers of _clear_irq_vector() acquire the IRQ
> descriptor lock first, and irq_complete_move() (the sole caller of
> send_cleanup_vector()) is only ever invoked as or by an ->ack()
> hook, which in turn is only invoked with, again, the descriptor lock
> held.
Yes I see that you are right about the locking, and thank you for taking
the time to look into it.
One last question: could it be that a second interrupt arrives while
->ack() is being handled? do_IRQ() is running with interrupts disabled?
> > It is possible that send_cleanup_vector and _clear_irq_vector are
> > running at the same time on different CPUs. In that case we have a race
> > as both _clear_irq_vector and irq_move_cleanup_interrupt are trying to
> > clear old_vector.
> >
> > This patch fixes 3 races:
> >
> > 1) As irq_move_cleanup_interrupt is running on multiple CPUs at the
> > same time, and also _clear_irq_vector is running, it is possible that
> > only some per_cpu(vector_irq, cpu)[old_vector] are valid but not all.
> > So, turn the ASSERT in _clear_irq_vector into an if.
>
> Note again the locking which is in effect.
>
> > 2) It is possible that _clear_irq_vector is running at the same time as
> > release_old_vec, called from irq_move_cleanup_interrupt. At the moment,
> > it is possible for _clear_irq_vector to read a valid old_cpu_mask but an
> > invalid old_vector (because it is being set to invalid by
> > release_old_vec). To avoid this problem in release_old_vec move clearing
> > old_cpu_mask before setting old_vector to invalid. This way, we know that
> > in _clear_irq_vector if old_vector is invalid also old_cpu_mask is zero
> > and we don't enter the loop.
>
> All invocations of release_old_vec() are similarly inside suitably
> locked regions.
>
> > 3) It is possible that release_old_vec is running twice at the same time
> > for the same old_vector. Change the code in release_old_vec to make it
> > OK to call it twice. Remove both ASSERTs. With those gone, it should be
> > possible now to call release_old_vec twice in a row for the same
> > old_vector.
>
> Same here.
>
> Any such issues would surface more frequently and without any suspend /
> resume involved. What is still missing is that connection, and only then
> it'll (or really: may) become clear what needs adjusting. If you've seen
> the issue exactly once, then I'm afraid there's not much we can do unless
> someone can come up with a plausible explanation of something being
> broken on any of the involved code paths. More information will need to
> be gathered out of the next occurrence of this, whenever that's going to
> be. One of the things we will want to know, as mentioned before, is the
> value that per_cpu(vector_irq, cpu)[old_vector] has when the assertion
> triggers. Iirc Roger did suggest another piece of data you'd want to log.
Understood, thanks for the explanation.
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -112,16 +112,11 @@ static void release_old_vec(struct irq_desc *desc)
> > {
> > unsigned int vector = desc->arch.old_vector;
> >
> > - desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> > cpumask_clear(desc->arch.old_cpu_mask);
> > + desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> >
> > - if ( !valid_irq_vector(vector) )
> > - ASSERT_UNREACHABLE();
> > - else if ( desc->arch.used_vectors )
> > - {
> > - ASSERT(test_bit(vector, desc->arch.used_vectors));
> > + if ( desc->arch.used_vectors )
> > clear_bit(vector, desc->arch.used_vectors);
> > - }
> > }
> >
> > static void _trace_irq_mask(uint32_t event, int irq, int vector,
> > @@ -230,9 +225,11 @@ static void _clear_irq_vector(struct irq_desc *desc)
> >
> > for_each_cpu(cpu, tmp_mask)
> > {
> > - ASSERT(per_cpu(vector_irq, cpu)[old_vector] == irq);
> > - TRACE_3D(TRC_HW_IRQ_MOVE_FINISH, irq, old_vector, cpu);
> > - per_cpu(vector_irq, cpu)[old_vector] = ~irq;
> > + if ( per_cpu(vector_irq, cpu)[old_vector] == irq )
> > + {
> > + TRACE_3D(TRC_HW_IRQ_MOVE_FINISH, irq, old_vector, cpu);
> > + per_cpu(vector_irq, cpu)[old_vector] = ~irq;
> > + }
> > }
>
> As said before - replacing ASSERT() by a respective if() cannot really
> be done without discussing the "else" in the description. Except of
> course in trivial/obvious cases, but I think we agree here we don't
> have such a case.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |