Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin
On Fri, 2018-04-13 at 08:23 +0200, Olaf Hering wrote:
> On Thu, 12 Apr 2018 19:25:43 +0200,
> Dario Faggioli <dfaggioli@xxxxxxxx> wrote:
>
> > Olaf, new patch! :-)
>
> BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));
>
Thanks!
> (XEN) CPU 36: d10v1 isr=0 runnbl=1 proc=36 pf=0 orq=0 csf=4
>
So, FTR:
- CPU is smp_processor_id()
- dXvY is prev, in context_saved()
- isr is prev->is_running
- runnbl is vcpu_runnable(prev)
- proc is prev->processor
- pf is prev->pause_flags
- orq is __vcpu_on_runq(CSCHED_VCPU(prev)) (coming from sched_credit.c)
- csf is CSCHED_VCPU(prev)->flags
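Putting the legend together, the lines were presumably produced by
instrumentation along these lines (a hypothetical reconstruction of the
debug printk, not the literal patch; %pv is the format specifier that
prints a vcpu as dXvY):

    /* Hypothetical reconstruction of the debug printk; it has to live
     * in (or call into) sched_credit.c, since __vcpu_on_runq() and
     * CSCHED_VCPU() are private to that file. */
    printk("CPU %u: %pv isr=%d runnbl=%d proc=%u pf=%lx orq=%d csf=%x\n",
           smp_processor_id(), prev,
           prev->is_running, vcpu_runnable(prev), prev->processor,
           prev->pause_flags, __vcpu_on_runq(CSCHED_VCPU(prev)),
           CSCHED_VCPU(prev)->flags);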
csf = 4 is CSCHED_FLAG_VCPU_MIGRATING, which means someone is calling
vcpu_migrate() on prev, on some other processor (presumably via
vcpu_set_affinity()) and is around here:
static void vcpu_migrate(struct vcpu *v)
{
    ...
    if ( v->is_running ||
         !test_and_clear_bit(_VPF_migrating, &v->pause_flags) )
    {
        sched_spin_unlock_double(old_lock, new_lock, flags);
        return;
    }

    vcpu_move_locked(v, new_cpu);

    sched_spin_unlock_double(old_lock, new_lock, flags);

[*]
    if ( old_cpu != new_cpu )
        sched_move_irqs(v);

    /* Wake on new CPU. */
    vcpu_wake(v);
}
I.e., SCHED_OP(pick_cpu) has been called already, but not vcpu_wake().
We must be past the sched_spin_unlock_double() because, on this
processor (i.e., CPU 31 in this crash), we are, while printing,
_inside_ a critical section on prev's scheduler lock.
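For context, the local path is the tail of context_saved() (abridged,
and quoted from memory, so the exact tree may differ slightly):

    void context_saved(struct vcpu *prev)
    {
        /* Clear the running flag /after/ the context has been saved... */
        smp_wmb();
        prev->is_running = 0;

        /* ...and check for a migration request /after/ clearing it. */
        smp_mb();

        SCHED_OP(vcpu_scheduler(prev), context_saved, prev);

        if ( unlikely(prev->pause_flags & VPF_migrating) )
            vcpu_migrate(prev);   /* the call in the backtrace below */
    }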
> (XEN) CPU 33: d10v2 isr=0 runnbl=0 proc=33 pf=1 orq=0 csf=4
> (XEN) CPU 20: d10v2 isr=0 runnbl=1 proc=20 pf=0 orq=0 csf=4
> (XEN) CPU 32: d10v0 isr=0 runnbl=1 proc=32 pf=0 orq=0 csf=4
> (XEN) CPU 33: d10v0 isr=0 runnbl=1 proc=12 pf=0 orq=0 csf=4
> (XEN) CPU 36: d10v0 isr=0 runnbl=1 proc=36 pf=0 orq=0 csf=4
> (XEN) CPU 31: d10v0 isr=0 runnbl=1 proc=31 pf=0 orq=0 csf=4
> (XEN) Xen BUG at sched_credit.c:877
> (XEN) ----[ Xen-4.11.20180411T100655.82540b66ce-180413055758  x86_64  debug=y   Not tainted ]----
> (XEN) CPU: 31
>
Right, so, in this case, the vcpu_migrate()->SCHED_OP(pick_cpu) did not
change prev->processor. That could very well have happened. This just
means that, if it weren't for the BUG_ON added in csched_vcpu_migrate()
by this patch, this iteration would not have crashed in
csched_load_balance().
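FWIW, the check that fires (the BUG_ON quoted at the top, i.e.,
sched_credit.c:877 in this build) presumably sits at the top of
csched_vcpu_migrate(), roughly like this:

    static void
    csched_vcpu_migrate(const struct scheduler *ops, struct vcpu *vc,
                        unsigned int new_cpu)
    {
        /* Debugging patch: a vcpu for which the migrate hook is being
         * invoked must not still sit on a runqueue. */
        BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));
        ...
    }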
However, in a previous report, we saw a situation where
prev->processor was 31 on CPU 16.
Fact is, VPF_migrating is 0 right now for prev, which corroborates the
theory that we are at point [*] in vcpu_migrate() on the other CPU. In
fact, it was 1, but test_and_clear_bit() (which returns the old value
and then clears the bit) has already reset it.
However, in order for us, on this CPU, to actually execute
vcpu_move_locked(), as we do:
> (XEN) Xen call trace:
> (XEN) [<ffff82d08022c84d>] sched_credit.c#csched_vcpu_migrate+0x52/0x54
> (XEN) [<ffff82d080239419>] schedule.c#vcpu_move_locked+0x42/0xcc
>
It means that someone must have raised VPF_migrating again!
> (XEN) [<ffff82d08023a8d8>] schedule.c#vcpu_migrate+0x210/0x23b
> (XEN) [<ffff82d08023c7ad>] context_saved+0x236/0x479
> (XEN) [<ffff82d08027a558>] context_switch+0xe9/0xf67
> (XEN) [<ffff82d0802397a9>] schedule.c#schedule+0x306/0x6ab
> (XEN) [<ffff82d08023d56a>] softirq.c#__do_softirq+0x71/0x9a
> (XEN) [<ffff82d08023d5dd>] do_softirq+0x13/0x15
> (XEN) [<ffff82d080328d8b>] vmx_asm_do_vmentry+0x2b/0x30
> (XEN) ****************************************
> (XEN) Panic on CPU 31:
> (XEN) Xen BUG at sched_credit.c:877
> (XEN) ****************************************
>
Now, VPF_migrating is raised in the following circumstances:
* in __runq_tickle(): I was actually about to pinpoint this as the
  problem, but then I realized that, when calling __runq_tickle(prev)
  in vcpu_wake() (called by vcpu_migrate()), we do not set the bit on
  prev itself, but on the currently running vcpu of prev->processor
  (see the sketch after this list). And a vcpu that is in
  per_cpu(schedule_data, <CPU>).curr can't also be prev in (any)
  context_saved(), I think.
* in csched_vcpu_acct(): we set the flag on CSCHED_VCPU(current). I
  may be wrong, but I don't immediately see why we use current here,
  rather than curr_on_cpu(cpu). Yet, I think that, similarly to the
  above, current can't be prev. Still, I may send a "Just in case"^TM
  patch... :-P
* in vcpu_force_reschedule(): it's used in shim code (well... :-) and
  in VCPUOP_set_periodic_timer(). But it only sets the flag if
  prev->is_running is 1, which it is not here. Besides, don't most
  guests use the singleshot timer only, these days?
* in cpu_disable_scheduler(): no. Just no.
* in vcpu_set_affinity(): well, it looks to me that either a) we rely
  on the bit being set here to actually enter the if() in
  context_saved(), which is a precondition for the race, and then we
  are already past that, or b) things just work. Will think more...
* in vcpu_pin_override(): again, no.... I think?
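For reference, this is the part of __runq_tickle() I mean in the first
bullet, abridged from sched_credit.c (again from memory, so take the
details with a grain of salt):

    static inline void __runq_tickle(struct csched_vcpu *new)
    {
        unsigned int cpu = new->vcpu->processor;
        /* cur is whatever currently runs on the waking vcpu's pcpu... */
        struct csched_vcpu * const cur = CSCHED_VCPU(curr_on_cpu(cpu));
        ...
        if ( new_idlers_empty && new->pri > cur->pri )
        {
            SCHED_STAT_CRANK(migrate_kicked_away);
            /* ...and it is cur, not new, that gets VPF_migrating set. */
            set_bit(_VPF_migrating, &cur->vcpu->pause_flags);
        }
        ...
    }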
So, thoughts? :-)
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/