
Re: [Xen-devel] rcu_sched self-detect stall when disable vif device

On 27/01/15 16:53, Wei Liu wrote:
> On Tue, Jan 27, 2015 at 04:47:45PM +0000, Julien Grall wrote:
>> On 27/01/15 16:45, Wei Liu wrote:
>>> On Tue, Jan 27, 2015 at 04:03:52PM +0000, Julien Grall wrote:
>>>> Hi,
>>>> While working on support for 64K pages in netfront, I got
>>>> an rcu_sched self-detect message. It happens when netback is
>>>> disabling the vif device due to an error.
>>>> I'm using Linux 3.19-rc5 on seattle (ARM64). Any idea why
>>>> the processor is stuck in xenvif_rx_queue_purge?
>>> When you try to release an SKB, the core network code needs to enter an
>>> RCU critical region to clean up. dst_release, for one, calls call_rcu.
>> But this message shouldn't happen under normal conditions or because of
>> netfront. Right?
> I have never seen a report like this before, even when netfront is
> buggy.

This only happens when preemption is not enabled (i.e.
CONFIG_PREEMPT_NONE in the config file) in the backend kernel.

When the vif is disabled, the loop in xenvif_kthread_guest_rx turns
into an infinite loop. In my case, the code executed looks like:

 1. for (;;) {
 2.     xenvif_wait_for_rx_work(queue);
 4.     if (kthread_should_stop())
 5.         break;
 7.     if (unlikely(vif->disabled && queue->id == 0)) {
 8.             xenvif_carrier_off(vif);
 9.             xenvif_rx_queue_purge(queue);
10.             continue;
11.     }
12. }

The wait on line 2 returns immediately because the vif is disabled
(see xenvif_have_rx_work).

We are on queue 0, so the condition on line 7 is true. The continue on
line 10 therefore sends us straight back to line 2, and so on.

On a platform where preemption is not enabled, this thread will never
yield the CPU to another thread (unless the domain is destroyed).
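
To make the control flow concrete, here is a minimal userspace model of
the loop above. It is only a sketch: struct model_vif, struct model_queue
and run_rx_loop are hypothetical names, the iteration bound exists only so
the model terminates, and it assumes that the only voluntary yield point of
a normal iteration sits after the disabled check. The counter stands in for
a call such as cond_resched().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the netback state; not the real structures. */
struct model_vif   { bool disabled; };
struct model_queue { unsigned int id; struct model_vif *vif; };

/*
 * Model of the rx kthread loop. Returns how many times the loop reached
 * a voluntary yield point within max_iters iterations.
 * yield_before_continue selects whether the disabled path yields
 * before jumping back to the top of the loop.
 */
static unsigned int run_rx_loop(struct model_queue *queue,
                                unsigned int max_iters,
                                bool yield_before_continue)
{
    unsigned int yields = 0;

    assert(queue && queue->vif);

    for (unsigned int i = 0; i < max_iters; i++) {
        /* The wait returns at once while the vif is disabled,
         * so it is not modelled here. */
        if (queue->vif->disabled && queue->id == 0) {
            /* carrier off and queue purge would happen here */
            if (yield_before_continue)
                yields++;   /* stands in for a yield such as cond_resched() */
            continue;       /* skips everything below */
        }

        /* normal rx work would happen here */
        yields++;           /* assumed yield point ending a normal iteration */
    }

    return yields;
}
```

With a disabled vif on queue 0, run_rx_loop(&q, 1000, false) reports zero
yields, which matches the stall; passing true makes every iteration reach
a yield point. Whether yielding before the continue is the right fix for
netback itself is a separate question.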


Julien Grall

Xen-devel mailing list


