Re: [Xen-devel] Bug report and patch about IRQ freezing after gic_restore_state
On Mon, 20 May 2013, Julien Grall wrote:
> On 05/20/2013 01:41 AM, Jaeyong Yoo wrote:
> > Hello,
> >
> > I'm running xen on Arndale board and if I run both iperf and du command at
> > Dom0, one of IRQ (either SATA or network) suddenly stop occuring anymore.
> > After some investigation, I found out that when context switching at Xen,
> > IRQs in LR (about to be delivered to Doms) could be lost and never occur
> > anymore.
> > Here goes function call sequence that this problem occurs:
> > (in context switching)
> > - schedule_tail
> >   - ctxt_switch_from
> >   - local_irq_enable
> >     - // after this part, some IRQ can occur and could be directly
> >       // written to LR
> >   - ctxt_switch_to
> >     - ... (some more functions)
> >     - // before the above IRQ is delivered to Dom (and maintenance
> >       // IRQ not called), gic_restore_state can be called
> >     - gic_restore_state /* when restoring gic state, the above IRQ
> >                          * (written to LR) is overwritten
> >                          * to the previous values, and somehow,
> >                          * the corresponding IRQ never occur again */
> >
> > I made the following patch (i.e., enable local irq after gic_restore_state)
> > for preventing the above problem.
>
> Thanks for the patch, I was looking with a similar error on the Arndale
> Board for a couple of day.

Indeed, thanks for the analysis of the bug and the patch!
It is a particularly difficult bug to track down because it can only
happen if an irq arrives after ctxt_switch_from and before
ctxt_switch_to, and the irq is for the next vcpu to be scheduled on the
pcpu (otherwise the v == current check at the beginning of
gic_set_guest_irq would catch that).

Rather than extending the check in gic_set_guest_irq, I think it is wise
to run ctxt_switch_to with interrupts disabled.
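To make the window concrete, here is a minimal user-space sketch of the
ordering problem. It is only a toy model, not Xen code: every toy_* name, the
NR_LRS value and the single pending_irq slot are assumptions made up for
illustration, and the real injection path (gic_set_guest_irq and friends) is
far more involved. The only thing it tries to show is that an LR written
before the restore is overwritten, while an irq held back by masking is
injected after the restore and survives.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_LRS   4      /* hypothetical number of list registers */
#define LR_EMPTY 0u     /* toy encoding: 0 means "LR is free" */

/* Toy stand-in for the per-vcpu saved GIC state (not Xen's struct vcpu). */
struct toy_vcpu {
    unsigned int gic_lr[NR_LRS];
};

static unsigned int hw_lr[NR_LRS];  /* models the hardware list registers */
static bool irqs_enabled;           /* models the CPU interrupt mask */
static unsigned int pending_irq;    /* one irq held back while masked */

/* Models the injection path writing an irq into the first free LR. */
static void toy_inject(unsigned int irq)
{
    assert(irq != LR_EMPTY);
    for (int i = 0; i < NR_LRS; i++) {
        if (hw_lr[i] == LR_EMPTY) {
            hw_lr[i] = irq;
            return;
        }
    }
    /* Simplification: a full set of LRs is simply ignored here. */
}

/* A physical irq fires: taken at once if unmasked, otherwise it stays pending. */
static void toy_raise_irq(unsigned int irq)
{
    if (irqs_enabled)
        toy_inject(irq);
    else
        pending_irq = irq;
}

/* Unmasking delivers whatever was held back. */
static void toy_local_irq_enable(void)
{
    irqs_enabled = true;
    if (pending_irq != LR_EMPTY) {
        unsigned int irq = pending_irq;
        pending_irq = LR_EMPTY;
        toy_inject(irq);
    }
}

/* Models gic_restore_state: blindly copies the saved LRs over the hardware. */
static void toy_gic_restore_state(const struct toy_vcpu *n)
{
    for (int i = 0; i < NR_LRS; i++)
        hw_lr[i] = n->gic_lr[i];
}

static bool toy_lr_holds(unsigned int irq)
{
    for (int i = 0; i < NR_LRS; i++)
        if (hw_lr[i] == irq)
            return true;
    return false;
}

/*
 * Runs one toy context switch with an irq for the next vcpu arriving in the
 * window after ctxt_switch_from. Returns true if the irq is still in an LR
 * once the switch has completed.
 */
static bool toy_context_switch(bool enable_irqs_early, unsigned int irq)
{
    struct toy_vcpu next = { .gic_lr = { LR_EMPTY } };

    for (int i = 0; i < NR_LRS; i++)
        hw_lr[i] = LR_EMPTY;
    irqs_enabled = false;
    pending_irq = LR_EMPTY;

    /* ctxt_switch_from(prev) would have run here. */
    if (enable_irqs_early)
        toy_local_irq_enable();     /* ordering before the patch */

    toy_raise_irq(irq);             /* irq arrives inside the window */
    toy_gic_restore_state(&next);   /* part of ctxt_switch_to(next) */

    if (!enable_irqs_early)
        toy_local_irq_enable();     /* ordering proposed in this thread */

    return toy_lr_holds(irq);
}
```

Calling toy_context_switch(true, 42) models the current ordering and loses
the irq; toy_context_switch(false, 42) models enabling interrupts only after
the restore and keeps it.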
> > Signed-off-by: Jaeyong Yoo <jaeyong.yoo@xxxxxxxxxxx>
> > ---
> >  xen/arch/arm/domain.c | 4 ++--
> >  xen/arch/arm/gic.c    | 4 ++--
> >  2 files changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> > index f71b582..2c3b132 100644
> > --- a/xen/arch/arm/domain.c
> > +++ b/xen/arch/arm/domain.c
> > @@ -141,6 +141,8 @@ static void ctxt_switch_to(struct vcpu *n)
> >      /* VGIC */
> >      gic_restore_state(n);
> > +    local_irq_enable();
> > +
>
> Could you move the local_irq_enable right after ctxt_switch_to?

Right, good idea.

> >      /* XXX VFP */
> >      /* XXX MPU */
> > @@ -215,8 +217,6 @@ static void schedule_tail(struct vcpu *prev)
> >  {
> >      ctxt_switch_from(prev);
> > -    local_irq_enable();
> > -
> >      /* TODO
> >         update_runstate_area(current);
> >      */
> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> > index d4f0a43..8186ad8 100644
> > --- a/xen/arch/arm/gic.c
> > +++ b/xen/arch/arm/gic.c
> > @@ -81,11 +81,11 @@ void gic_restore_state(struct vcpu *v)
> >      if ( is_idle_vcpu(v) )
> >          return;
> > -    spin_lock_irq(&gic.lock);
> > +    spin_lock(&gic.lock);
> >      this_cpu(lr_mask) = v->arch.lr_mask;
> >      for ( i=0; i<nr_lrs; i++)
> >          GICH[GICH_LR + i] = v->arch.gic_lr[i];
> > -    spin_unlock_irq(&gic.lock);
> > +    spin_unlock(&gic.lock);
>
> As the IRQ is disabled and the GICH registers can only be modified by
> the current physical CPU, I think you can remove the spin_{,un}lock and
> replace it by a dsb.

Yes, we can remove the spin_lock but I don't think we need a dsb there.
See the presence of an isb() two lines below.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel