Re: [Xen-devel] Xen 4.5 random freeze question
On Wed, Nov 19, 2014 at 6:13 PM, Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
>> On Wed, Nov 19, 2014 at 6:01 PM, Andrii Tseglytskyi
>> <andrii.tseglytskyi@xxxxxxxxxxxxxxx> wrote:
>> > On Wed, Nov 19, 2014 at 5:41 PM, Stefano Stabellini
>> > <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>> >> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
>> >>> Hi Stefano,
>> >>>
>> >>> On Wed, Nov 19, 2014 at 4:52 PM, Stefano Stabellini
>> >>> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>> >>> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
>> >>> >> Hi Stefano,
>> >>> >>
>> >>> >> > >      if ( !list_empty(&current->arch.vgic.lr_pending) &&
>> >>> >> > > lr_all_full() )
>> >>> >> > > -        GICH[GICH_HCR] |= GICH_HCR_UIE;
>> >>> >> > > +        GICH[GICH_HCR] |= GICH_HCR_NPIE;
>> >>> >> > >      else
>> >>> >> > > -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
>> >>> >> > > +        GICH[GICH_HCR] &= ~GICH_HCR_NPIE;
>> >>> >> > >
>> >>> >> > >  }
>> >>> >> >
>> >>> >> > Yes, exactly
>> >>> >>
>> >>> >> I tried, hang still occurs with this change
>> >>> >
>> >>> > We need to figure out why during the hang you still have all the LRs
>> >>> > busy even if you are getting maintenance interrupts that should cause
>> >>> > them to be cleared.
>> >>> >
>> >>>
>> >>> I see that I have free LRs during maintenance interrupt
>> >>>
>> >>> (XEN) gic.c:871:d0v0 maintenance interrupt
>> >>> (XEN) GICH_LRs (vcpu 0) mask=0
>> >>> (XEN)    HW_LR[0]=9a015856
>> >>> (XEN)    HW_LR[1]=0
>> >>> (XEN)    HW_LR[2]=0
>> >>> (XEN)    HW_LR[3]=0
>> >>> (XEN) Inflight irq=86 lr=0
>> >>> (XEN) Inflight irq=2 lr=255
>> >>> (XEN) Pending irq=2
>> >>>
>> >>> But I see that after I got hang - maintenance interrupts are generated
>> >>> continuously. Platform continues printing the same log till reboot.
>> >>
>> >> Exactly the same log? As in the one above you just pasted?
>> >> That is very very suspicious.
>> >
>> > Yes exactly the same log. And looks like it means that LRs are flushed
>> > correctly.
>> >
>> >>
>> >> I am thinking that we are not handling GICH_HCR_UIE correctly and
>> >> something we do in Xen, maybe writing to an LR register, might trigger a
>> >> new maintenance interrupt immediately causing an infinite loop.
>> >>
>> >
>> > Yes, this is what I'm thinking about. Taking in account all collected
>> > debug info it looks like once LRs are overloaded with SGIs -
>> > maintenance interrupt occurs.
>> > And then it is not handled properly, and occurs again and again - so
>> > platform hangs inside its handler.
>> >
>> >> Could you please try this patch? It disable GICH_HCR_UIE immediately on
>> >> hypervisor entry.
>> >>
>> >
>> > Now trying.
>> >
>> >>
>> >> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> >> index 4d2a92d..6ae8dc4 100644
>> >> --- a/xen/arch/arm/gic.c
>> >> +++ b/xen/arch/arm/gic.c
>> >> @@ -701,6 +701,8 @@ void gic_clear_lrs(struct vcpu *v)
>> >>      if ( is_idle_vcpu(v) )
>> >>          return;
>> >>
>> >> +    GICH[GICH_HCR] &= ~GICH_HCR_UIE;
>> >> +
>> >>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>> >>
>> >>      while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
>> >> @@ -821,12 +823,8 @@ void gic_inject(void)
>> >>
>> >>      gic_restore_pending_irqs(current);
>> >>
>> >> -
>> >>      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
>> >>          GICH[GICH_HCR] |= GICH_HCR_UIE;
>> >> -    else
>> >> -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
>> >> -
>> >>  }
>> >>
>> >>  static void do_sgi(struct cpu_user_regs *regs, int othercpu, enum
>> >> gic_sgi sgi)
>> >
>>
>> Heh - I don't see hangs with this patch :) But also I see that
>> maintenance interrupt doesn't occur (and no hang as result)
>> Stefano - is this expected?
>
> No maintenance interrupts at all? That's strange. You should be
> receiving them when LRs are full and you still have interrupts pending
> to be added to them.
>
> You could add another printk here to see if you should be receiving
> them:
>
>      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> +    {
> +        gdprintk(XENLOG_DEBUG, "requesting maintenance interrupt\n");
>          GICH[GICH_HCR] |= GICH_HCR_UIE;
> -    else
> -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> -
> +    }
>  }
>

Requested properly:

(XEN) gic.c:756:d0v0 requesting maintenance interrupt
(XEN) gic.c:756:d0v0 requesting maintenance interrupt
(XEN) gic.c:756:d0v0 requesting maintenance interrupt
(XEN) gic.c:756:d0v0 requesting maintenance interrupt
(XEN) gic.c:756:d0v0 requesting maintenance interrupt
(XEN) gic.c:756:d0v0 requesting maintenance interrupt
(XEN) gic.c:756:d0v0 requesting maintenance interrupt

But does not occur

>
>> >
>> >
>> > --
>> >
>> > Andrii Tseglytskyi | Embedded Dev
>> > GlobalLogic
>> > www.globallogic.com
>>
>>
>>
>> --
>>
>> Andrii Tseglytskyi | Embedded Dev
>> GlobalLogic
>> www.globallogic.com
>>

--

Andrii Tseglytskyi | Embedded Dev
GlobalLogic
www.globallogic.com

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
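
For anyone following the thread without a Xen tree at hand, the control flow being debugged can be modelled in a few lines of standalone C. This is only a sketch, not Xen code: `struct gic_model`, `model_entry()`, `model_inject()` and `uie_requested()` are hypothetical names, the pending list is reduced to a counter, and the `HCR_UIE` bit position is an assumption. It mirrors the patched behaviour discussed above: UIE is cleared on hypervisor entry (as in the change to `gic_clear_lrs()`) and set again on the way back to the guest only when interrupts are still pending and every LR is in use (as in `gic_inject()`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_LRS   4           /* four LRs, matching HW_LR[0..3] in the log */
#define HCR_UIE  (1u << 1)   /* assumed bit position, for illustration only */

/* Hypothetical stand-in for the per-CPU GIC state touched by the patch. */
struct gic_model {
    uint32_t hcr;        /* models GICH[GICH_HCR] */
    uint32_t lr_mask;    /* bit n set => LR n is in use */
    unsigned pending;    /* models the length of vgic.lr_pending */
};

/* True when every modelled LR is occupied. */
static bool lr_all_full(const struct gic_model *g)
{
    assert(g != NULL);
    return g->lr_mask == ((1u << NR_LRS) - 1u);
}

/* Hypervisor entry: drop the maintenance interrupt request unconditionally. */
static void model_entry(struct gic_model *g)
{
    assert(g != NULL);
    g->hcr &= ~HCR_UIE;
}

/* Return to guest: request the interrupt only if work remains and LRs are full. */
static void model_inject(struct gic_model *g)
{
    assert(g != NULL);
    if (g->pending > 0 && lr_all_full(g))
        g->hcr |= HCR_UIE;
}

/* Reports whether the model currently has UIE set. */
static bool uie_requested(const struct gic_model *g)
{
    assert(g != NULL);
    return (g->hcr & HCR_UIE) != 0;
}
```

With `lr_mask = 0xf` and one pending interrupt, `model_inject()` sets UIE; after `model_entry()` it is clear again, and with `lr_mask = 0x1` (only LR0 busy, as in the pasted log) a further `model_inject()` leaves it clear. The model says nothing about when the hardware actually raises the interrupt, which is the open question in the thread.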