Re: [Xen-devel] Xen 4.5 random freeze question
No, it just means "spurious interrupt".
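
As a rough sketch of why 1023 is not a real interrupt line, this is how an
acknowledged ID could be classified, assuming the GICv2 ID ranges (SGIs 0-15,
PPIs 16-31, SPIs 32-1019, 1020-1023 reserved for special purposes). The names
classify_intack and GIC_SPURIOUS_IRQ are made up for illustration and are not
taken from gic.c:

```c
#include <stdint.h>

/* Assumed GICv2 layout of GICC_IAR: bits [9:0] hold the interrupt ID. */
#define GICC_IA_IRQ      0x03ffu
/* Hypothetical name for the reserved "spurious interrupt" ID. */
#define GIC_SPURIOUS_IRQ 1023u

enum irq_class { IRQ_SGI, IRQ_PPI, IRQ_SPI, IRQ_SPECIAL, IRQ_SPURIOUS };

/* Classify the value read from GICC_IAR by its interrupt ID field. */
static enum irq_class classify_intack(uint32_t intack)
{
    uint32_t irq = intack & GICC_IA_IRQ;   /* drop the CPUID bits */

    if (irq == GIC_SPURIOUS_IRQ)
        return IRQ_SPURIOUS;               /* nothing was pending to ack */
    if (irq >= 1020)
        return IRQ_SPECIAL;                /* other reserved IDs */
    if (irq >= 32)
        return IRQ_SPI;
    if (irq >= 16)
        return IRQ_PPI;
    return IRQ_SGI;
}
```

So irq=1023 in the printk says that nothing was pending when GICC_IAR was
read, not that the maintenance interrupt is global.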
On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> Does number 1023 mean that maintenance interrupt is global?
>
> On Wed, Nov 19, 2014 at 7:03 PM, Andrii Tseglytskyi
> <andrii.tseglytskyi@xxxxxxxxxxxxxxx> wrote:
> > I got this strange log:
> >
> > (XEN) received maintenance interrupt irq=1023
> >
> > And platform does not hang due to this:
> > + hcr = GICH[GICH_HCR];
> > + if ( hcr & GICH_HCR_UIE )
> > + {
> > + GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> > + uie_on = 1;
> > + }
> >
> > On Wed, Nov 19, 2014 at 6:50 PM, Stefano Stabellini
> > <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> On Wed, Nov 19, 2014 at 6:13 PM, Stefano Stabellini
> >>> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >>> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> On Wed, Nov 19, 2014 at 6:01 PM, Andrii Tseglytskyi
> >>> >> <andrii.tseglytskyi@xxxxxxxxxxxxxxx> wrote:
> >>> >> > On Wed, Nov 19, 2014 at 5:41 PM, Stefano Stabellini
> >>> >> > <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >>> >> >> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> >>> Hi Stefano,
> >>> >> >>>
> >>> >> >>> On Wed, Nov 19, 2014 at 4:52 PM, Stefano Stabellini
> >>> >> >>> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >>> >> >>> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> >>> >> Hi Stefano,
> >>> >> >>> >>
> >>> >> >>> >> > > if ( !list_empty(&current->arch.vgic.lr_pending) &&
> >>> >> >>> >> > > lr_all_full() )
> >>> >> >>> >> > > - GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>> >> >>> >> > > + GICH[GICH_HCR] |= GICH_HCR_NPIE;
> >>> >> >>> >> > > else
> >>> >> >>> >> > > - GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> >>> >> > > + GICH[GICH_HCR] &= ~GICH_HCR_NPIE;
> >>> >> >>> >> > >
> >>> >> >>> >> > > }
> >>> >> >>> >> >
> >>> >> >>> >> > Yes, exactly
> >>> >> >>> >>
> >>> >> >>> >> I tried, hang still occurs with this change
> >>> >> >>> >
> >>> >> >>> > We need to figure out why during the hang you still have all the LRs
> >>> >> >>> > busy even if you are getting maintenance interrupts that should cause
> >>> >> >>> > them to be cleared.
> >>> >> >>> >
> >>> >> >>>
> >>> >> >>> I see that I have free LRs during maintenance interrupt
> >>> >> >>>
> >>> >> >>> (XEN) gic.c:871:d0v0 maintenance interrupt
> >>> >> >>> (XEN) GICH_LRs (vcpu 0) mask=0
> >>> >> >>> (XEN) HW_LR[0]=9a015856
> >>> >> >>> (XEN) HW_LR[1]=0
> >>> >> >>> (XEN) HW_LR[2]=0
> >>> >> >>> (XEN) HW_LR[3]=0
> >>> >> >>> (XEN) Inflight irq=86 lr=0
> >>> >> >>> (XEN) Inflight irq=2 lr=255
> >>> >> >>> (XEN) Pending irq=2
> >>> >> >>>
> >>> >> >>> But I see that after the hang occurs, maintenance interrupts are
> >>> >> >>> generated continuously. The platform keeps printing the same log
> >>> >> >>> until reboot.
> >>> >> >>
> >>> >> >> Exactly the same log? As in the one above you just pasted?
> >>> >> >> That is very very suspicious.
> >>> >> >
> >>> >> > Yes, exactly the same log. And it looks like it means that LRs are
> >>> >> > flushed correctly.
> >>> >> >
> >>> >> >>
> >>> >> >> I am thinking that we are not handling GICH_HCR_UIE correctly and
> >>> >> >> something we do in Xen, maybe writing to an LR register, might
> >>> >> >> trigger a new maintenance interrupt immediately, causing an
> >>> >> >> infinite loop.
> >>> >> >>
> >>> >> >
> >>> >> > Yes, this is what I'm thinking about. Taking into account all the
> >>> >> > collected debug info, it looks like once the LRs are overloaded with
> >>> >> > SGIs, a maintenance interrupt occurs.
> >>> >> > Then it is not handled properly and occurs again and again, so the
> >>> >> > platform hangs inside its handler.
> >>> >> >
> >>> >> >> Could you please try this patch? It disables GICH_HCR_UIE
> >>> >> >> immediately on hypervisor entry.
> >>> >> >>
> >>> >> >
> >>> >> > Now trying.
> >>> >> >
> >>> >> >>
> >>> >> >> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >>> >> >> index 4d2a92d..6ae8dc4 100644
> >>> >> >> --- a/xen/arch/arm/gic.c
> >>> >> >> +++ b/xen/arch/arm/gic.c
> >>> >> >> @@ -701,6 +701,8 @@ void gic_clear_lrs(struct vcpu *v)
> >>> >> >> if ( is_idle_vcpu(v) )
> >>> >> >> return;
> >>> >> >>
> >>> >> >> + GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> >> +
> >>> >> >> spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >>> >> >>
> >>> >> >> while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
> >>> >> >> @@ -821,12 +823,8 @@ void gic_inject(void)
> >>> >> >>
> >>> >> >> gic_restore_pending_irqs(current);
> >>> >> >>
> >>> >> >> -
> >>> >> >> if ( !list_empty(&current->arch.vgic.lr_pending) &&
> >>> >> >> lr_all_full() )
> >>> >> >> GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>> >> >> - else
> >>> >> >> - GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> >> -
> >>> >> >> }
> >>> >> >>
> >>> >> >> static void do_sgi(struct cpu_user_regs *regs, int othercpu, enum gic_sgi sgi)
> >>> >> >
> >>> >>
> >>> >> Heh - I don't see hangs with this patch :) But also I see that
> >>> >> maintenance interrupt doesn't occur (and no hang as result)
> >>> >> Stefano - is this expected?
> >>> >
> >>> > No maintenance interrupts at all? That's strange. You should be
> >>> > receiving them when LRs are full and you still have interrupts pending
> >>> > to be added to them.
> >>> >
> >>> > You could add another printk here to see if you should be receiving
> >>> > them:
> >>> >
> >>> > if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>> > + {
> >>> > + gdprintk(XENLOG_DEBUG, "requesting maintenance interrupt\n");
> >>> > GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>> > - else
> >>> > - GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> > -
> >>> > + }
> >>> > }
> >>> >
> >>>
> >>> Requested properly:
> >>>
> >>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>>
> >>> But does not occur
> >>
> >> OK, let's see what's going on then by printing the irq number of the
> >> maintenance interrupt:
> >>
> >> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >> index 4d2a92d..fed3167 100644
> >> --- a/xen/arch/arm/gic.c
> >> +++ b/xen/arch/arm/gic.c
> >> @@ -55,6 +55,7 @@ static struct {
> >> static DEFINE_PER_CPU(uint64_t, lr_mask);
> >>
> >> static uint8_t nr_lrs;
> >> +static bool uie_on;
> >> #define lr_all_full() (this_cpu(lr_mask) == ((1 << nr_lrs) - 1))
> >>
> >> /* The GIC mapping of CPU interfaces does not necessarily match the
> >> @@ -694,6 +695,7 @@ void gic_clear_lrs(struct vcpu *v)
> >> {
> >> int i = 0;
> >> unsigned long flags;
> >> + unsigned long hcr;
> >>
> >> /* The idle domain has no LRs to be cleared. Since gic_restore_state
> >> * doesn't write any LR registers for the idle domain they could be
> >> @@ -701,6 +703,13 @@ void gic_clear_lrs(struct vcpu *v)
> >> if ( is_idle_vcpu(v) )
> >> return;
> >>
> >> + hcr = GICH[GICH_HCR];
> >> + if ( hcr & GICH_HCR_UIE )
> >> + {
> >> + GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >> + uie_on = 1;
> >> + }
> >> +
> >> spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >>
> >> while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
> >> @@ -865,6 +873,11 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
> >> intack = GICC[GICC_IAR];
> >> irq = intack & GICC_IA_IRQ;
> >>
> >> + if ( uie_on )
> >> + {
> >> + uie_on = 0;
> >> + printk("received maintenance interrupt irq=%d\n", irq);
> >> + }
> >> if ( likely(irq >= 16 && irq < 1021) )
> >> {
> >> local_irq_enable();
> >
> >
> >
> > --
> >
> > Andrii Tseglytskyi | Embedded Dev
> > GlobalLogic
> > www.globallogic.com
>
>
>
> --
>
> Andrii Tseglytskyi | Embedded Dev
> GlobalLogic
> www.globallogic.com
>
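
For reference, the HW_LR[0]=9a015856 value in the quoted log can be decoded by
hand. The sketch below assumes the GICv2 GICH_LR layout (virtual ID in bits
[9:0], physical ID in [19:10], priority in [27:23], state in [29:28], Grp1 in
bit 30, HW in bit 31); struct lr_fields and decode_lr are hypothetical helper
names, not existing Xen code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of a GICv2 list register, per the layout assumed above. */
struct lr_fields {
    unsigned int virq;      /* virtual interrupt ID, bits [9:0] */
    unsigned int pirq;      /* physical interrupt ID, bits [19:10] */
    unsigned int priority;  /* 5-bit priority field, bits [27:23] */
    unsigned int state;     /* 0 invalid, 1 pending, 2 active, 3 both */
    bool grp1;              /* bit 30 */
    bool hw;                /* bit 31: backed by a hardware interrupt */
};

/* Split a raw list register value into its fields. */
static struct lr_fields decode_lr(uint32_t lr)
{
    struct lr_fields f;

    f.virq     = lr & 0x3ffu;
    f.pirq     = (lr >> 10) & 0x3ffu;
    f.priority = (lr >> 23) & 0x1fu;
    f.state    = (lr >> 28) & 0x3u;
    f.grp1     = ((lr >> 30) & 1u) != 0;
    f.hw       = ((lr >> 31) & 1u) != 0;
    return f;
}
```

Under that layout, 0x9a015856 is a hardware interrupt with virtual and
physical ID 86 in the pending state, which matches "Inflight irq=86 lr=0" in
the same log.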
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel