
Re: [Xen-devel] Xen 4.5 random freeze question



No, that's for requesting a maintenance interrupt for a specific irq
when it is EOI'ed by the guest.

In our case we are requesting maintenance interrupts via UIE: a single
global maintenance interrupt when most LRs become free.
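
To make the distinction concrete, here is a minimal sketch of the two mechanisms. The bit positions are restated from my reading of the GICv2 register layout and should be checked against the definitions in Xen's gic.h; the helper names lr_is_valid() and underflow_asserted() are hypothetical and do not exist in Xen. The model assumes the underflow condition holds while UIE is set and at most one LR is valid.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions (GICv2 layout); verify against the real headers. */
#define GICH_HCR_UIE            (1u << 1)   /* global: underflow interrupt enable */
#define GICH_LR_MAINTENANCE_IRQ (1u << 19)  /* per LR: maintenance irq on guest EOI
                                               (only meaningful when HW bit is 0) */
#define GICH_LR_STATE_SHIFT     28
#define GICH_LR_STATE_MASK      0x3u

/* Hypothetical helper: an LR is valid when its state field is non-zero. */
static bool lr_is_valid(uint32_t lr)
{
    return ((lr >> GICH_LR_STATE_SHIFT) & GICH_LR_STATE_MASK) != 0;
}

/*
 * Hypothetical model of the underflow condition: signalled while UIE is
 * set and none or only one of the LRs holds a valid interrupt.
 */
static bool underflow_asserted(uint32_t hcr, const uint32_t *lrs, size_t nr_lrs)
{
    size_t valid = 0;

    for (size_t i = 0; i < nr_lrs; i++)
        if (lr_is_valid(lrs[i]))
            valid++;

    return (hcr & GICH_HCR_UIE) != 0 && valid <= 1;
}
```

Fed with the LR values from the two dumps quoted below, this model reports no underflow while all four LRs are busy, and a standing underflow condition once only HW_LR[0] remains valid and UIE is still set.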

On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> BTW - shouldn't the GICH_LR_MAINTENANCE_IRQ flag be set after
> requesting a maintenance interrupt?
> 
> On Wed, Nov 19, 2014 at 6:32 PM, Andrii Tseglytskyi
> <andrii.tseglytskyi@xxxxxxxxxxxxxxx> wrote:
> > GIC dump while requesting the interrupt:
> >
> > (XEN) GICH_LRs (vcpu 0) mask=f
> > (XEN)    HW_LR[0]=3a00001f
> > (XEN)    HW_LR[1]=9a015856
> > (XEN)    HW_LR[2]=1a00001b
> > (XEN)    HW_LR[3]=9a00e439
> > (XEN) Inflight irq=31 lr=0
> > (XEN) Inflight irq=86 lr=1
> > (XEN) Inflight irq=27 lr=2
> > (XEN) Inflight irq=57 lr=3
> > (XEN) Inflight irq=2 lr=255
> > (XEN) Pending irq=2
> >
> > On Wed, Nov 19, 2014 at 6:29 PM, Andrii Tseglytskyi
> > <andrii.tseglytskyi@xxxxxxxxxxxxxxx> wrote:
> >> On Wed, Nov 19, 2014 at 6:13 PM, Stefano Stabellini
> >> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >>> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>>> On Wed, Nov 19, 2014 at 6:01 PM, Andrii Tseglytskyi
> >>>> <andrii.tseglytskyi@xxxxxxxxxxxxxxx> wrote:
> >>>> > On Wed, Nov 19, 2014 at 5:41 PM, Stefano Stabellini
> >>>> > <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >>>> >> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>>> >>> Hi Stefano,
> >>>> >>>
> >>>> >>> On Wed, Nov 19, 2014 at 4:52 PM, Stefano Stabellini
> >>>> >>> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >>>> >>> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>>> >>> >> Hi Stefano,
> >>>> >>> >>
> >>>> >>> >> > >      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>>> >>> >> > > -        GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>>> >>> >> > > +        GICH[GICH_HCR] |= GICH_HCR_NPIE;
> >>>> >>> >> > >      else
> >>>> >>> >> > > -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>>> >>> >> > > +        GICH[GICH_HCR] &= ~GICH_HCR_NPIE;
> >>>> >>> >> > >
> >>>> >>> >> > >  }
> >>>> >>> >> >
> >>>> >>> >> > Yes, exactly
> >>>> >>> >>
> >>>> >>> >> I tried, hang still occurs with this change
> >>>> >>> >
> >>>> >>> > We need to figure out why during the hang you still have all the LRs
> >>>> >>> > busy even if you are getting maintenance interrupts that should cause
> >>>> >>> > them to be cleared.
> >>>> >>> >
> >>>> >>>
> >>>> >>> I see that I have free LRs during the maintenance interrupt:
> >>>> >>>
> >>>> >>> (XEN) gic.c:871:d0v0 maintenance interrupt
> >>>> >>> (XEN) GICH_LRs (vcpu 0) mask=0
> >>>> >>> (XEN)    HW_LR[0]=9a015856
> >>>> >>> (XEN)    HW_LR[1]=0
> >>>> >>> (XEN)    HW_LR[2]=0
> >>>> >>> (XEN)    HW_LR[3]=0
> >>>> >>> (XEN) Inflight irq=86 lr=0
> >>>> >>> (XEN) Inflight irq=2 lr=255
> >>>> >>> (XEN) Pending irq=2
> >>>> >>>
> >>>> >>> But I see that after the hang occurs, maintenance interrupts are generated
> >>>> >>> continuously. The platform keeps printing the same log until reboot.
> >>>> >>
> >>>> >> Exactly the same log? As in the one above you just pasted?
> >>>> >> That is very very suspicious.
> >>>> >
> >>>> > Yes, exactly the same log. And it looks like the LRs are flushed
> >>>> > correctly.
> >>>> >
> >>>> >>
> >>>> >> I am thinking that we are not handling GICH_HCR_UIE correctly and
> >>>> >> something we do in Xen, maybe writing to an LR register, might trigger a
> >>>> >> new maintenance interrupt immediately causing an infinite loop.
> >>>> >>
> >>>> >
> >>>> > Yes, this is what I'm thinking about. Taking into account all the collected
> >>>> > debug info, it looks like once the LRs are overloaded with SGIs, a
> >>>> > maintenance interrupt occurs.
> >>>> > It is then not handled properly and occurs again and again, so the
> >>>> > platform hangs inside its handler.
> >>>> >
> >>>> >> Could you please try this patch? It disables GICH_HCR_UIE immediately on
> >>>> >> hypervisor entry.
> >>>> >>
> >>>> >
> >>>> > Now trying.
> >>>> >
> >>>> >>
> >>>> >> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >>>> >> index 4d2a92d..6ae8dc4 100644
> >>>> >> --- a/xen/arch/arm/gic.c
> >>>> >> +++ b/xen/arch/arm/gic.c
> >>>> >> @@ -701,6 +701,8 @@ void gic_clear_lrs(struct vcpu *v)
> >>>> >>      if ( is_idle_vcpu(v) )
> >>>> >>          return;
> >>>> >>
> >>>> >> +    GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>>> >> +
> >>>> >>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >>>> >>
> >>>> >>      while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
> >>>> >> @@ -821,12 +823,8 @@ void gic_inject(void)
> >>>> >>
> >>>> >>      gic_restore_pending_irqs(current);
> >>>> >>
> >>>> >> -
> >>>> >>      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>>> >>          GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>>> >> -    else
> >>>> >> -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>>> >> -
> >>>> >>  }
> >>>> >>
> >>>> >>  static void do_sgi(struct cpu_user_regs *regs, int othercpu, enum gic_sgi sgi)
> >>>> >
> >>>>
> >>>> Heh - I don't see hangs with this patch :) But I also see that the
> >>>> maintenance interrupt doesn't occur (and no hang as a result).
> >>>> Stefano - is this expected?
> >>>
> >>> No maintenance interrupts at all? That's strange. You should be
> >>> receiving them when LRs are full and you still have interrupts pending
> >>> to be added to them.
> >>>
> >>> You could add another printk here to see if you should be receiving
> >>> them:
> >>>
> >>>      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>> +    {
> >>> +        gdprintk(XENLOG_DEBUG, "requesting maintenance interrupt\n");
> >>>          GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>> -    else
> >>> -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> -
> >>> +    }
> >>>  }
> >>>
> >>
> >> Requested properly:
> >>
> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>
> >> But it does not occur
> >>
> >>
> >>>
> >>>> >
> >>>> >
> >>>> > --
> >>>> >
> >>>> > Andrii Tseglytskyi | Embedded Dev
> >>>> > GlobalLogic
> >>>> > www.globallogic.com
> 
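
The GICH_LR values in the quoted GIC dump (HW_LR[0]=3a00001f and so on) can be unpacked by hand to cross-check them against the "Inflight irq=... lr=..." lines. The sketch below assumes the GICv2 list register layout as I understand it (virtual ID in bits 9:0, physical ID in bits 19:10 when the HW bit is set, priority in bits 27:23, state in bits 29:28, HW in bit 31). The struct lr_fields type and decode_lr() are hypothetical names for illustration, not Xen code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields of one GICH_LR value. */
struct lr_fields {
    unsigned virq;      /* virtual interrupt ID, bits 9:0 */
    unsigned pirq;      /* physical interrupt ID, bits 19:10 (HW=1 only) */
    unsigned priority;  /* bits 27:23 */
    unsigned state;     /* bits 29:28: 0 invalid, 1 pending,
                           2 active, 3 pending and active */
    bool hw;            /* bit 31: backed by a hardware interrupt */
};

/* Split a raw list register value into its fields (assumed GICv2 layout). */
static struct lr_fields decode_lr(uint32_t lr)
{
    struct lr_fields f;

    f.virq     = lr & 0x3ffu;
    f.hw       = ((lr >> 31) & 0x1u) != 0;
    f.pirq     = f.hw ? ((lr >> 10) & 0x3ffu) : 0;
    f.priority = (lr >> 23) & 0x1fu;
    f.state    = (lr >> 28) & 0x3u;
    return f;
}
```

Under that layout, 0x9a015856 decodes to a pending hardware interrupt with virtual and physical ID 86, and 0x3a00001f to a non-hardware interrupt 31 in the pending-and-active state, which agrees with the inflight list printed in the dump.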
