RE: [Xen-ia64-devel] PATCH: slightly improve stability
Argh! After 103 successful linux compiles, two of the next 10 had
a segfault. Restarting again with Anthony's updated patch
(plus Tristan's stability patch)...

> -----Original Message-----
> From: Magenheimer, Dan (HP Labs Fort Collins)
> Sent: Saturday, April 29, 2006 7:58 AM
> To: 'Xu, Anthony'; Tristan Gingold;
> xen-ia64-devel@xxxxxxxxxxxxxxxxxxx; Williamson, Alex (Linux
> Kernel Dev)
> Subject: RE: [Xen-ia64-devel] PATCH: slightly improve stability
>
> Hi Anthony --
>
> With both Tristan's stability patch and your earlier patch,
> I have completed 103 linux compiles now with no segfaults
> yet. Did you see your segfault with Tristan's patch
> included?
>
> I'll continue running over the weekend with the bits I
> have but if I see a segfault I will add in the additional
> store in Xen entry (minstate.h) from your newer patch.
>
> Dan
>
> > -----Original Message-----
> > From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
> > Sent: Saturday, April 29, 2006 12:03 AM
> > To: Magenheimer, Dan (HP Labs Fort Collins); Tristan Gingold;
> > xen-ia64-devel@xxxxxxxxxxxxxxxxxxx; Williamson, Alex (Linux
> > Kernel Dev)
> > Subject: RE: [Xen-ia64-devel] PATCH: slightly improve stability
> >
> > Hi Dan,
> >
> > Yes, we also got a segmentation fault in 1 run out of 30.
> >
> > Could you please try this new patch?
> >
> > Thanks,
> > -Anthony
> >
> > >-----Original Message-----
> > >From: Magenheimer, Dan (HP Labs Fort Collins)
> > >[mailto:dan.magenheimer@xxxxxx]
> > >Sent: April 28, 2006 22:49
> > >To: Xu, Anthony; Tristan Gingold;
> > >xen-ia64-devel@xxxxxxxxxxxxxxxxxxx;
> > >Williamson, Alex (Linux Kernel Dev)
> > >Subject: RE: [Xen-ia64-devel] PATCH: slightly improve stability
> > >
> > >Hi Anthony --
> > >
> > >I tried your patch overnight and still got a segmentation
> > >fault in 1 run out of 50. I didn't try Tristan's patch yet,
> > >so will try both at the same time next... perhaps there
> > >are two different problems that show up as the segmentation
> > >fault.
> > >
> > >Dan
> > >
> > >> -----Original Message-----
> > >> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
> > >> Sent: Thursday, April 27, 2006 9:19 PM
> > >> To: Xu, Anthony; Tristan Gingold;
> > >> xen-ia64-devel@xxxxxxxxxxxxxxxxxxx; Magenheimer, Dan (HP Labs
> > >> Fort Collins); Williamson, Alex (Linux Kernel Dev)
> > >> Subject: RE: [Xen-ia64-devel] PATCH: slightly improve stability
> > >>
> > >> Hi Tristan,
> > >> Could you please check whether this patch addresses the RSE issue?
> > >>
> > >> Yes, Intel QA team is doing the test in the meantime.
> > >>
> > >>
> > >> Thanks,
> > >> -Anthony
> > >>
> > >> >-----Original Message-----
> > >> >From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
> > >> >[mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On
> > >> >Behalf Of Xu, Anthony
> > >> >Sent: April 28, 2006 9:48
> > >> >To: Tristan Gingold; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx;
> > >> >Magenheimer, Dan (HP Labs Fort Collins); Alex Williamson
> > >> >Subject: RE: [Xen-ia64-devel] PATCH: slightly improve stability
> > >> >
> > >> >>From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
> > >> >>[mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On
> > >> >>Behalf Of Tristan Gingold
> > >> >>Sent: April 27, 2006 23:14
> > >> >>To: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx; Magenheimer, Dan
> > >> >>(HP Labs Fort Collins); Alex Williamson
> > >> >>Subject: [Xen-ia64-devel] PATCH: slightly improve stability
> > >> >>
> > >> >>Hi,
> > >> >>
> > >> >>as reported earlier, this patch seems to improve stability:
> > >> >>crashes are at least more coherent and maybe less frequent.
> > >> >>
> > >> >>RSE handling seems to have a bug: crashes are now due to
> > >> >>either a bad value in a stacked register or a use of an
> > >> >>invalid stacked register (although cfm seems correct in gdb!)
> > >> >
> > >> >I'm looking at this too,
> > >> >Yes there is a bug about handle_lazy_cover.
> > >> >
> > >> >void ia64_do_page_fault (unsigned long address, unsigned long isr, struct
> > >> >pt_regs *regs, unsigned long itir)
> > >> >{
> > >> >    unsigned long iip = regs->cr_iip, iha;
> > >> >    // FIXME should validate address here
> > >> >    unsigned long pteval;
> > >> >    unsigned long is_data = !((isr >> IA64_ISR_X_BIT) & 1UL);
> > >> >    IA64FAULT fault;
> > >> >
> > >> >    if ((isr & IA64_ISR_IR) && handle_lazy_cover(current, isr, regs)) return;
> > >> >
> > >> >This code sequence is intended to handle the following scenario.
> > >> >
> > >> >1. Guest executes br.ret, this may cause a mandatory RSE load,
> > >> >and this load may cause a TLB miss.
> > >> >2. VMM gets control, but VMM can't handle this TLB miss itself,
> > >> >so VMM injects the TLB miss to the Guest TLB miss handler. When
> > >> >VMM executes "rfi" to jump to the Guest TLB miss handler, this
> > >> >TLB miss happens again.
> > >> >3. At this time, interrupt_collection_enabled is 0, so
> > >> >handle_lazy_cover executes "cover" on behalf of Guest, and returns
> > >> >to the Guest TLB miss handler again; this time there is no TLB miss.
> > >> >
> > >> >
> > >> >The following code sequence is in the ia64_leave_kernel path with
> > >> >psr.ic and psr.i off.
> > >> >When br.ret.dptk.many b0 is executed, there may be a mandatory
> > >> >load, and thus there may be a TLB miss. According to the above
> > >> >description, handle_lazy_cover executes "cover" on behalf of Guest
> > >> >and returns to Guest; this is not correct in this scenario.
> > >> >
> > >> >I didn't find an easy way to fix this bug.
> > >> >
> > >> >
> > >> >    mov loc6=0
> > >> >    mov loc7=0
> > >> >(pRecurse) br.call.dptk.few b0=rse_clear_invalid
> > >> >    ;;
> > >> >    mov loc8=0
> > >> >    mov loc9=0
> > >> >    cmp.ne pReturn,p0=r0,in1 // if recursion count != 0, we need to do a br.ret
> > >> >    mov loc10=0
> > >> >    mov loc11=0
> > >> >(pReturn) br.ret.dptk.many b0
> > >> >#endif /* !CONFIG_ITANIUM */
> > >> ># undef pRecurse
> > >> ># undef pReturn
> > >> >    ;;
> > >> >    alloc r17=ar.pfs,0,0,0,0 // drop current register frame
> > >> >    ;;
> > >> >    loadrs
> > >> >
> > >> >Thanks,
> > >> >Anthony
> > >> >
> > >> >
> > >> >>
> > >> >>Tested by doing many linux kernel compilation in SMP domU (> 100).
> > >> >>
> > >> >>Tristan.
> > >> >
> > >> >_______________________________________________
> > >> >Xen-ia64-devel mailing list
> > >> >Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> > >> >http://lists.xensource.com/xen-ia64-devel
> > >> >

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel
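
Editorial sketch of the problem Anthony describes in the thread: the check in ia64_do_page_fault appears to key off two things, the IR bit in isr and the guest's interrupt_collection_enabled being 0, and both hold in the reflected-TLB-miss case as well as in the ia64_leave_kernel br.ret case. The toy C model below only illustrates that ambiguity. Every name in it (struct fault_ctx, would_lazy_cover, and so on) is hypothetical and is not taken from the Xen source, and the body of the real handle_lazy_cover is not shown in the thread, so treat the predicate as an assumption based on the description above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model only. All names are hypothetical; this encodes the check
 * as described in the thread, not the actual Xen implementation.
 */
struct fault_ctx {
    bool isr_ir;           /* IR bit set in isr when the fault was taken */
    bool guest_ic_enabled; /* guest's virtual interruption collection */
    bool in_leave_kernel;  /* ground truth; the check below cannot see it */
};

/* Assumed decision: "cover" on the guest's behalf if IR is set and ic is off. */
bool would_lazy_cover(const struct fault_ctx *c)
{
    return c->isr_ir && !c->guest_ic_enabled;
}

/* Steps 1-3 in the thread: the TLB miss recurs on rfi into the guest handler. */
struct fault_ctx reflected_tlb_miss(void)
{
    struct fault_ctx c = { .isr_ir = true, .guest_ic_enabled = false,
                           .in_leave_kernel = false };
    return c;
}

/* ia64_leave_kernel case: br.ret faults with psr.ic and psr.i off. */
struct fault_ctx leave_kernel_br_ret(void)
{
    struct fault_ctx c = { .isr_ir = true, .guest_ic_enabled = false,
                           .in_leave_kernel = true };
    return c;
}

/* Returns true when the assumed check treats both scenarios the same way. */
bool scenarios_look_identical(void)
{
    struct fault_ctx a = reflected_tlb_miss();
    struct fault_ctx b = leave_kernel_br_ret();
    return would_lazy_cover(&a) == would_lazy_cover(&b);
}

/* Sanity check of the model: both cases trigger the cover path. */
void self_check(void)
{
    assert(scenarios_look_identical());
}
```

In this model the only field that separates the two cases is in_leave_kernel, which the predicate never reads. That matches Anthony's remark that there is no easy fix: any fix would need some extra piece of state that tells the two situations apart.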