RE: [Xen-ia64-devel] PATCH: slightly improve stability
Hi Dan,

Yes, we also got a segmentation fault in 1 run out of 30.
Could you please try this new patch?

Thanks,
-Anthony

>-----Original Message-----
>From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
>Sent: April 28, 2006 22:49
>To: Xu, Anthony; Tristan Gingold; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx;
>Williamson, Alex (Linux Kernel Dev)
>Subject: RE: [Xen-ia64-devel] PATCH: slightly improve stability
>
>Hi Anthony --
>
>I tried your patch overnight and still got a segmentation
>fault in 1 run out of 50. I didn't try Tristan's patch yet,
>so will try both at the same time next... perhaps there
>are two different problems that show up as the segmentation
>fault.
>
>Dan
>
>> -----Original Message-----
>> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
>> Sent: Thursday, April 27, 2006 9:19 PM
>> To: Xu, Anthony; Tristan Gingold;
>> xen-ia64-devel@xxxxxxxxxxxxxxxxxxx; Magenheimer, Dan (HP Labs
>> Fort Collins); Williamson, Alex (Linux Kernel Dev)
>> Subject: RE: [Xen-ia64-devel] PATCH: slightly improve stability
>>
>> Hi Tristan,
>> Could you please check whether this patch addresses the RSE issue?
>>
>> Yes, the Intel QA team is running the test in the meantime.
>>
>>
>> Thanks,
>> -Anthony
>>
>> >-----Original Message-----
>> >From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>> >[mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Xu, Anthony
>> >Sent: April 28, 2006 9:48
>> >To: Tristan Gingold; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx;
>> >Magenheimer, Dan (HP Labs Fort Collins); Alex Williamson
>> >Subject: RE: [Xen-ia64-devel] PATCH: slightly improve stability
>> >
>> >>From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>> >>[mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Tristan
>> >>Gingold
>> >>Sent: April 27, 2006 23:14
>> >>To: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx; Magenheimer, Dan (HP Labs Fort
>> >>Collins); Alex Williamson
>> >>Subject: [Xen-ia64-devel] PATCH: slightly improve stability
>> >>
>> >>Hi,
>> >>
>> >>as reported earlier, this patch seems to improve stability: crashes are at
>> >>least more coherent and maybe less frequent.
>> >>
>> >>RSE handling seems to have a bug: crashes are now due to either a bad
>> >>value in a stacked register or a use of an invalid stacked register
>> >>(although cfm seems correct in gdb!)
>> >
>> >I'm looking at this too.
>> >Yes, there is a bug in how handle_lazy_cover is used.
>> >
>> >void ia64_do_page_fault (unsigned long address, unsigned long isr, struct
>> >pt_regs *regs, unsigned long itir)
>> >{
>> >    unsigned long iip = regs->cr_iip, iha;
>> >    // FIXME should validate address here
>> >    unsigned long pteval;
>> >    unsigned long is_data = !((isr >> IA64_ISR_X_BIT) & 1UL);
>> >    IA64FAULT fault;
>> >
>> >    if ((isr & IA64_ISR_IR) && handle_lazy_cover(current, isr, regs)) return;
>> >
>> >This code sequence is intended to handle the following scenario.
>> >
>> >1. The guest executes br.ret, which may cause a mandatory RSE load, and
>> >this load may cause a TLB miss.
>> >2. The VMM gets control but can't handle this TLB miss itself, so it
>> >injects the TLB miss into the guest TLB miss handler. When the VMM
>> >executes "rfi" to jump to the guest TLB miss handler, the same TLB miss
>> >happens again.
>> >3. At this time, interrupt_collection_enabled is 0, so handle_lazy_cover
>> >executes "cover" on behalf of the guest and returns to the guest TLB miss
>> >handler again; this time there is no TLB miss.
>> >
>> >
>> >The following code sequence is in the ia64_leave_kernel path, with psr.ic
>> >and psr.i off. When br.ret.dptk.many b0 is executed, there may be a
>> >mandatory load and thus a TLB miss. According to the description above,
>> >handle_lazy_cover then executes "cover" on behalf of the guest and returns
>> >to the guest, which is not correct in this scenario.
>> >
>> >I didn't find an easy way to fix this bug.
>> >
>> >
>> >    mov loc6=0
>> >    mov loc7=0
>> >(pRecurse) br.call.dptk.few b0=rse_clear_invalid
>> >    ;;
>> >    mov loc8=0
>> >    mov loc9=0
>> >    cmp.ne pReturn,p0=r0,in1 // if recursion count != 0, we need to do a br.ret
>> >    mov loc10=0
>> >    mov loc11=0
>> >(pReturn) br.ret.dptk.many b0
>> >#endif /* !CONFIG_ITANIUM */
>> ># undef pRecurse
>> ># undef pReturn
>> >    ;;
>> >    alloc r17=ar.pfs,0,0,0,0 // drop current register frame
>> >    ;;
>> >    loadrs
>> >
>> >Thanks,
>> >Anthony
>> >
>> >
>> >>
>> >>Tested by doing many Linux kernel compilations in SMP domU (> 100).
>> >>
>> >>Tristan.
>> >
>> >_______________________________________________
>> >Xen-ia64-devel mailing list
>> >Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> >http://lists.xensource.com/xen-ia64-devel

Attachment: RSE_incomplete_cfm.patch
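
For readers following the thread, the lazy-cover check quoted above can be modelled roughly as below. This is only a sketch, not the Xen source: the struct, the `sketch_` names, and the `cover_emulated` flag are hypothetical stand-ins, and the ISR.ir bit position (38) is assumed from the Linux ia64 register definitions. It shows that the decision depends only on ISR.ir and the guest's virtual psr.ic, so a mandatory RSE load that faults in the ia64_leave_kernel path would look the same to this check as the guest br.ret case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: ISR.ir is bit 38, as in the Linux ia64 register definitions. */
#define SKETCH_ISR_IR (UINT64_C(1) << 38)

/* Hypothetical, heavily simplified stand-in for the per-vcpu state. */
struct sketch_vcpu {
    bool interrupt_collection_enabled; /* guest's virtual psr.ic */
    bool cover_emulated;               /* set when "cover" is done for the guest */
};

/*
 * Model of the check in ia64_do_page_fault as described in the thread:
 * if the fault came from an incomplete register frame (ISR.ir set) while
 * the guest has interrupt collection off, do the "cover" on the guest's
 * behalf and report the fault as handled.
 */
bool sketch_lazy_cover(struct sketch_vcpu *v, uint64_t isr)
{
    if (!(isr & SKETCH_ISR_IR))
        return false;                  /* not an RSE-related fault */
    if (v->interrupt_collection_enabled)
        return false;                  /* deliver the fault normally */
    v->cover_emulated = true;          /* emulate "cover" for the guest */
    return true;                       /* caller returns and retries */
}
```

Nothing in the inputs to this check records where the faulting br.ret was executed, which is consistent with the observation in the thread that there is no easy fix.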