RE: [Xen-ia64-devel] RE: [PATCH] Patch to make latest hg multi-domain back to work
Hi, Dan,

Could you elaborate on how your latest patch works differently and fixes the potential issue?

-               *pteval = vcpu->arch.dtlb_pte;
+               if (vcpu->domain==dom0 && !in_tpa) *pteval = trp->page_flags;
+               else *pteval = vcpu->arch.dtlb_pte;
+               printf("DTLB MATCH... NEW, DOM%s, %s\n", vcpu->domain==dom0?
+                       "0":"U", in_tpa?"vcpu_tpa":"ia64_do_page_fault");

The new limitation seems to apply only to dom0, while dom0 has exactly the same guest physical addresses as machine addresses. Based on this assumption, doesn't trp->page_flags actually equal the guest pte (vcpu->arch.dtlb_pte)? So I'm not sure what the trick behind this is.

Thanks,
Kevin

>-----Original Message-----
>From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>[mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Tian, Kevin
>Sent: September 8, 2005 17:16
>To: Magenheimer, Dan (HP Labs Fort Collins); Byrne, John (HP Labs)
>Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>Subject: [Xen-ia64-devel] RE: [PATCH] Patch to make latest hg multi-domain back to work
>
>Still works for me.
>
>Thanks,
>Kevin
>
>>-----Original Message-----
>>From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
>>Sent: September 8, 2005 4:57
>>To: Tian, Kevin; Byrne, John (HP Labs)
>>Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>Subject: RE: [PATCH] Patch to make latest hg multi-domain back to work
>>
>>It appears that the patch below has created some instability
>>in domain0. I regularly see a crash now in domain0 when
>>compiling linux. I changed back to the old code and the
>>crash seems to go away. Since it is unpredictable, I
>>changed back to the new code AND added printfs around
>>the new code in vcpu_translate and domain0 fails immediately after
>>the printf (but ONLY when it is called from ia64_do_page_fault...
>>it's OK when called from vcpu_tpa).
>>
>>The attached patch returns stability to the system.
>>It is definitely not a final patch (for example it's not
>>SMP-safe), but I thought I would
>>post it in case anybody is trying to get some work done and
>>domain0 keeps crashing intermittently.
>>
>>Kevin, John, I still haven't successfully reproduced your
>>multi-domain success, so please try this patch with
>>the second domain.
>>
>>Thanks,
>>Dan
>>
>>> -----Original Message-----
>>> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
>>> Sent: Friday, September 02, 2005 8:18 AM
>>> To: Magenheimer, Dan (HP Labs Fort Collins); Byrne, John (HP Labs)
>>> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>> Subject: [PATCH] Patch to make latest hg multi-domain back to work
>>>
>>> I saw some intermittent/weird behavior on latest xen-ia64-unstable.hg
>>> (Rev 6461), where sometimes I can log in to the xenU shell, sometimes
>>> it hangs after "Mounting root fs...", and sometimes the whole system
>>> breaks as follows:
>>>
>>> (XEN) ia64_fault: General Exception: IA-64 Reserved Register/Field fault
>>> (data access): reflecting
>>> (XEN) $$$$$ PANIC in domain 1 (k6=f000000007fd8000): psr.ic off, delivering
>>> fault=5300,ipsr=0000121208026010,iip=a00000010000cd00,ifa=f000000007fdfd60,isr=00000a0c00000004,PSCB.iip*** ADD REGISTER DUMP HERE FOR DEBUGGING
>>> (XEN) BUG at domain.c:311
>>> (XEN) priv_emulate: priv_handle_op fails, isr=0000000000000000
>>> (XEN)
>>>
>>> Finally I found the root cause: match_dtlb should return the guest
>>> pte instead of the machine pte, because translate_machine_pte is
>>> always invoked after vcpu_translate. translate_machine_pte expects
>>> a guest pte and walks the 3-level tables to get the machine frame
>>> number. Why does it happen so rarely?
>>> - For xen0, guest pfn == machine pfn, so nothing happens.
>>> - For xenU, currently there's only one vtlb entry to cache the
>>> latest inserted TC entry. Say the current vtlb entry for VA1 has been
>>> inserted into the machine TLB.
>>> Normally there'll be many itc issued before the
>>> machine TC for VA1 is purged. Those insertions will change the single vtlb
>>> entry. So in 99.99% of cases, once the guest va is purged out of the machine
>>> TLB/vhpt and triggers a TLB miss again, match_dtlb will fail.
>>>
>>> But there's also a corner case where the vtlb entry has not been updated but
>>> the machine TC entry for VA1 has been purged. In this case, if the guest
>>> accesses VA1 immediately, match_dtlb will return true and then the
>>> problematic code becomes the murderer.
>>>
>>> For example, sometimes I saw:
>>> (XEN) translate_domain_pte: bad mpa=000000007f170080 (> 0000000010004000),vadr=5fffff0000000080,pteval=000000007f170561,itir=0000000000000038
>>> (XEN) lookup_domain_mpa: bad mpa 000000007f170080 (> 0000000010004000
>>> The above access happens when vcpu_translate tries to access the guest SVHPT.
>>> You can see 0x7f170080 is actually a machine pfn. When 0x7f170080 is
>>> passed into translate_machine_pte, the warning shows and it's finally mapped
>>> to machine pfn 0. (Maybe we can change such an error condition to a panic,
>>> instead of returning an incorrect pfn.)
>>>
>>> Then things all went weird:
>>> (XEN) translate_domain_pte: bad mpa=0000eef3f000e738 (> 0000000010004000),vadr=4000000000042738,pteval=f000eef3f000eef3,itir=0000000000026238
>>> (XEN) lookup_domain_mpa: bad mpa 0000eef3f000e738 (> 0000000010004000
>>>
>>> And finally a GP fault happens. This error has actually been hidden for
>>> a long time, but was seldom triggered.
>>>
>>> John, please run a test on your side with all the patches I sent out
>>> today (including the max_page one). I believe we can call it an end now.
>>> ;-)
>>>
>>> BTW, Dan, there are two heads on current xen-ia64-unstable.hg. Please do a
>>> merge.
>>>
>>> Signed-off-by Kevin Tian <Kevin.tian@xxxxxxxxx>
>>>
>>> diff -r 68d8a0a1aeb7 xen/arch/ia64/xen/vcpu.c
>>> --- a/xen/arch/ia64/xen/vcpu.c Thu Sep 1 21:51:57 2005
>>> +++ b/xen/arch/ia64/xen/vcpu.c Fri Sep 2 21:30:01 2005
>>> @@ -1315,7 +1315,8 @@
>>>      /* check 1-entry TLB */
>>>      if ((trp = match_dtlb(vcpu,address))) {
>>>          dtlb_translate_count++;
>>> -        *pteval = trp->page_flags;
>>> +        //*pteval = trp->page_flags;
>>> +        *pteval = vcpu->arch.dtlb_pte;
>>>          *itir = trp->itir;
>>>          return IA64_NO_FAULT;
>>>      }
>>>
>>> Thanks,
>>> Kevin
>>>
>
>_______________________________________________
>Xen-ia64-devel mailing list
>Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>http://lists.xensource.com/xen-ia64-devel

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel