
RE: [Xen-ia64-devel] [PATCH] Fix severe vtlb BUG which made domU crash randomly


  • To: "Tian, Kevin" <kevin.tian@xxxxxxxxx>, <xen-ia64-devel@xxxxxxxxxxxxxxxxxxx>
  • From: "Magenheimer, Dan (HP Labs Fort Collins)" <dan.magenheimer@xxxxxx>
  • Date: Thu, 15 Sep 2005 10:56:03 -0700
  • Delivery-date: Thu, 15 Sep 2005 17:53:43 +0000
  • List-id: Discussion of the ia64 port of Xen <xen-ia64-devel.lists.xensource.com>
  • Thread-index: AcW53y81Gq7TyxJfTdOCLA6YkyxvfwAPmF4Q
  • Thread-topic: [Xen-ia64-devel] [PATCH] Fix severe vtlb BUG which made domU crash randomly

Excellent!  BTW, this appears to have fixed (or at least moved) the
"hg clone" segfault I was seeing.  "hg clone" now works, but
"hg update" segfaulted on me at least once; that
could be unrelated.

The "DTLB MATCH" debug message -- which used to be very
rare -- is now quite frequent, which supports your theory
that the vtlb will get more hits.

Related topic: In another message you said that you were
having some hg problems... was that on xenlinux or linux? 

> -----Original Message-----
> From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx 
> [mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf 
> Of Tian, Kevin
> Sent: Thursday, September 15, 2005 4:21 AM
> To: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> Subject: [Xen-ia64-devel] [PATCH] Fix severe vtlb BUG which
> made domU crash randomly
> 
> When I was debugging domU, the machine often crashed while domU was
> detecting the IDE disk (which needs to be disabled later with different
> config files for xen0/xenU), and this prevented further progress. After
> some investigation, the root cause is that the page fault handler invokes
> vcpu_itc_no_srlz with the 5th param (guest pte) set to (-1UL). However,
> vcpu_itc_no_srlz does not check for this deliberate value, so it stores
> (-1UL) into the copy of the guest pte (vcpu->arch.dtlb_pte).
> 
> However, match_dtlb only checks the machine pte (vcpu->arch.dtlb), which
> is still valid. Once this entry is hit in the next page fault, (-1UL)
> is returned as the guest pte value, and the later translation to a
> machine pfn fails.
> 
> I think this can explain why Dan saw dom0 become unstable after my
> previous vtlb fix. After Dan pushed out a temporary solution for dom0,
> the issue was left only in domU. I'm not sure why I didn't see this
> symptom earlier, but the big merges in the middle may be the cause.
> John, maybe you can give it a try, since you have seen domU crashes for
> a long time (but it has to be on top of your previous Rev). ;-)
> 
> A side effect of this change is that the vtlb can be hit more frequently
> than before. I didn't remove the print line after match_dtlb, so you
> can see it occurring more often. The possible reason is that the guest
> svhpt entry will live longer now.
> 
> (BTW, this fix doesn't bring domU up, and blkfront still fails to
> connect. But it provides a pretty stable/reproducible env now.)
> 
> Thanks,
> Kevin
> 
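
For readers following the description above, here is a minimal,
self-contained sketch of the failure mode in C. It is not the patch
itself (the patch is not included in this message). The struct, the
function names prefixed with sketch_, the dtlb_vaddr field, and the
PTE_PRESENT bit are all hypothetical stand-ins; only the (-1UL)
sentinel, the machine pte (dtlb), and the guest pte copy (dtlb_pte) come
from the description. The guarded variant shows one plausible check,
namely not caching an entry at all when the guest pte is the sentinel.

```c
#include <assert.h>
#include <stdbool.h>

#define INVALID_GUEST_PTE (~0UL)   /* the (-1UL) sentinel from the message */
#define PTE_PRESENT       0x1UL    /* assumed "valid" bit for this sketch */

/* Hypothetical stand-in for the per-vcpu one-entry data TLB state. */
struct sketch_vcpu {
    unsigned long dtlb_vaddr;  /* address the cached entry covers (assumed) */
    unsigned long dtlb;        /* machine pte: the only field the match checks */
    unsigned long dtlb_pte;    /* copy of the guest pte */
};

/* Unguarded insert: stores whatever guest pte the caller passes in,
 * including the sentinel. This models the behaviour described as buggy. */
static void sketch_itc_unguarded(struct sketch_vcpu *v, unsigned long vaddr,
                                 unsigned long mpte, unsigned long gpte)
{
    v->dtlb_vaddr = vaddr;
    v->dtlb = mpte;
    v->dtlb_pte = gpte;
}

/* Guarded insert: one possible check (an assumption, not the real patch).
 * A sentinel guest pte means "do not cache", so the entry is left alone. */
static void sketch_itc_guarded(struct sketch_vcpu *v, unsigned long vaddr,
                               unsigned long mpte, unsigned long gpte)
{
    if (gpte == INVALID_GUEST_PTE)
        return;
    v->dtlb_vaddr = vaddr;
    v->dtlb = mpte;
    v->dtlb_pte = gpte;
}

/* Lookup that validates only the machine pte, as described for match_dtlb.
 * On a hit it hands back the stored guest pte without inspecting it. */
static bool sketch_match_dtlb(const struct sketch_vcpu *v, unsigned long vaddr,
                              unsigned long *gpte_out)
{
    assert(v && gpte_out);
    if (!(v->dtlb & PTE_PRESENT) || v->dtlb_vaddr != vaddr)
        return false;
    *gpte_out = v->dtlb_pte;
    return true;
}
```

With the unguarded insert, a later lookup for the same address hits
(the machine pte is valid) and returns the sentinel as the guest pte.
With the guarded insert, the sentinel never reaches the cached entry, so
the lookup misses and the caller has to take the slow path instead.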

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel


 

