RE: [Xen-ia64-devel] [PATCH] fixed some bugs to make xen0 more stable


  • To: "Xu, Anthony" <anthony.xu@xxxxxxxxx>
  • From: "Magenheimer, Dan (HP Labs Fort Collins)" <dan.magenheimer@xxxxxx>
  • Date: Fri, 14 Oct 2005 12:57:13 -0700
  • Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Fri, 14 Oct 2005 19:54:27 +0000
  • List-id: Discussion of the ia64 port of Xen <xen-ia64-devel.lists.xensource.com>
  • Thread-index: AcXP2HRI83rf2/ODRBuc5moDrgGuwgAGusrwAAqDThAABKMkoAAL0j3AAAZQlCAAHDZ80AADyA3w
  • Thread-topic: [Xen-ia64-devel] [PATCH] fixed some bugs to make xen0 more stable

After 12 successful builds, I got two in a row that failed
with a segmentation fault. :-(  Since the heartbeat is now turned off,
I can see that Xen is giving a clue as to what the problem is.
Both times, even though the failure showed up at a different
place in the build, I got an identical non-fatal message:

vcpu_translate: bad address: 0000000005a65a69, viip=2000000000163750,
 vipsr=00001213081a6018,  iip=20000000001d6180, ipsr=0000101308126018

I wonder what that address is... I have seen it before.
Perhaps it is predicates?
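
As a starting point for decoding the message, here is a minimal sketch.
It assumes the standard ia64 layout in which bits 63:61 of a virtual
address select the region, and it simply XORs the two psr values to
find the bit positions where they differ. The helper names are made up
for illustration and are not part of Xen.

```python
# Values copied verbatim from the vcpu_translate message above.
BAD_ADDR = 0x0000000005a65a69
VIIP = 0x2000000000163750
IIP = 0x20000000001d6180
VIPSR = 0x00001213081a6018
IPSR = 0x0000101308126018


def region(vaddr):
    """Return bits 63:61 of a 64-bit virtual address (assumed ia64 region number)."""
    return (vaddr >> 61) & 0x7


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(64) if (diff >> bit) & 1]


if __name__ == "__main__":
    for name, value in (("bad address", BAD_ADDR), ("viip", VIIP), ("iip", IIP)):
        print(f"{name}: {value:#018x} region {region(value)}")
    print("vipsr/ipsr differ at bits:", differing_bits(VIPSR, IPSR))
```

Under that assumption the bad address falls in region 0 while viip and
iip are both in region 1, and vipsr and ipsr differ only at bits 19
and 41. This narrows down what to look at but does not by itself
explain where the address comes from.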

I won't have much of an opportunity to look further into this
for a while, so I wanted to post what I've seen to date.

Dan

> -----Original Message-----
> From: Magenheimer, Dan (HP Labs Fort Collins) 
> Sent: Friday, October 14, 2005 12:05 PM
> To: Xu, Anthony
> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-ia64-devel] [PATCH] fixed some bugs to make 
> xen0 more stable
> 
> There were definitely some bugs involving the itir in
> vcpu_translate.  In the process of fixing them,
> I was over-aggressive in cleaning up some code.
> When I backed out some of that cleanup, everything
> seems to be fine.  (I still get a couple of NaT fault
> messages every compile, but they seem to be harmless.)
> 
> The segfault problem occurs rarely enough that I don't
> know whether I fixed it, but I have now run 9 builds
> without a problem and I definitely fixed some itir
> problems, so I have committed the changeset to
> xen-ia64-unstable.
> 
> > -----Original Message-----
> > From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx 
> > [mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf 
> > Of Magenheimer, Dan (HP Labs Fort Collins)
> > Sent: Thursday, October 13, 2005 10:37 PM
> > To: Xu, Anthony
> > Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: RE: [Xen-ia64-devel] [PATCH] fixed some bugs to make 
> > xen0 more stable
> > 
> > In my testing, I now saw what appeared to be an infinite loop
> > of NaT faults.  The "ps" command showed a "sh" with several
> > minutes of CPU time while the console window scrolled continually
> > with "NaT fault... attempting to handle as privop".  This may
> > or may not be a side effect of the patch I am testing.  I'll
> > see if it shows up again (but am logging off now until the
> > morning).
> > 
> > > -----Original Message-----
> > > From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx] 
> > > Sent: Thursday, October 13, 2005 8:41 PM
> > > To: Magenheimer, Dan (HP Labs Fort Collins)
> > > Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> > > Subject: RE: [Xen-ia64-devel] [PATCH] fixed some bugs to make 
> > > xen0 more stable
> > > 
> > > We shouldn't see any NaT faults, and I didn't see any NaT
> > > faults in my test.
> > > 
> > > 
> > > >-----Original Message-----
> > > >From: Magenheimer, Dan (HP Labs Fort Collins) 
> > > [mailto:dan.magenheimer@xxxxxx]
> > > >Sent: October 14, 2005 3:59
> > > >To: Xu, Anthony
> > > >Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> > > >Subject: RE: [Xen-ia64-devel] [PATCH] fixed some bugs to 
> > > make xen0 more stable
> > > >
> > > >> However, my testing is not going well so far.  I had just
> > > >> completed compiling Linux 15 times on tip (with Tristan's
> > > >> SMP patch) without any problems, but 2 of 5 runs so far with
> > > >> this new patch failed with segmentation faults.
> > > >
> > > >Followed by six successful builds :-%
> > > >
> > > >I'm going to assume this is a random occurrence of a bug
> > > >unrelated to your patch that happens to occur only every
> > > >few hours or so and will commit your patch.
> > > >
> > > >By the way, I am now seeing two NaT faults per Linux build
> > > >that are printing "attempting to handle as privop."
> > > >I assume your fix exposed these but the messages are
> > > >harmless?
> > > >
> > > >Dan
> > > 
> > 
> 
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel

 

