
RE: [Xen-ia64-devel] [PATCH] fixed some bugs to make xen0 more stable


  • To: "Magenheimer, Dan \(HP Labs Fort Collins\)" <dan.magenheimer@xxxxxx>
  • From: "Xu, Anthony" <anthony.xu@xxxxxxxxx>
  • Date: Tue, 18 Oct 2005 12:39:12 +0800
  • Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 18 Oct 2005 04:37:08 +0000
  • List-id: Discussion of the ia64 port of Xen <xen-ia64-devel.lists.xensource.com>
  • Thread-index: AcXP2HRI83rf2/ODRBuc5moDrgGuwgAGusrwAAqDThAABKMkoAAL0j3AAAZQlCAAHDZ80AADyA3wAHMgqiAANiI6oA==
  • Thread-topic: [Xen-ia64-devel] [PATCH] fixed some bugs to make xen0 more stable

Yes, I need to wait a very long time to trigger this; the build process is very slow on 
my machine. Can we leave it alone and revisit it later?

>-----Original Message-----
>From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
>Sent: October 17, 2005 10:49
>To: Xu, Anthony
>Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>Subject: RE: [Xen-ia64-devel] [PATCH] fixed some bugs to make xen0 more stable
>
>I ran tests all weekend long.  59 out of 60 builds were
>successful.  One failed, with the same message as below.
>At least it is reproducible... if you wait long enough :-(
>
>> -----Original Message-----
>> From: Magenheimer, Dan (HP Labs Fort Collins)
>> Sent: Friday, October 14, 2005 1:57 PM
>> To: 'Xu, Anthony'
>> Cc: 'xen-ia64-devel@xxxxxxxxxxxxxxxxxxx'
>> Subject: RE: [Xen-ia64-devel] [PATCH] fixed some bugs to make
>> xen0 more stable
>>
>> After 12 successful builds, I got two in a row that failed
>> with a segmentation fault. :-(  Since the heartbeat is now turned off,
>> I can see that Xen is giving a clue as to what the problem is.
>> When both faults happened, even though the failure showed up at
>> a different place in the build, I got an identical non-fatal message:
>>
>> vcpu_translate: bad address: 0000000005a65a69, viip=2000000000163750,
>>  vipsr=00001213081a6018,  iip=20000000001d6180, ipsr=0000101308126018
>>
>> I wonder what that address is... I have seen it before.
>> Perhaps it is predicates?
>>
>> I won't have much of an opportunity to look further into this
>> for a while, so I wanted to post what I've seen to date.
>>
>> Dan
>>
>> > -----Original Message-----
>> > From: Magenheimer, Dan (HP Labs Fort Collins)
>> > Sent: Friday, October 14, 2005 12:05 PM
>> > To: Xu, Anthony
>> > Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> > Subject: RE: [Xen-ia64-devel] [PATCH] fixed some bugs to make
>> > xen0 more stable
>> >
>> > There were definitely some bugs involving the itir in
>> > vcpu_translate.  In the process of fixing them,
>> > I was over-aggressive in cleaning up some code.
>> > When I backed out some of that cleanup, everything
>> > seems to be fine.  (I still get a couple of NaT fault
>> > messages every compile, but they seem to be harmless.)
>> >
>> > The segfault problem occurs rarely enough that I don't
>> > know if I fixed it but have run 9 builds without
>> > a problem now and I definitely fixed some itir
>> > problems, so I have committed the changeset to
>> > xen-ia64-unstable.
>> >
>> > > -----Original Message-----
>> > > From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>> > > [mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf
>> > > Of Magenheimer, Dan (HP Labs Fort Collins)
>> > > Sent: Thursday, October 13, 2005 10:37 PM
>> > > To: Xu, Anthony
>> > > Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> > > Subject: RE: [Xen-ia64-devel] [PATCH] fixed some bugs to make
>> > > xen0 more stable
>> > >
>> > > In my testing, I now saw what appeared to be an infinite loop
>> > > of NaT faults.  The "ps" command showed a "sh" with several
>> > > minutes of CPU time while the console window scrolled continually
>> > > with "NaT fault... attempting to handle as privop".  This may
>> > > or may not be a side effect of the patch I am testing.  I'll
>> > > see if it shows up again (but am logging off now until the
>> > > morning).
>> > >
>> > > > -----Original Message-----
>> > > > From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
>> > > > Sent: Thursday, October 13, 2005 8:41 PM
>> > > > To: Magenheimer, Dan (HP Labs Fort Collins)
>> > > > Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> > > > Subject: RE: [Xen-ia64-devel] [PATCH] fixed some bugs to make
>> > > > xen0 more stable
>> > > >
>> > > > We shouldn't see any NaT faults. And I didn't see NaT faults
>> > > > in my tests.
>> > > >
>> > > >
>> > > > >-----Original Message-----
>> > > > >From: Magenheimer, Dan (HP Labs Fort Collins)
>> > > > [mailto:dan.magenheimer@xxxxxx]
>> > > > >Sent: October 14, 2005 3:59
>> > > > >To: Xu, Anthony
>> > > > >Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> > > > >Subject: RE: [Xen-ia64-devel] [PATCH] fixed some bugs to
>> > > > make xen0 more stable
>> > > > >
>> > > > >> However, my testing is not going well so far.  I had just
>> > > > >> completed compiling Linux 15 times on tip (with Tristan's
>> > > > >> SMP patch) without any problems, but 2 of 5 runs so far with
>> > > > >> this new patch failed with segmentation faults.
>> > > > >
>> > > > >Followed by six successful builds :-%
>> > > > >
>> > > > >I'm going to assume this is a random occurrence of a bug
>> > > > >unrelated to your patch that happens to occur only every
>> > > > >few hours or so and will commit your patch.
>> > > > >
>> > > > >By the way, I am now seeing two NaT faults per Linux build
>> > > > >that are printing "attempting to handle as privop."
>> > > > >I assume your fix exposed these but the messages are
>> > > > >harmless?
>> > > > >
>> > > > >Dan
>> > > >
>> > >
>> >
>>
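
For anyone collecting these messages across long test runs, here is a minimal
sketch of pulling the fields out of the vcpu_translate message quoted above so
that repeated occurrences can be compared. The field names (viip, vipsr, iip,
ipsr) come from the message itself; the function name, the regular expressions,
and the idea of scanning a captured console log are assumptions made for
illustration and are not part of Xen.

```python
import re

# Patterns for the message quoted in this thread. Both regexes and the
# function name below are hypothetical helpers, not anything Xen provides.
ADDR_RE = re.compile(r"vcpu_translate: bad address: ([0-9a-fA-F]+)")
FIELD_RE = re.compile(r"\b(viip|vipsr|iip|ipsr)=([0-9a-fA-F]+)")


def parse_bad_address(text):
    """Return a dict of integer fields, or None if the message is absent."""
    addr = ADDR_RE.search(text)
    if addr is None:
        return None
    fields = {"address": int(addr.group(1), 16)}
    # The message wraps over two lines, so scan the whole text for fields.
    for name, value in FIELD_RE.findall(text):
        fields[name] = int(value, 16)
    return fields


message = (
    "vcpu_translate: bad address: 0000000005a65a69, viip=2000000000163750,\n"
    " vipsr=00001213081a6018,  iip=20000000001d6180, ipsr=0000101308126018"
)
parsed = parse_bad_address(message)
for name, value in parsed.items():
    print(f"{name:8s} 0x{value:016x}")
```

Comparing the parsed address and viip between failures would show whether the
fault always lands on the same instruction, which is the question raised above.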

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel


 

