
Re: [Xen-devel] Xend crashes, how to debug?


  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: Dennis Krul <dweazle@xxxxxxxxx>
  • Date: Tue, 1 Dec 2009 16:27:20 +0100
  • Delivery-date: Tue, 01 Dec 2009 07:27:47 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On Mon, Nov 30, 2009 at 10:02 PM, Dave Scott <Dave.Scott@xxxxxxxxxxxxx> wrote:

> The observation that speeding up xenstore reduces the frequency of crashes is interesting. Perhaps the failure happens when a concurrent transaction causes an abort? Maybe you could provoke it by running 'xm create' in a loop while also writing somewhere in xenstore? IIRC (although I could be mistaken) the standard C xenstore considers all concurrent transactions to be conflicting even if they operate on disjoint parts of the tree, so provoking an abort would be easy.

Hey Dave,

Thanks for responding! This actually sounds quite plausible.
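
A minimal sketch of the test Dave suggests, assuming the stock 'xm' and 'xenstore-write' command-line tools are on the PATH in dom0. The config path, the domain name and the xenstore key '/tool/stress/counter' are placeholders made up for illustration, not anything from our setup. One thread writes to xenstore continuously while the main loop creates and destroys a guest, so the two should overlap often enough to force transaction aborts if that really is the trigger.

```python
import subprocess
import threading
import time


def create_cmd(config_path):
    # 'xm create <config>' starts a domain from a config file.
    return ["xm", "create", config_path]


def destroy_cmd(domain_name):
    # 'xm destroy <name>' tears the domain down again.
    return ["xm", "destroy", domain_name]


def write_cmd(path, value):
    # 'xenstore-write <path> <value>' writes a single key.
    return ["xenstore-write", path, str(value)]


def hammer_xenstore(stop, path, run=subprocess.call, delay=0.01):
    """Write an incrementing counter to `path` until `stop` is set."""
    n = 0
    while not stop.is_set():
        run(write_cmd(path, n))
        n += 1
        time.sleep(delay)
    return n


def stress(config_path, domain_name, iterations,
           path="/tool/stress/counter",  # hypothetical key, pick any unused one
           run=subprocess.call):
    """Create and destroy a guest repeatedly while another thread writes
    to xenstore. Returns (iteration, exit_code) for each failed create."""
    stop = threading.Event()
    writer = threading.Thread(target=hammer_xenstore, args=(stop, path, run))
    writer.start()
    failures = []
    try:
        for i in range(iterations):
            rc = run(create_cmd(config_path))
            if rc != 0:
                failures.append((i, rc))
            run(destroy_cmd(domain_name))
    finally:
        stop.set()
        writer.join()
    return failures
```

Something like `stress("/etc/xen/test.cfg", "test", 100)` would then run the loop; the `run` parameter exists only so the command runner can be swapped out for a dry run.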
 
> Caveats:
> 1. We don't have an 'xm'... instead there's a CLI called 'xe' which can do almost everything the API can do but the syntax is different to 'xm'. You'd either have to port your scripts ('xe vm-start' rather than 'xm create'?) or write some kind of wrapper.

That shouldn't be too difficult :) 

> The reason we rewrote xenstored was because we used xenstore to report periodic guest performance stats to dom0. By doing this we accidentally created a horrible scalability bottleneck where, somewhere around 30 or 40 guests, every transaction aborted and the system livelocked. The new xenstored is smart enough to realize that these separate transactions are not conflicting and can be committed together.

We also have a couple of scripts that periodically collect statistics from xenstore. We haven't seen any livelocks, but perhaps the xend crashes are caused by the same limitation. The crashes don't seem to start until we have a fair number of domUs (20 or more?) running.
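
To make the difference between the two conflict rules concrete, here is a toy model of optimistic transactions. It is not the code of either xenstored implementation, only an illustration of the behaviour Dave describes, and the key names under '/local/domain/<id>/stats' are made up. With the coarse rule, any commit that lands between a transaction's begin and its own commit forces a retry. With the per-path rule, only a commit that touched the same keys does.

```python
class ToyStore:
    """Toy key/value store with optimistic transactions (illustration only)."""

    def __init__(self, per_path):
        self.per_path = per_path   # True: conflict only on overlapping paths
        self.data = {}
        self.generation = 0        # bumped on every successful commit
        self.last_write = {}       # path -> generation of last committed write

    def begin(self):
        # A transaction remembers the generation it started from.
        return {"start": self.generation, "reads": set(), "writes": {}}

    def read(self, tx, path):
        tx["reads"].add(path)
        return tx["writes"].get(path, self.data.get(path))

    def write(self, tx, path, value):
        tx["writes"][path] = value

    def commit(self, tx):
        """Return True on success, False if the caller must retry."""
        if self.per_path:
            touched = tx["reads"] | set(tx["writes"])
            conflict = any(self.last_write.get(p, 0) > tx["start"]
                           for p in touched)
        else:
            # Coarse rule: any commit since begin() counts as a conflict.
            conflict = self.generation != tx["start"]
        if conflict:
            return False
        self.generation += 1
        for path, value in tx["writes"].items():
            self.data[path] = value
            self.last_write[path] = self.generation
        return True


def two_disjoint_writers(per_path):
    """Two overlapping transactions that write to different guests' keys."""
    store = ToyStore(per_path)
    a, b = store.begin(), store.begin()
    store.write(a, "/local/domain/1/stats", "10")  # hypothetical key
    store.write(b, "/local/domain/2/stats", "20")  # hypothetical key
    return store.commit(a), store.commit(b)


print(two_disjoint_writers(per_path=False))  # (True, False): second one aborts
print(two_disjoint_writers(per_path=True))   # (True, True): both commit
```

In this model, with many guests each reporting stats under the coarse rule, nearly every transaction overlaps some other commit and has to retry, which would be consistent with the livelock at 30 to 40 guests.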

I'd like to try to get the OCaml toolchain (xend/xenstore/xe) working with the community version of the hypervisor (preferably 3.3.2) and our custom dom0 kernel. Do you think I have any chance of succeeding? Or are they really incompatible, so that heavy patching would be needed to make it work? (In the latter case I'll just try the XenServer stack instead.)

Final question: there also seems to be an open source version of XenServer published on the Citrix site here:
http://www.citrix.com/lang/English/lp/lp_1688623.asp

Are those the same ISOs as the ones on the Xen site (at http://www.xen.org/products/cloud_source.html)?

Thanks again!

-- Dennis Krul <dweazle@xxxxxxxxx>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

