
Re: [Xen-API] XCP Storage resiliency

To: xen-api@xxxxxxxxxxxxx
From: Nathan March <nathan@xxxxxx>
Date: Tue, 18 Jun 2013 13:48:14 -0700
Delivery-date: Tue, 18 Jun 2013 20:48:36 +0000
List-id: User and development list for XCP and XAPI <xen-api.lists.xen.org>

On 6/18/2013 1:08 PM, George Shuklin wrote:

AFAIK there is a hard timeout in Linux kernels for stalled IO: 120 seconds, if I remember correctly.
And there is no way to prevent those errors without modifying the guest kernels and the dom0 kernel.

The strange thing is that this works beautifully on our xensource setup (using blkback) but fails on XCP (using tap), so it seems to be something specific to tap that causes the failure to be returned to the VM as an IO error. The VMs are identical between the two systems (including the kernel).

I tried looking into adjusting the IO timeouts on the guest, but unfortunately I don't have a timeout file in /sys/block/xvda/device/. I haven't tried tweaking things on the host side yet, though.
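For anyone wanting to repeat that check on their own guest, here is a minimal Python sketch that lists any sysfs attributes with "timeout" in the name for a block device. The function name and the choice to look in both device/ and queue/ are my own assumptions; which attributes exist (if any) depends on the kernel and the block driver.

```python
from pathlib import Path


def find_timeout_attrs(device, sysfs_root="/sys/block", subdirs=("device", "queue")):
    """Return {relative_path: value} for attributes whose name contains 'timeout'.

    The subdirectories scanned are an assumption for illustration; an empty
    result simply means no such attribute is exposed for this device.
    """
    base = Path(sysfs_root) / device
    found = {}
    for sub in subdirs:
        directory = base / sub
        if not directory.is_dir():
            continue
        for entry in sorted(directory.iterdir()):
            if "timeout" in entry.name and entry.is_file():
                try:
                    found[f"{sub}/{entry.name}"] = entry.read_text().strip()
                except OSError:
                    # Some sysfs attributes are write-only or restricted.
                    found[f"{sub}/{entry.name}"] = "<unreadable>"
    return found
```

Calling find_timeout_attrs("xvda") inside the guest returns an empty dict when no timeout attribute is exposed, which matches what I saw.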

18.06.2013 03:47, Nathan March wrote:
Hi,
I have been playing around with XCP and I'm looking to achieve high resiliency in the situation of a backend NFS SR failure. I've got an existing cluster based around open-source Xen, and I've successfully tested NFS outages of as long as 1 hour, during which IO and VMs simply hang and, upon recovery, typically come back gracefully.
Unfortunately, when I try pulling access from an XCP host, it pretty quickly (within a minute or two) starts returning I/O errors:
[871449.552331] end_request: I/O error, dev tda, sector 26384232
[871457.632417] end_request: I/O error, dev tda, sector 19276360

Likewise in the VM I'm seeing I/O errors being returned:

[ 4657.253261] end_request: I/O error, dev xvda, sector 19276432
[ 4657.253264] Buffer I/O error on device xvda4, logical block 50002
[ 4657.253268] lost page write due to I/O error on xvda4
[ 4657.253275] end_request: I/O error, dev xvda, sector 19276440
[ 4657.253283] end_request: I/O error, dev xvda, sector 19276448
I've tried mounting the NFS SR with -o hard instead of the default soft, but it seemed to have no significant impact.
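One thing worth ruling out is whether the hard option actually took effect on the mounted SR. A minimal Python sketch, assuming the kernel reports hard or soft in the options column of /proc/mounts for NFS mounts; the function name is hypothetical and this is only a sanity check, not a fix:

```python
def nfs_mount_modes(mounts_text):
    """Map each NFS mount point to 'hard', 'soft', or 'unknown'.

    Expects text in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = fields[3].split(",")
        if "soft" in options:
            mode = "soft"
        elif "hard" in options:
            mode = "hard"
        else:
            mode = "unknown"
        modes[fields[1]] = mode
    return modes


# Usage on the host (Linux only):
#   with open("/proc/mounts") as f:
#       print(nfs_mount_modes(f.read()))
```

If the SR mount point still shows soft, the option was not applied to the mount that the VMs are actually using.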
Is what I'm looking to do possible with XCP? I'd simply like the VMs to hang while waiting for IO, or optionally have the hypervisor pause them (I think VMware does this). Googling has failed me, so any useful tips would be appreciated. Thanks!
- Nathan

_______________________________________________
Xen-api mailing list
Xen-api@xxxxxxxxxxxxx
http://lists.xen.org/cgi-bin/mailman/listinfo/xen-api



--
Nathan March <nathan@xxxxxx>
Gossamer Threads Inc. http://www.gossamer-threads.com/
Tel: (604) 687-5804 Fax: (604) 687-5806


