
Re: [Xen-API] XCP Storage resiliency


  • To: xen-api@xxxxxxxxxxxxx
  • From: George Shuklin <george.shuklin@xxxxxxxxx>
  • Date: Fri, 21 Jun 2013 12:16:04 +0400
  • Delivery-date: Fri, 21 Jun 2013 08:16:19 +0000
  • List-id: User and development list for XCP and XAPI <xen-api.lists.xen.org>

I'm talking mostly not about dom0, but about the domU kernel. If IO takes more than 120 seconds, it will be processed as an 'io timeout'. And this timeout is hardcoded (there are no /sys or /proc variables for it).

If you are getting an IO timeout in less than 2 minutes, that is a different question.
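
For reference, here is a quick way to see whether a given disk exposes that knob at all. This is just a sketch, assuming a Linux guest: SCSI disks (sd*) usually have /sys/block/<dev>/device/timeout, while xen-blkfront (xvd*) devices do not, which matches the "hardcoded" behaviour above.

    #!/usr/bin/env python
    # Sketch: report whether a block device exposes a tunable IO
    # timeout in sysfs. xvd* (blkfront) devices typically don't,
    # so the stall timeout cannot be changed from userspace.
    import os
    import sys

    def report(dev):
        path = "/sys/block/%s/device/timeout" % dev
        if os.path.exists(path):
            with open(path) as f:
                print("%s: timeout is %s seconds, tunable via %s"
                      % (dev, f.read().strip(), path))
        else:
            print("%s: no timeout attribute in sysfs; the stall "
                  "timeout is baked into the kernel" % dev)

    if __name__ == "__main__":
        for dev in (sys.argv[1:] or ["xvda"]):
            report(dev)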

On 19.06.2013 00:48, Nathan March wrote:
On 6/18/2013 1:08 PM, George Shuklin wrote:
AFAIK there is a hard timeout in Linux kernels for stalled IO: 120 seconds, if I remember correctly.

And there is no way to prevent those errors without playing with guest kernels and dom0 kernel.

The strange thing is that things work beautifully on our xensource hosts (using blkback) but fail on XCP (using tap), so it seems to be something specific to tap that causes a failure to be returned to the VM as an IO error. The VMs are identical between the two systems (including kernel).

I tried looking into adjusting the IO timeouts on the guest, but unfortunately I don't have a timeout file in /sys/block/xvda/device/. I haven't tried tweaking things on the host side, though.
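
If anyone wants to confirm which backend is actually serving a guest's disks, here is a rough diagnostic sketch. The xenstore layout it walks (/local/domain/0/backend/vbd for blkback, .../tap for blktap) is an assumption based on the conventional naming, so treat it as a starting point:

    #!/usr/bin/env python
    # Rough sketch: list whether each domain's disks are served by
    # blkback ("vbd") or blktap ("tap"), by walking dom0's backend
    # directories in xenstore. Assumes the conventional layout
    # /local/domain/0/backend/{vbd,tap}/<domid>/<devid>.
    import subprocess

    def xenstore_list(path):
        # xenstore-list prints one child key per line; a missing
        # path makes it exit non-zero, treated here as "no entries".
        try:
            out = subprocess.check_output(["xenstore-list", path])
        except subprocess.CalledProcessError:
            return []
        return out.decode().split()

    for backend in ("vbd", "tap"):
        base = "/local/domain/0/backend/%s" % backend
        for domid in xenstore_list(base):
            for devid in xenstore_list("%s/%s" % (base, domid)):
                print("domain %s, device %s: served by %s"
                      % (domid, devid, backend))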


On 18.06.2013 03:47, Nathan March wrote:
Hi,

I've been playing around with XCP and I'm looking to achieve high resiliency in the event of a backend NFS SR failure. I've got an existing cluster based on open source Xen, and I've successfully tested NFS outages as long as 1 hour, during which IO/VMs simply hang and upon recovery typically come back gracefully.

Unfortunately, when I pull access from an XCP host, it pretty quickly (within a minute or two) starts returning I/O errors:

[871449.552331] end_request: I/O error, dev tda, sector 26384232
[871457.632417] end_request: I/O error, dev tda, sector 19276360

Likewise in the VM I'm seeing I/O errors being returned:

[ 4657.253261] end_request: I/O error, dev xvda, sector 19276432
[ 4657.253264] Buffer I/O error on device xvda4, logical block 50002
[ 4657.253268] lost page write due to I/O error on xvda4
[ 4657.253275] end_request: I/O error, dev xvda, sector 19276440
[ 4657.253283] end_request: I/O error, dev xvda, sector 19276448

I've tried mounting the NFS SR with -o hard instead of the default soft, but it seemed to have no significant impact.
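
One thing worth double-checking is what options the SR actually ended up mounted with, since the toolstack may override what you pass. A small sketch that prints the effective NFS options (hard/soft, timeo, retrans) from /proc/mounts:

    #!/usr/bin/env python
    # Small check: print the options each NFS mount actually has in
    # effect, so you can verify whether the SR really got hard or
    # soft and what timeo/retrans values apply.
    with open("/proc/mounts") as f:
        for line in f:
            device, mountpoint, fstype, options = line.split()[:4]
            if fstype.startswith("nfs"):
                print("%s on %s:" % (device, mountpoint))
                for opt in options.split(","):
                    if opt in ("hard", "soft") or \
                       opt.startswith(("timeo=", "retrans=")):
                        print("    %s" % opt)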

Is what I'm looking to do possible with XCP? I'd simply like the VMs to hang while waiting for IO, or optionally have the HV pause them (I think VMware does this). Googling has failed me, so any useful tips would be appreciated. Thanks!

- Nathan
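
XCP does not appear to ship a VMware-style pause-on-storage-loss feature, but something similar can be approximated from dom0. Below is a rough watchdog sketch, not anything built into XCP: the NFS_SERVER address is a placeholder, the TCP-port-2049 probe is a crude reachability test, and the pausing is done with the standard xe vm-pause / vm-unpause commands.

    #!/usr/bin/env python
    # Rough watchdog sketch (not an XCP feature): while the NFS
    # server's TCP port 2049 is unreachable, pause all running
    # guests with "xe vm-pause"; unpause them once it answers again.
    import socket
    import subprocess
    import time

    NFS_SERVER = "192.0.2.10"   # placeholder: your NFS filer

    def nfs_reachable(host, port=2049, timeout=5):
        try:
            socket.create_connection((host, port), timeout).close()
            return True
        except socket.error:
            return False

    def running_vm_uuids():
        out = subprocess.check_output(
            ["xe", "vm-list", "power-state=running",
             "is-control-domain=false", "params=uuid", "--minimal"])
        out = out.decode().strip()
        return out.split(",") if out else []

    paused = []
    while True:
        if not nfs_reachable(NFS_SERVER):
            for uuid in running_vm_uuids():
                if uuid not in paused:
                    subprocess.call(["xe", "vm-pause",
                                     "uuid=%s" % uuid])
                    paused.append(uuid)
        elif paused:
            for uuid in paused:
                subprocess.call(["xe", "vm-unpause",
                                 "uuid=%s" % uuid])
            paused = []
        time.sleep(10)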


_______________________________________________
Xen-api mailing list
Xen-api@xxxxxxxxxxxxx
http://lists.xen.org/cgi-bin/mailman/listinfo/xen-api

 

