
Re: [Xen-devel] driver domain crash and reconnect handling

On 24/01/13 14:06, George Shuklin wrote:
24.01.2013 17:25, Paul Durrant wrote:

Some notes about guest suspend during IO.

I tested it this way for a storage reboot (pause all domains, reboot the iSCSI
storage, and resume every domain). If the pause is short (less than 2 minutes),
the guests survive. If the pause is longer than 2 minutes, guests that were
waiting for IO completion detect an IO timeout after resuming and report IO
errors on their virtual block devices (PV).

To be clear here: do you mean you *paused* and then unpaused the VMs, or 
*suspended* and then resumed the VMs? I suspect you mean the former.

Pause, of course. My bad.

If you did a suspend, the frontend driver would flush outstanding disk IO operations before the suspend point is reached, so there would be nothing left to time out after resume. However, if the storage driver domain had just crashed, I guess the guest would crash at suspend. Maybe we could try saving the requests in the ring buffer and replaying them once the backend comes back (but before resuming the guest). But I'm not sure whether the guest would handle the timeouts first after the resume, or cancel them if the requests were successfully responded to.
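
To make the save-and-replay idea a bit more concrete, here is a rough sketch in Python. It is only an illustration, not Xen code: `Request`, `InflightJournal` and `replay` are hypothetical names, the backend is a stand-in callback, and it ignores the shared ring pages and grant references that a real implementation would have to deal with.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical stand-in for one block request on the ring."""
    req_id: int
    op: str       # "read" or "write"
    sector: int


class InflightJournal:
    """Remembers requests that were submitted but not yet answered."""

    def __init__(self):
        # OrderedDict keeps submission order so replay can preserve it.
        self._pending = OrderedDict()

    def submitted(self, req):
        self._pending[req.req_id] = req

    def completed(self, req_id):
        # A response arrived, so the request no longer needs replaying.
        self._pending.pop(req_id, None)

    def snapshot(self):
        """Return the outstanding requests in submission order."""
        return list(self._pending.values())


def replay(saved, submit):
    """Resubmit saved requests in order; return the ids that were answered."""
    answered = []
    for req in saved:
        if submit(req):
            answered.append(req.req_id)
    return answered


# Three requests go out; the old backend answers request 1, then crashes.
journal = InflightJournal()
for i, op in enumerate(["write", "read", "write"]):
    journal.submitted(Request(i, op, sector=8 * i))
journal.completed(1)

saved = journal.snapshot()                  # requests 0 and 2 are outstanding
replayed = replay(saved, lambda req: True)  # assumed new backend, always succeeds
```

The sketch does not settle the open question above: whether the guest would process its timeouts before it sees the replayed responses.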




