Re: [Xen-devel] driver domain crash and reconnect handling

> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-bounces@xxxxxxxxxxxxx] On Behalf Of George Shuklin
> Sent: 24 January 2013 11:46
> To: Zoltan Kiss
> Cc: 'xen-devel@xxxxxxxxxxxxx'; Dave Scott; Ian Campbell; xen-api@xxxxxxxxxxxxx
> Subject: Re: [Xen-devel] driver domain crash and reconnect handling
> >> I expect the outage due to the proto-suspend is dwarfed by the outage
> >> caused by a backend going away for however long it takes to notice,
> >> rebuild, reset the hardware, etc etc.
> > Indeed, probably the backend restoration would take at least 5
> > seconds. Compared to that, the suspend-resume and the frontend device
> > reinit is much shorter.
> > Probably in storage driver domains it's better to suspend the guest
> > immediately when the backend is gone, as the guest can easily crash if
> > the block device is inaccessible for a long time. In case of network
> > access, this isn't such a big problem.
> >
> >
> Some notes about guest suspend during I/O.
> I tested this approach for a storage reboot (pause all domains, reboot the
> iSCSI storage, and resume every domain). If the pause is short (less than
> 2 minutes), the guest can survive. If the pause is longer than 2 minutes,
> guests that were waiting for I/O completion detect an I/O timeout after
> resuming, which causes I/O errors on the virtual block devices (PV).

To be clear here: do you mean you *paused* and then unpaused the VMs, or 
*suspended* and then resumed the VMs? I suspect you mean the former.
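
For reference, here is a minimal sketch of the two operations I am contrasting, assuming the xl toolstack. The guest name and checkpoint path are hypothetical, and the helper functions only build the command lines; they do not run anything.

```python
from typing import List


def pause_cycle(domain: str) -> List[List[str]]:
    """Pause/unpause: the hypervisor stops scheduling the guest's vCPUs.

    The guest is not notified, so any I/O it has in flight stays pending.
    """
    return [["xl", "pause", domain], ["xl", "unpause", domain]]


def suspend_cycle(domain: str, checkpoint: str) -> List[List[str]]:
    """Suspend/resume via save and restore: the guest takes part in the
    suspend and its state is written to a checkpoint file.
    """
    return [["xl", "save", domain, checkpoint], ["xl", "restore", checkpoint]]


if __name__ == "__main__":
    # "guest1" and the checkpoint path are placeholders for illustration.
    for argv in pause_cycle("guest1") + suspend_cycle("guest1", "/tmp/guest1.chk"):
        print(" ".join(argv))
```

Which of the two was used matters for the timeout behaviour described above, since only in the second case does the guest know it was stopped.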

