
RE: [Xen-API] [PATCH 0 of 4] CA-38567: Make Xapi more I/O error sensitive



On Sat, 2010-03-06 at 09:12 -0500, Dave Scott wrote:
> Hi Daniel,
> 
> Thanks for sending these patches! My only question is: can you explain
> when you would expect to see this new error? Does it imply the storage
> server or the storage network has failed (e.g. some iSCSI or NFS
> softmount timeout)? Should we ask the user to go check their cabling,
> storage server logs etc?

Hi Dave.

Your guess is about right. My understanding is that you're mainly
concerned about UI translation and possible suggestions?

Right now, candidate failure points span roughly three layers:

 * blktap/vhd (metadata consistency)
 * storage interface (kernel, network transports, multipathing)
 * physical level (disk, HBA, ...)

Indeed, for now this leaves administrators with:

 1. Consult the system logs.

 2. In the most likely scenario, which is SCSI/iSCSI barfing:
    - Is your path broken? Check transport/cabling.
    - Is your disk dying? Check drive/adapter event logs.


Regarding the Api_errors.VDI_IO_ERROR extension.

The critical bit is not adding the identifier; it's that we obviously
need to feed /some/ representation of this type of failure back into
the affected thread, as ignoring it is prohibitively dangerous.

The way the system interface works, it is not going to get any more
specific than that. However, I don't see how even less detail could be
in the user's interest either.

Note that, even if the amount of available detail increases later, it's
very unlikely to ever travel back synchronously. The source would more
likely be alert messages from individual subsystems. So, for any
affected API call at hand, this status code is likely going to stay as
is.
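
To illustrate the point about feeding one coarse error back into the
affected thread, here is a minimal sketch. It is Python rather than
xapi's OCaml, and it is not the code from the patches. VdiIoError is a
hypothetical stand-in for Api_errors.VDI_IO_ERROR, and copy_and_sync
and its injectable sync parameter are made up for the example.

```python
import errno
import os


class VdiIoError(Exception):
    """Hypothetical stand-in for Api_errors.VDI_IO_ERROR."""


def copy_and_sync(src_path, dst_path, sync=os.fsync, chunk_size=1 << 20):
    """Copy src to dst with buffered writes, then force the data to disk.

    Any OSError raised while writing or syncing is re-raised as one
    coarse VdiIoError, so the caller cannot silently ignore it.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                dst.write(chunk)
            # Push Python's userspace buffer down to the kernel.
            dst.flush()
            # Ask the kernel to write back its buffers. Deferred write
            # errors (typically EIO) are reported here, not by write().
            sync(dst.fileno())
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            raise VdiIoError(f"I/O error on {dst_path}: {name}") from exc
```

The caller only ever sees the single VdiIoError status. The original
errno stays attached as the exception's cause, which is roughly the
level of detail that would otherwise end up in the system logs.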

Daniel

> Cheers,
> Dave
> 
> > -----Original Message-----
> > From: xen-api-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-api-
> > bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Daniel Stodden
> > Sent: 05 March 2010 23:21
> > To: Xen API
> > Subject: [Xen-API] [PATCH 0 of 4] CA-38567: Make Xapi more I/O error
> > sensitive
> > 
> > 
> > VDI copy/init/provision ops in xapi are all buffered, which means
> > I/O completion is deferred. The agent should carefully sync buffers
> > back to disk and test file status, or guest corruption goes
> > unnoticed.
> > 
> > Presently covers:
> >  * XVA import
> >  * Raw VDI import
> >  * VDI.copy
> >  * VM.provision (e.g. the Debian Etch post-installation script)
> > 
> > There might be more that I've missed.
> > 
> > Some of the changes below just use O_SYNC on the output fd. Most
> > assume a prior xen-api-libs patch to make Unixext.fsync provide the
> > necessary error detail. See related hg-email.
> > 
> > _______________________________________________
> > xen-api mailing list
> > xen-api@xxxxxxxxxxxxxxxxxxx
> > http://lists.xensource.com/mailman/listinfo/xen-api
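
The quoted patch description mentions opening the output fd with
O_SYNC as the alternative to an explicit fsync. A minimal sketch of
that variant follows, again in Python and not taken from the patches;
write_synchronously is a hypothetical name. With O_SYNC, each write
returns only after the data has been handed to the storage device, so
an I/O failure shows up as an OSError from the write itself.

```python
import os


def write_synchronously(path, data):
    """Write data to path through an O_SYNC file descriptor."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop
        # until the whole buffer has been consumed.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
    finally:
        os.close(fd)
```

The trade-off is throughput: every write waits for the device, whereas
buffered writes followed by one fsync pay that cost once at the end.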


