[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 5/5] make: Make "src-tarball" target actually make a source tarball

George Dunlap writes ("Re: [PATCH 5/5] make: Make "src-tarball" target actually 
make a source tarball"):
> On 07/16/2014 04:06 PM, Ian Jackson wrote:
> > Why not use a fixed filename ?  This script isn't safe for concurrent
> > invocation anyway, because the output filename is (mostly) fixed.

Basically, I'm advocating that this script should be crash-only.[1]

> I'm not so much worried about concurrent invocation per se, but about 
> dealing with stuff possibly left over from a previous run.  If we use a 
> fixed path, we have to always rm -rf the path before starting, which 
> seems unnecessarily risky to me.  (Or we could not rm -rf the path, 
> which is even worse from a consistency point of view.)  Creating a new 
> temporary directory just seems like the cleanest, safest thing to do.  
> It's not like it's that expensive.

The temporary directory should have a name with "tmp" in it somewhere.
Running rm -rf on such a thing is not risky at all.

> > If you do that you can do away with the trap handler entirely.  You
> > should probably make git_archive_into delete the destination
> > directory.
> I'm not sure what the fixed filename has to do with cleanup -- I would 
> want to remove the temporary directory on a failure regardless.

Why would you want to do that ?  Why not leave it for "make clean" or
the next run ?  Obviously it would be in .gitignore.

Build systems routinely leave temporary files and directories lying
around (on failure, or otherwise).  This is at the very least normal
and harmless.  Much of the time it is helpful because it can make it
easier to debug the build system.

> (Re concurrent invocation: The output filename may collide, but at least 
> you'll get a consistent snapshot of that one particular tree. What I'd 
> be more worried about is racing with make subdir-force-update and 
> getting inconsisent subtree archives.)

My point was simply that because the script cannot safely be run
concurrently with the same output filename, we can disregard worries
about concurrent execution as a reason for using a varying filename.
It is sufficient use a temporary filename derived from the output

[1] https://en.wikipedia.org/wiki/Crash-only_software

This is a generally good software engineering principle which we
should conform to as much as possible.  The main argument in favour
goes like this:

In general it is not possible to reliably execute cleanup code at the
time of failure.  On a Unix system processes (or whole systems) may
just die.  For example due to being eaten by the OOM killer, or being
sent untrappable signals by administrators or users trying to recover
from failures (e.g. wedged network storage).  They may also lose
access to the filesystem in ways that mimic the effect of
instantaneous death.

That means that there must be mechanisms which can recover from
abnormal (dirty) termination.  Any missed cleanup must be doable - and
done - later.  So if the system is properly designed and implemented,
any missed cleanup will be sorted out by other abnormal state recovery
mechanisms (either as part of the next invocation, or in a
housekeeping task which runs regardless).

Given that such recovery mechanisms must exist and work, attempts to
do a "tidy crash" are unnecessary.  It is much simpler to simply omit
any code which attempts to do cleanup as a response to failure, and
leave recovery to the main single recovery path.

This is also much more reliable: exercising recovery becomes a routine
part of startup (or housekeeping) and is done after both routine
failures and successful executions.  It is therefore well tested.

Contrast "tidy crash" mechanisms which are complicated, difficult to
test, and rarely tested at all let alone fed the complete panoply of
different crash cases - and are then combined with hardly-ever-tested
(or even defective-by-design) recovery arrangements.


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.