[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] tools/libxl: Fixes to stream v2 task joining logic



On 24/07/15 12:41, Ian Jackson wrote:
> Andrew Cooper writes ("[PATCH] tools/libxl: Fixes to stream v2 task joining 
> logic"):
>> During review of the libxl migration v2 series, I changes the task
>> joining logic, but clearly didn't think the result through
>> properly. This patch fixes several errors.
> This would have been much easier to review if it had been split into 3
> patches.  I have gone mostly by the commit message because it was hard
> to see hunk belonged to what.

I have split them up. 

>
>> 3) Avoid stacking of check_all_finished() via synchronous teardown of
>> tasks.  If the _abort() functions call back synchronously,
>> stream->completion_callback() ends up getting called twice, as first and
>> last check_all_finished() frames observe each task being finished.
> I think this part of the patch is fine.
>
>
>> 1) Do not call check_all_finished() in the success cases of
>> libxl__xc_domain_{save,restore}_done().  It serves no specific purpose
>> as the save helper state will report itself as inactive by this point,
>> and avoids triggering a second stream->completion_callback() in the case
>> that write_toolstack_record()/stream_continue() record errors
>> synchronously themselves.
> "Serves no specific purpose" other than having a single exit path,
> which makes matters much less confusing.

"Serves no specific purpose" in so far as what check_all_finished()
would do in the success case.

>
> I think the problem may be that libxl__xc_domain_{save,restore}_done
> fail to "return" after "write_toolstack_record" and "stream_continue".
> That seems like simply a bug.  I'm sorry that I didn't notice it in
> review.

After some more thought, I don't believe that my fix is necessarily
correct.  If a condition were to exist where the stream had recorded an
error and abort()'ed the save helper, but the save helper was already
exiting with a success condition, then the callback wouldn't be fired at
all.  I have a proposed alternate solution.

>
> In general each callback function should set up exactly one other
> callback.  If it does anything else then reentrancy hazards arise.

The entire point of this logic is that there are multiple operations
going on in parallel.  It is not guaranteed that a save helper will ever
be spawned on the read side (although this would be a very useless
stream).  The state of the libxl stream read/write object is
deliberately separate from the save helper.

>
> Also, it is confusing and perhaps wrong that write_toolstack_record
> calls stream_complete.  What if there are other threads of control
> outstanding ?

This is exactly the problem which check_all_finished() is supposed to
solve, but currently doesn't.

>
>
>> 2) Only ever set stream->rc in stream_complete().  The first version of
>> the migration v2 series had separate rc and joined_rc parameters, where
>> this logic worked.  However when combining the two, the teardown path
>> fails to trigger if stream_done() records stream->rc itself.  A side
>> effect of this is that stream_done() needs to take an rc parameter.
> "the teardown path fails to trigger if stream_done() records
> stream->rc itself" but in the code I am looking at neither of the
> functions stream_done assign to rc.

I have no idea why I wrote what I did.  The code was correct but the
description was wrong.  I have fixed it in the split version of this patch.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.