Xen project Mailing List

Re: [Xen-devel] [PATCH 6/9] libxl: Asynchronous/long-running operation infrastructure

To: Ian Campbell <Ian.Campbell@xxxxxxxxxx>

From: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>

Date: Tue, 24 Jan 2012 17:27:57 +0000

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>

Delivery-date: Tue, 24 Jan 2012 17:28:29 +0000

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Thanks for the thorough review. Ian Campbell writes ("Re: [Xen-devel] [PATCH 6/9] libxl: Asynchronous/long-running operation infrastructure"): > You've done device removal, do you have a list of other things which > should use this? (perhaps with an associated list of people's names...) No, I don't have such a list, but offhand: domain creation (bootloader, etc.) qmp device attach domain destruction bootloader vncviewer exec (or vncviewer parameter fetching - api should change) (these may overlap). > > + * If ao_how==NULL, the function will be synchronous. > > + * > > + * If ao_how!=NULL, the function will set the operation going, and > > + * if this is successful will return ERROR_ASYNCH_INPROGRESS. > > There's an extra H here compared with the actual symbol name (I think > the symbol is right). Fixed everywhere. > Is there a possibility that libxl might decide that the operation isn't > actually going to take all the long and do things synchronously, > returning normal success (e.g. 0)? Is that the reason for the separate > return code for this "I did what you asked me" case? No, even if the operation completes quickly, the same exit path is taken. So if you ask for a callback or an event, you get that even if the callback or event happened before your initiating function returns. As I say: * Note that it is possible for an asynchronous operation which is to * result in a callback to complete during its initiating function * call. In this case the initating function will return * ERROR_ASYNC_INPROGRESS, even though by the time it returns the * operation is complete and the callback has already happened. If ao_how is non-NULL then these functions cannot return 0. If it is NULL they cannot return ASYNC_INPROGRESS. I chose to use a new exit status because it seemed safer but that's a matter of taste and if you prefer I could return 0 for that case. > Can we drop the ERROR_ prefix? I know that's inconsistent with the other > return codes but those actually are errors. I guess we could but isn't this going to become a proper IDL enum at some point ? > > + * > > + * If ao_how->callback!=NULL, the callback will be called when the > > + * operation completes. The same rules as for libxl_event_hooks > > + * apply, including the reentrancy rules and the possibility of > > ^ (see above/below) -- depending on how these comments end up > relative to each other. It's actually in a different file. I'll add a cross-reference. > > + * "disaster", except that libxl calls ao_how->callback instead of > > + * libxl_event_hooks.event_occurs. > > + * > > + * If ao_how->callback==NULL, a libxl_event will be generated which > > + * can be obtained from libxl_event_wait or libxl_event_check. > > Or be delivered via event_occurs? Yes, in principle, although that would be a silly thing to ask for. Why would you want your ao completions delivered via some central callback function just so that you could split them up again ? > > + * call. In this case the initating function will return > > initiating Fixed. > > +typedef struct { > > + void (*callback)(libxl_ctx *ctx, int rc, void *for_callback); > > + union { > > + libxl_ev_user for_event; /* used if callback==NULL */ > > + void *for_callback; /* passed to callback */ > > Why void * for one bit of "closure" but an explicit uint64_t for the > other. I nearly commented on the use of uint64_t previously -- void *, > or perhaps (u)intptr_t is more normal. The context value in an event needs to be marshallable to a foreign language or a foreign process. So it can't be a pointer. Or at least, it has to be a type which can contain integers as well as pointers, and an integer type is better for that. The context value for a callback function can be a void* because you don't ever need to "marshal" the "callback", or if you do you can wrap up the context appropriately. > > + * Completion after initiator return (asynch. only): ... > > + * * initiator calls ao_complete: | > > + * - observes event not net done, | > > + * - returns to caller | > > + * | > > + * - ao completes on some thread > > + * - generate the event or call the callback > > + * - destroy the ao > > Where does ao_inprogress fit into these diagrams? There's a mistake in the diagrams: where it says on the left "initiator calls ao_complete" it should read "... ao_inprogress", and likewise for "ao_complete takes the lock". I will fix this. > > + */ > > + > > +void libxl__ao__destroy(libxl_ctx *ctx, libxl__ao *ao) { > > CODING_STYLE wants these braces on the next line (a bunch more follow) Oh yes, will fix. > > +void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc) { > > + assert(ao->magic == LIBXL__AO_MAGIC); > > + assert(!ao->complete); > > + ao->complete = 1; > > + ao->rc = rc; > > + > > + if (ao->poller) { > > + assert(ao->in_initiator); > > + libxl__poller_wakeup(egc, ao->poller); > > + } else if (ao->how.callback) { > > + LIBXL_TAILQ_INSERT_TAIL(&egc->aos_for_callback, ao, > > entry_for_callback); > > + } else { > > + libxl_event *ev; > > + ev = NEW_EVENT(egc, OPERATION_COMPLETE, ao->domid); > > + if (ev) { > > + ev->for_user = ao->how.u.for_event; > > + ev->u.operation_complete.rc = ao->rc; > > + libxl__event_occurred(egc, ev); > > + } > > + ao->notified = 1; > > + } > > + if (!ao->in_initiator && ao->notified) > > + libxl__ao__destroy(libxl__gc_owner(&egc->gc), ao); > > You added a helper for this libxl__gc_owner(&egc..) construct. You mean EGC_GC and CTX. I don't think that's a good idea here because it obscures exactly what's going on. In particular, there are two gcs here - the ao's and the egc's - and one of them may be about to evaporate. > > + ao = calloc(sizeof(*ao),1); > > calloc is actually (nmemb, size). I'm sure it doesn't really matter > though. I'll fix it though. > > + rc = eventloop_iteration(&egc,ao->poller); > > + if (rc) { > > + /* Oh dear, this is quite unfortunate. */ > > + LIBXL__LOG(CTX, LIBXL__LOG_ERROR, "Error waiting for" > > + " event during long-running operation (rc=%d)", > > rc); > > + sleep(1); > > + /* It's either this or return ERROR_I_DONT_KNOW_WHETHER > > + * _THE_THING_YOU_ASKED_FOR_WILL_BE_DONE_LATER_WHEN > > + * _YOU_DIDNT_EXPECT_IT, since we don't have any kind of > > + * cancellation ability. */ > > Does this constitute a "disaster" (in the special hook sense)? No, disaster just lets us say that some events may be lost and an ao completion might not be an event. disaster doesn't let us randomly store up ongoing activity and have it happen when not expected. For example, a caller asking a synchronous operation does not expect to get an error code and then have the operation continue in the background anyway. > > + * Usually callback function should use GET_CONTAINING_STRUCT > > Now called CONTAINER_OF Fixed. Sometimes I wish the compiler could look into comments and spot when I've done this kind of thing. > > + * to obtain its own structure, containing a pointer to the ao, > > + * and then use the gc from that ao. > > + */ > > + > > +#define AO_CREATE(ctx, domid, ao_how) \ > > + libxl__ao *ao = libxl__ao_create(ctx, domid, ao_how); \ > > + if (!ao) return ERROR_NOMEM; \ > > + AO_GC; \ > > + CTX_LOCK; > > Where does the unlock which balances this come from? The only unlock I > see in this patch is the temporary drop in libxl__ao_inprogress which is > matched by another lock. The AO_INPROGRESS macro is supposed to unlock it, but locks it again by mistake. Well spotted. > > +#define AO_INPROGRESS do{ \ > > + libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc); \ > > + int ao__rc = libxl__ao_inprogress(ao); \ > > + libxl__ctx_lock(ao__ctx); /* gc is now invalid */ \ > > Is this supposed to be unlock answering my question above? Likewise in > ABORT? Yes. Indeed the comment above agrees: * - If initiation is successful, the initiating function needs * to run libxl__ao_inprogress right before unlocking and * returning, and return whatever it returns (AO_INPROGRESS macro). * * - If the initiation is unsuccessful, the initiating function must * call libxl__ao_abort before unlocking and returning whatever * error code is appropriate (AO_ABORT macro). > > + return ao__rc; \ > > + }while(0) > > Can we arrange for AO_INPROGRESS and AO_ABORT to return the rc? So it > would become > return AO_INPROGRESS; That would be possible. I wasn't sure whether to do it like that. Note that AO_CREATE already might return; doing it the way I have it now seems more symmetrical. But perhaps it would make things clearer to have the return outside the macro. > Is the ({stuff,stuff,stuff,val}) syntax a gcc-ism? Yes. But I don't think that should stop us if we prefer it. > > +/* All of these MUST be called with the ctx locked. > > Except libxl__ao_create? at least according to the implementation of > AO_CREATE which takes the lock after. libxl_ao__create calls libxl__poller_get which needs to be called with the lock held. Well spotted. Thanks, Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.