[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 6/9] libxl: Asynchronous/long-running operation infrastructure
On Tue, 2012-01-24 at 17:27 +0000, Ian Jackson wrote: > Thanks for the thorough review. > > Ian Campbell writes ("Re: [Xen-devel] [PATCH 6/9] libxl: > Asynchronous/long-running operation infrastructure"): > > You've done device removal, do you have a list of other things which > > should use this? (perhaps with an associated list of people's names...) > > No, I don't have such a list, but offhand: > domain creation (bootloader, etc.) > qmp > device attach > domain destruction > bootloader > vncviewer exec (or vncviewer parameter fetching - api should change) > (these may overlap). Thanks. > > Is there a possibility that libxl might decide that the operation isn't > > actually going to take all the long and do things synchronously, > > returning normal success (e.g. 0)? Is that the reason for the separate > > return code for this "I did what you asked me" case? > > No, even if the operation completes quickly, the same exit path is > taken. So if you ask for a callback or an event, you get that even if > the callback or event happened before your initiating function > returns. As I say: > > * Note that it is possible for an asynchronous operation which is to > * result in a callback to complete during its initiating function > * call. In this case the initating function will return > * ERROR_ASYNC_INPROGRESS, even though by the time it returns the > * operation is complete and the callback has already happened. Thanks, I think I simply hadn't got to that comment when I wrote the question. If this is a direct cut-n-paste then presumably there is a patch somewhere where "initiating" is spelled "initating" as above. Hah, I've just spotted where I pointed it out last time and you corrected it below... At least I'm consistent. > If ao_how is non-NULL then these functions cannot return 0. > If it is NULL they cannot return ASYNC_INPROGRESS. > > I chose to use a new exit status because it seemed safer but that's a > matter of taste and if you prefer I could return 0 for that case. I'm undecided (plus it seems a bit like bikeshedding). I certainly prefer either 0 or {LIBXL_}ASYNC_IN_PROGRESS to ERROR_ASYNC_IN_PROGRESS though. > > Can we drop the ERROR_ prefix? I know that's inconsistent with the other > > return codes but those actually are errors. > > I guess we could but isn't this going to become a proper IDL enum at > some point ? At which point it would become LIBXL_ERROR_{FOOS} and LIBXL_ASYNC_IN_PROGRESS? [...] > > > + * "disaster", except that libxl calls ao_how->callback instead of > > > + * libxl_event_hooks.event_occurs. > > > + * > > > + * If ao_how->callback==NULL, a libxl_event will be generated which > > > + * can be obtained from libxl_event_wait or libxl_event_check. > > > > Or be delivered via event_occurs? > > Yes, in principle, although that would be a silly thing to ask for. > Why would you want your ao completions delivered via some central > callback function just so that you could split them up again ? True. > > > + * call. In this case the initating function will return > > > > initiating > > Fixed. > > > > > +typedef struct { > > > + void (*callback)(libxl_ctx *ctx, int rc, void *for_callback); > > > + union { > > > + libxl_ev_user for_event; /* used if callback==NULL */ > > > + void *for_callback; /* passed to callback */ > > > > Why void * for one bit of "closure" but an explicit uint64_t for the > > other. I nearly commented on the use of uint64_t previously -- void *, > > or perhaps (u)intptr_t is more normal. > > The context value in an event needs to be marshallable to a foreign > language or a foreign process. So it can't be a pointer. Or at > least, it has to be a type which can contain integers as well as > pointers, and an integer type is better for that. Fair enough. > > The context value for a callback function can be a void* because you > don't ever need to "marshal" the "callback", or if you do you can wrap > up the context appropriately. > > > > + * Completion after initiator return (asynch. only): > ... > > > + * * initiator calls ao_complete: | > > > + * - observes event not net done, | > > > + * - returns to caller | > > > + * | > > > + * - ao completes on some thread > > > + * - generate the event or call the callback > > > + * - destroy the ao > > > > Where does ao_inprogress fit into these diagrams? > > There's a mistake in the diagrams: where it says on the left > "initiator calls ao_complete" it should read "... ao_inprogress", and > likewise for "ao_complete takes the lock". I will fix this. Thanks. While rereading with that substitution (and finding that it made sense) I noticed: + * * initiator calls ao_complete: | + * - observes event not net done, | You want s/net/yet/. > > > +void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc) { > > > + assert(ao->magic == LIBXL__AO_MAGIC); > > > + assert(!ao->complete); > > > + ao->complete = 1; > > > + ao->rc = rc; > > > + > > > + if (ao->poller) { > > > + assert(ao->in_initiator); > > > + libxl__poller_wakeup(egc, ao->poller); > > > + } else if (ao->how.callback) { > > > + LIBXL_TAILQ_INSERT_TAIL(&egc->aos_for_callback, ao, > > > entry_for_callback); > > > + } else { > > > + libxl_event *ev; > > > + ev = NEW_EVENT(egc, OPERATION_COMPLETE, ao->domid); > > > + if (ev) { > > > + ev->for_user = ao->how.u.for_event; > > > + ev->u.operation_complete.rc = ao->rc; > > > + libxl__event_occurred(egc, ev); > > > + } > > > + ao->notified = 1; > > > + } > > > + if (!ao->in_initiator && ao->notified) > > > + libxl__ao__destroy(libxl__gc_owner(&egc->gc), ao); > > > > You added a helper for this libxl__gc_owner(&egc..) construct. > > You mean EGC_GC and CTX. I don't think that's a good idea here > because it obscures exactly what's going on. In particular, there are > two gcs here - the ao's and the egc's - and one of them may be about > to evaporate. OK. > > > + rc = eventloop_iteration(&egc,ao->poller); > > > + if (rc) { > > > + /* Oh dear, this is quite unfortunate. */ > > > + LIBXL__LOG(CTX, LIBXL__LOG_ERROR, "Error waiting for" > > > + " event during long-running operation > > > (rc=%d)", rc); > > > + sleep(1); > > > + /* It's either this or return ERROR_I_DONT_KNOW_WHETHER > > > + * _THE_THING_YOU_ASKED_FOR_WILL_BE_DONE_LATER_WHEN > > > + * _YOU_DIDNT_EXPECT_IT, since we don't have any kind of > > > + * cancellation ability. */ > > > > Does this constitute a "disaster" (in the special hook sense)? > > No, disaster just lets us say that some events may be lost and an ao > completion might not be an event. disaster doesn't let us randomly > store up ongoing activity and have it happen when not expected. > For example, a caller asking a synchronous operation does not expect > to get an error code and then have the operation continue in the > background anyway. OK. > > > + * Usually callback function should use GET_CONTAINING_STRUCT > > > > Now called CONTAINER_OF > > Fixed. Sometimes I wish the compiler could look into comments and > spot when I've done this kind of thing. Or read the comments and DWIM so we just need to write the comments ;-) > > > + * to obtain its own structure, containing a pointer to the ao, > > > + * and then use the gc from that ao. > > > + */ > > > + > > > +#define AO_CREATE(ctx, domid, ao_how) \ > > > + libxl__ao *ao = libxl__ao_create(ctx, domid, ao_how); \ > > > + if (!ao) return ERROR_NOMEM; \ > > > + AO_GC; \ > > > + CTX_LOCK; > > > > Where does the unlock which balances this come from? The only unlock I > > see in this patch is the temporary drop in libxl__ao_inprogress which is > > matched by another lock. > > The AO_INPROGRESS macro is supposed to unlock it, but locks it again > by mistake. Well spotted. > > > > +#define AO_INPROGRESS do{ \ > > > + libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc); \ > > > + int ao__rc = libxl__ao_inprogress(ao); \ > > > + libxl__ctx_lock(ao__ctx); /* gc is now invalid */ \ > > > > Is this supposed to be unlock answering my question above? Likewise in > > ABORT? > > Yes. Indeed the comment above agrees: > > * - If initiation is successful, the initiating function needs > * to run libxl__ao_inprogress right before unlocking and > * returning, and return whatever it returns (AO_INPROGRESS macro). > * > * - If the initiation is unsuccessful, the initiating function must > * call libxl__ao_abort before unlocking and returning whatever > * error code is appropriate (AO_ABORT macro). Right. I think this highlights my concern about some code paths not being run by the in-xl users... > > > + return ao__rc; \ > > > + }while(0) > > > > Can we arrange for AO_INPROGRESS and AO_ABORT to return the rc? So it > > would become > > return AO_INPROGRESS; > > That would be possible. I wasn't sure whether to do it like that. > Note that AO_CREATE already might return; doing it the way I have it > now seems more symmetrical. > > But perhaps it would make things clearer to have the return outside > the macro. I thought there was a general preference for that sort of thing, but I suppose it depends on the required macro contortions to make it happen. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |