[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v4] xenstore: extend the xenstore ring with a 'closing' signal
On 22 Sep 2014, at 17:38, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> wrote: > On 3 Sep 2014, at 17:25, David Scott <dave.scott@xxxxxxxxxx> wrote: >> Hvmloader uses the xenstore ring and then tries to reset it back >> to its initial state before booting the guest. Occasionally xenstored >> will read the ring while it is being zeroed and conclude it has >> been corrupted. This prevents PV drivers from loading in the guest. > > This protocol doc is almost perfect - see below. I see that Ian C has > reviewed the C implementation. Have you found anyone to provide a 2nd > opinion on the ocaml ? I’ve cc:d Jon Ludlam. Jon: do you have time to review OCaml part of the patch? > >> --- /dev/null >> +++ b/docs/misc/xenstore-ring.txt >> +The xenstore server may decide to advertise some features via the >> +"Server feature bitmap". The server can start advertising features >> +at any time by setting bits but it will never stop advertising features >> +i.e. bits will never be cleared. The guest is not permitted to write to >> +the server feature bitmap. The server features are offered to the guest; >> +it is up to the guest whether to use them or not. The guest should ignore >> +any unknown feature bits. > > The rule that the server may start advertising features at any time is > reasonable but it needs to be accompanied by a promise, for any > specific feature, of the latest time at which the server will normally > advertise it. Otherwise the client won't know when to check it and > whether it can sensibly cache the result. > >> +Mask Description >> +----------------------------------------------------------------- >> +1 Ring reconnection (see the ring reconnection feature below) >> + >> +The "Connection state" field is used to request a ring close and reconnect. >> +The "Connection state" field only contains valid data if the server has >> +advertised the ring reconnection feature. If the feature has been advertised >> +then the "Connection state" may take the following values: > > For example, it would be useful to say here that this feature is > normally advertised from the start of the guest's operation. OK. > >> +The ring reconnection feature >> +============================= >> + >> +The ring reconnection feature allows the guest to ask the server to >> +reset the ring to a valid initial state i.e. where the Input and Output >> +queues contain no data. The feature operates at the ring-level only; >> +it does not affect the higher-level protocol state machine at all. >> +To reset theh higher-level protocol, please read about the 'RESET_WATCHES' >> +command in the [xenstore protocol](xenstore.txt) specification. >> + >> +The ring reconnection feature is only available if the 'Ring reconnection' >> +feature bit has been set by the server in the "Server feature bitmap". >> + >> +Assuming the server has advertised the feature, the guest can initiate >> +a reconnection by setting the the Connection state to 1 ("Ring close >> +and reconnect is in progress") and signalling the event channel. >> +The guest must now ignore all fields except the Connection state and >> +wait for it to be set to 0 ("Ring is connected") >> + >> +The server will guarantee to >> + >> + - drop any partially read or written higher-level >> + [xenstore protocol](xenstore.txt) packets it may have; >> + - empty the Input and Output queues in the xenstore ring; >> + - set the Connection state to 0 ("Ring is connected"); and >> + - signal the event channel. > > I think the server needs to guarantee that there are no outstanding > commands, somehow. As you write this, the server might have a command > being processed which is no longer in the Input queue but whose > response has not yet arrived in the Output queue. > > You might say that the client can use the response to RESET_WATCHES to > distinguish replies to old commands (before the RESET_WATCHES reply) > from replies to new ones. But (a) the server does not guarantee to > process messages in order and (b) the client might get unlucky and use > the same request id for its RESET_WATCHES as one of the `old' > commands. I see what you mean. For a ‘clean’ reconnect like hvmloader we can be sure that there are no outstanding requests because hvmloader has already received all the replies. For ‘unclean’ reconnects like a crash-kernel start there is definitely a problem. Since the request ids are local to the connection, and the connection has just closed and re-opened, I think it makes sense for the server to drop all old request ids. The client would then be able to safely use any request id for the RESET_WATCHES without fearing a old reply coming back. It would be even nicer from a client perspective (although I’m not sure how this would affect the diff) if the ring connection re-open performed a RESET_WATCHES automatically, since then it would act the same as closing and re-opening your Unix domain socket. Cheers, Dave > I assume (without checking) that there is a suitable guarantee which > could be written down and which is actually implemented by both > existing xenstored implementations... > > Thanks, > Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |