[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5] xenstore: extend the xenstore ring with a 'closing' signal



Handling the exception higher up looks a fair bit nicer, and my previous
comment has been addressed, so it looks fine to me.

Reviewed-by: Jon Ludlam <jonathan.ludlam@xxxxxxxxxx>


On 25/09/14 15:58, David Scott wrote:
> Hvmloader uses the xenstore ring and then tries to reset it back
> to its initial state before booting the guest. Occasionally xenstored
> will read the ring while it is being zeroed and conclude it has
> been corrupted. This prevents PV drivers from loading in the guest.
>
> This patch updates the xenstore ring protocol definition, enabling
> a server to advertise additional features to the guest. One such feature
> is defined: the ability to cleanly reset the ring including the
> higher-level protocol, like an enhanced RESET_WATCHES for rings.
>
> This patch implements the ring reconnection features in oxenstored
> and hvmloader, fixing the bug.
>
> This patch also defines an 'invalid' xenstore packet type and uses this
> to poison the ring over a reconnect. This will make diagnosing this
> bug much easier in future.
>
> Signed-off-by: David Scott <dave.scott@xxxxxxxxxx>
>
> ---
>
> Updates since v4
> * reconnect now clears all ring state (including requests, watches and
>   transactions) -- this is the API clients will want
> * ocaml: hide the encoding of server features behind a type (thanks to
>   Jon for mentioning this)
> * docs: promise to advertise the feature before touching the queues.
>   Now if the client has received a single reply, it will know that it's
>   safe to check for the feature.
>
> Updates since v3
> * hvmloader: signal the event channel after requesting a reconnect
>   (thanks to Zheng Li for diagnosing this)
> * switch to using feature flags/bits instead of version numbers
> * rename the 'closing' field to 'connection state'
> * define an invalid packet type (0xffff, since 0x0 was already taken)
> * on ring connection/reset, fill the input and output buffers with
>   the invalid packet
> * ocaml xs_ring_stubs.c: collapse both {read,write} functions into one,
>   remove #define ERROR_FOO
> * doc: write from the guest's point of view, weaken guarantees made by
>   server (e.g. producer = consumer != 0 is valid after reconnect)
> * doc: relate reconnection to the RESET_WATCHES command in the higher
>   level protocol
> * doc: be more specific about who must write what, when
>
> Updates since v2
> * hvmloader: use volatile for read of closing flag
> * style improvements
> * remove xenstore version #defines
>
> Updates since v1
> * remove unused client version and associated dead code
> * treat 'closing' as a flag by using "!=0" rather than "==1"
> * hvmloader: move function up and remove forward decl
> * document the existing xenstore ring and the extention in misc/
> ---
>  docs/misc/xenstore-ring.txt         |  116 
> +++++++++++++++++++++++++++++++++++
>  tools/firmware/hvmloader/xenbus.c   |   48 +++++++++------
>  tools/ocaml/libs/xb/xb.ml           |   21 +++++++
>  tools/ocaml/libs/xb/xb.mli          |    4 +-
>  tools/ocaml/libs/xb/xs_ring.ml      |   28 +++++++++
>  tools/ocaml/libs/xb/xs_ring_stubs.c |  109 +++++++++++++++++++++++---------
>  tools/ocaml/xenstored/connection.ml |   16 ++++-
>  tools/ocaml/xenstored/process.ml    |   14 ++++-
>  xen/include/public/io/xs_wire.h     |   13 +++-
>  9 files changed, 316 insertions(+), 53 deletions(-)
>  create mode 100644 docs/misc/xenstore-ring.txt
>
> diff --git a/docs/misc/xenstore-ring.txt b/docs/misc/xenstore-ring.txt
> new file mode 100644
> index 0000000..16b4d0f
> --- /dev/null
> +++ b/docs/misc/xenstore-ring.txt
> @@ -0,0 +1,116 @@
> +The xenstore ring is a datastructure stored within a single 4KiB page
> +shared between the xenstore server and the guest. The ring contains
> +two queues of bytes -- one in each direction -- and some signalling
> +information. The [xenstore protocol](xenstore.txt) is layered on top of
> +the byte streams.
> +
> +The xenstore ring datastructure
> +===============================
> +
> +The following table describes the ring structure where
> +  - offsets and lengths are in bytes;
> +  - "Input" is used to describe the data sent to the server; and
> +  - "Output" is used to describe the data sent to the domain.
> +
> +Offset  Length  Description
> +-----------------------------------------------------------------
> +0       1024    Input data
> +1024    1024    Output data
> +2048    4       Input consumer offset
> +2052    4       Input producer offset
> +2056    4       Output consumer offset
> +2060    4       Output producer offset
> +2064    4       Server feature bitmap
> +2068    4       Connection state
> +
> +The Input data and Output data are circular buffers. Each buffer is
> +associated with a pair of free-running offsets labelled "consumer" and
> +"producer".
> +
> +A "producer" offset is the offset in the byte stream of the next byte
> +to be written modulo 2^32. A "consumer" offset is the offset in the byte
> +stream of the next byte to be read modulo 2^32. Implementations must
> +take care to handle wraparound properly when performing arithmetic with
> +these values.
> +
> +The byte at offset 'x' in the byte stream will be stored at offset
> +'x modulo 1024' in the circular buffer.
> +
> +Implementations may only overwrite previously-written data if it has
> +been marked as 'consumed' by the relevant consumer pointer.
> +
> +When the guest domain is created, there is no outstanding Input or Output
> +data. However
> +
> +  - guests must not assume that producer or consumer pointers start
> +    at zero; and
> +  - guests must not assume that unused bytes in either the Input or
> +    Output data buffers has any particular value.
> +
> +A xenstore ring is always associated with an event channel. Whenever the
> +ring structure is updated the event channel must be signalled. The
> +guest and server are free to inspect the contents of the ring at any
> +time, not only in response to an event channel event. This implies that
> +updates must be ordered carefully to ensure consistency.
> +
> +The xenstore server may decide to advertise some features via the
> +"Server feature bitmap". The server can start advertising features
> +at any time by setting bits but it will never stop advertising features
> +i.e. bits will never be cleared. The guest is not permitted to write to
> +the server feature bitmap. The server features are offered to the guest;
> +it is up to the guest whether to use them or not. The guest should ignore
> +any unknown feature bits.
> +
> +The following features are defined:
> +
> +Mask    Description
> +-----------------------------------------------------------------
> +1       Ring reconnection (see the ring reconnection feature below)
> +
> +The "Connection state" field is used to request a ring close and reconnect.
> +The "Connection state" field only contains valid data if the server has
> +advertised the ring reconnection feature. If the feature has been advertised
> +then the "Connection state" may take the following values:
> +
> +Value   Description
> +-----------------------------------------------------------------
> +0       Ring is connected
> +1       Ring close and reconnect is in progress (see the "ring
> +        reconnection feature" described below)
> +
> +The ring reconnection feature
> +=============================
> +
> +The ring reconnection feature allows the guest to ask the server to
> +reset the ring to a valid initial state i.e. one in which the Input
> +and Output queues contain no data and there are no outstanding requests,
> +watches or transactions.
> +
> +The ring reconnection feature is only available if the 'Ring reconnection'
> +feature bit has been set by the server in the "Server feature bitmap".
> +If a server supports ring reconnection, it will guarantee to advertise
> +the feature before producing or consuming any data from the Input or Output
> +queues.
> +
> +Assuming the server has advertised the feature, the guest can initiate
> +a reconnection by setting the the Connection state to 1 ("Ring close
> +and reconnect is in progress") and signalling the event channel.
> +The guest must now ignore all fields except the Connection state and
> +wait for it to be set to 0 ("Ring is connected")
> +
> +The server will guarantee to
> +
> +  - drop any partially read or written higher-level
> +    [xenstore protocol](xenstore.txt) packets it may have;
> +  - empty the Input and Output queues in the xenstore ring;
> +  - discard any in-flight requests
> +  - discard any watches associated with the connection
> +  - discard any transactions associated with the connection
> +  - set the Connection state to 0 ("Ring is connected"); and
> +  - signal the event channel.
> +
> +From the point of view of the guest, the connection has been reset on a
> +packet boundary.
> +
> +Note that only the guest may set the Connection state to 1 and only the
> +server may set it back to 0.
> diff --git a/tools/firmware/hvmloader/xenbus.c 
> b/tools/firmware/hvmloader/xenbus.c
> index 64c2176..f900a1e 100644
> --- a/tools/firmware/hvmloader/xenbus.c
> +++ b/tools/firmware/hvmloader/xenbus.c
> @@ -37,6 +37,19 @@ static struct xenstore_domain_interface *rings; /* Shared 
> ring with dom0 */
>  static evtchn_port_t event;                     /* Event-channel to dom0 */
>  static char payload[XENSTORE_PAYLOAD_MAX + 1];  /* Unmarshalling area */
>  
> +static void ring_wait(void)
> +{
> +    struct shared_info *shinfo = get_shared_info();
> +    struct sched_poll poll;
> +
> +    memset(&poll, 0, sizeof(poll));
> +    set_xen_guest_handle(poll.ports, &event);
> +    poll.nr_ports = 1;
> +
> +    while ( !test_and_clear_bit(event, shinfo->evtchn_pending) )
> +        hypercall_sched_op(SCHEDOP_poll, &poll);
> +}
> +
>  /* Connect our xenbus client to the backend.
>   * Call once, before any other xenbus actions. */
>  void xenbus_setup(void)
> @@ -61,14 +74,26 @@ void xenbus_setup(void)
>  void xenbus_shutdown(void)
>  {
>      struct shared_info *shinfo = get_shared_info();
> +    evtchn_send_t send;
>  
>      ASSERT(rings != NULL);
>  
> -    /* We zero out the whole ring -- the backend can handle this, and it's 
> -     * not going to surprise any frontends since it's equivalent to never 
> -     * having used the rings. */
> -    memset(rings, 0, sizeof *rings);
> -
> +    if (rings->server_features & XENSTORE_SERVER_FEATURE_RECONNECTION) {
> +        rings->connection = XENSTORE_RECONNECT;
> +        send.port = event;
> +        hypercall_event_channel_op(EVTCHNOP_send, &send);
> +        while (*(volatile uint32_t*)&rings->connection == XENSTORE_RECONNECT)
> +            ring_wait ();
> +    } else {
> +        /* If the backend reads the state while we're erasing it then the
> +         * ring state will become corrupted, preventing guest frontends from
> +         * connecting. This is rare. To help diagnose the failure, we fill
> +         * the ring with XS_INVALID packets. */
> +        memset(rings->req, 0xff, XENSTORE_RING_SIZE);
> +        memset(rings->rsp, 0xff, XENSTORE_RING_SIZE);
> +        rings->req_cons = rings->req_prod = 0;
> +        rings->rsp_cons = rings->rsp_prod = 0;
> +    }
>      /* Clear the event-channel state too. */
>      memset(shinfo->vcpu_info, 0, sizeof(shinfo->vcpu_info));
>      memset(shinfo->evtchn_pending, 0, sizeof(shinfo->evtchn_pending));
> @@ -77,19 +102,6 @@ void xenbus_shutdown(void)
>      rings = NULL;
>  }
>  
> -static void ring_wait(void)
> -{
> -    struct shared_info *shinfo = get_shared_info();
> -    struct sched_poll poll;
> -
> -    memset(&poll, 0, sizeof(poll));
> -    set_xen_guest_handle(poll.ports, &event);
> -    poll.nr_ports = 1;
> -
> -    while ( !test_and_clear_bit(event, shinfo->evtchn_pending) )
> -        hypercall_sched_op(SCHEDOP_poll, &poll);
> -}
> -
>  /* Helper functions: copy data in and out of the ring */
>  static void ring_write(const char *data, uint32_t len)
>  {
> diff --git a/tools/ocaml/libs/xb/xb.ml b/tools/ocaml/libs/xb/xb.ml
> index 29d354d..50944b5 100644
> --- a/tools/ocaml/libs/xb/xb.ml
> +++ b/tools/ocaml/libs/xb/xb.ml
> @@ -21,6 +21,10 @@ exception End_of_file
>  exception Eagain
>  exception Noent
>  exception Invalid
> +exception Reconnect
> +
> +let _ =
> +  Callback.register_exception "Xb.Reconnect" Reconnect
>  
>  type backend_mmap =
>  {
> @@ -50,6 +54,19 @@ type t =
>  let init_partial_in () = NoHdr
>       (Partial.header_size (), String.make (Partial.header_size()) '\000')
>  
> +let reconnect t = match t.backend with
> +     | Fd _ ->
> +             (* should never happen, so close the connection *)
> +             raise End_of_file
> +     | Xenmmap backend ->
> +             Xs_ring.close backend.mmap;
> +             backend.eventchn_notify ();
> +             (* Clear our old connection state *)
> +             Queue.clear t.pkt_in;
> +             Queue.clear t.pkt_out;
> +             t.partial_in <- init_partial_in ();
> +             t.partial_out <- ""
> +
>  let queue con pkt = Queue.push pkt con.pkt_out
>  
>  let read_fd back con s len =
> @@ -84,6 +101,7 @@ let write con s len =
>       | Fd backfd     -> write_fd backfd con s len
>       | Xenmmap backmmap -> write_mmap backmmap con s len
>  
> +(* NB: can throw Reconnect *)
>  let output con =
>       (* get the output string from a string_of(packet) or partial_out *)
>       let s = if String.length con.partial_out > 0 then
> @@ -102,6 +120,7 @@ let output con =
>       (* after sending one packet, partial is empty *)
>       con.partial_out = ""
>  
> +(* NB: can throw Reconnect *)
>  let input con =
>       let newpacket = ref false in
>       let to_read =
> @@ -145,6 +164,8 @@ let newcon backend = {
>  let open_fd fd = newcon (Fd { fd = fd; })
>  
>  let open_mmap mmap notifyfct =
> +     (* Advertise XENSTORE_SERVER_FEATURE_RECONNECTION *)
> +     Xs_ring.set_server_features mmap (Xs_ring.Server_features.singleton 
> Xs_ring.Server_feature.Reconnection);
>       newcon (Xenmmap {
>               mmap = mmap;
>               eventchn_notify = notifyfct;
> diff --git a/tools/ocaml/libs/xb/xb.mli b/tools/ocaml/libs/xb/xb.mli
> index 58234ae..4e1f833 100644
> --- a/tools/ocaml/libs/xb/xb.mli
> +++ b/tools/ocaml/libs/xb/xb.mli
> @@ -23,7 +23,7 @@ module Op :
>        | Resume
>        | Set_target
>        | Restrict
> -      | Invalid (* Not a valid wire operation *)
> +      | Invalid
>      val operation_c_mapping : operation array
>      val size : int
>      val array_search : 'a -> 'a array -> int
> @@ -57,6 +57,7 @@ exception End_of_file
>  exception Eagain
>  exception Noent
>  exception Invalid
> +exception Reconnect
>  type backend_mmap = {
>    mmap : Xenmmap.mmap_interface;
>    eventchn_notify : unit -> unit;
> @@ -73,6 +74,7 @@ type t = {
>    mutable partial_out : string;
>  }
>  val init_partial_in : unit -> partial_buf
> +val reconnect : t -> unit
>  val queue : t -> Packet.t -> unit
>  val read_fd : backend_fd -> 'a -> string -> int -> int
>  val read_mmap : backend_mmap -> 'a -> string -> int -> int
> diff --git a/tools/ocaml/libs/xb/xs_ring.ml b/tools/ocaml/libs/xb/xs_ring.ml
> index 9469609..48e06f4 100644
> --- a/tools/ocaml/libs/xb/xs_ring.ml
> +++ b/tools/ocaml/libs/xb/xs_ring.ml
> @@ -14,5 +14,33 @@
>   * GNU Lesser General Public License for more details.
>   *)
>  
> +module Server_feature = struct
> +     type t =
> +     | Reconnection
> +end
> +
> +module Server_features = Set.Make(struct
> +     type t = Server_feature.t
> +     let compare = compare
> +end)
> +
>  external read: Xenmmap.mmap_interface -> string -> int -> int = 
> "ml_interface_read"
>  external write: Xenmmap.mmap_interface -> string -> int -> int = 
> "ml_interface_write"
> +
> +external _internal_set_server_features: Xenmmap.mmap_interface -> int -> 
> unit = "ml_interface_set_server_features" "noalloc"
> +external _internal_get_server_features: Xenmmap.mmap_interface -> int = 
> "ml_interface_get_server_features" "noalloc"
> +
> +
> +let get_server_features mmap =
> +     (* NB only one feature currently defined above *)
> +     let x = _internal_get_server_features mmap in
> +     if x = 0
> +     then Server_features.empty
> +     else Server_features.singleton Server_feature.Reconnection
> +
> +let set_server_features mmap set =
> +     (* NB only one feature currently defined above *)
> +     let x = if set = Server_features.empty then 0 else 1 in
> +     _internal_set_server_features mmap x
> +
> +external close: Xenmmap.mmap_interface -> unit = "ml_interface_close" 
> "noalloc"
> diff --git a/tools/ocaml/libs/xb/xs_ring_stubs.c 
> b/tools/ocaml/libs/xb/xs_ring_stubs.c
> index 17656c2..fc9b0c5 100644
> --- a/tools/ocaml/libs/xb/xs_ring_stubs.c
> +++ b/tools/ocaml/libs/xb/xs_ring_stubs.c
> @@ -36,22 +36,39 @@
>  
>  #define GET_C_STRUCT(a) ((struct mmap_interface *) a)
>  
> -static int xs_ring_read(struct mmap_interface *interface,
> -                             char *buffer, int len)
> +CAMLprim value ml_interface_read(value ml_interface,
> +                                 value ml_buffer,
> +                                 value ml_len)
>  {
> +     CAMLparam3(ml_interface, ml_buffer, ml_len);
> +     CAMLlocal1(ml_result);
> +
> +     struct mmap_interface *interface = GET_C_STRUCT(ml_interface);
> +     char *buffer = String_val(ml_buffer);
> +     int len = Int_val(ml_len);
> +     int result;
> +
>       struct xenstore_domain_interface *intf = interface->addr;
>       XENSTORE_RING_IDX cons, prod; /* offsets only */
>       int to_read;
> +     uint32_t connection;
>  
>       cons = *(volatile uint32_t*)&intf->req_cons;
>       prod = *(volatile uint32_t*)&intf->req_prod;
> +     connection = *(volatile uint32*)&intf->connection;
> +
> +     if (connection != XENSTORE_CONNECTED)
> +             caml_raise_constant(*caml_named_value("Xb.Reconnect"));
> +
>       xen_mb();
>  
>       if ((prod - cons) > XENSTORE_RING_SIZE)
> -         return -1;
> +             caml_failwith("bad connection");
>  
> -     if (prod == cons)
> -             return 0;
> +     if (prod == cons) {
> +             result = 0;
> +             goto exit;
> +     }
>       cons = MASK_XENSTORE_IDX(cons);
>       prod = MASK_XENSTORE_IDX(prod);
>       if (prod > cons)
> @@ -63,21 +80,41 @@ static int xs_ring_read(struct mmap_interface *interface,
>       memcpy(buffer, intf->req + cons, len);
>       xen_mb();
>       intf->req_cons += len;
> -     return len;
> +     result = len;
> +exit:
> +     ml_result = Val_int(result);
> +     CAMLreturn(ml_result);
>  }
>  
> -static int xs_ring_write(struct mmap_interface *interface,
> -                              char *buffer, int len)
> +CAMLprim value ml_interface_write(value ml_interface,
> +                                  value ml_buffer,
> +                                  value ml_len)
>  {
> +     CAMLparam3(ml_interface, ml_buffer, ml_len);
> +     CAMLlocal1(ml_result);
> +
> +     struct mmap_interface *interface = GET_C_STRUCT(ml_interface);
> +     char *buffer = String_val(ml_buffer);
> +     int len = Int_val(ml_len);
> +     int result;
> +
>       struct xenstore_domain_interface *intf = interface->addr;
>       XENSTORE_RING_IDX cons, prod;
>       int can_write;
> +     uint32_t connection;
>  
>       cons = *(volatile uint32_t*)&intf->rsp_cons;
>       prod = *(volatile uint32_t*)&intf->rsp_prod;
> +     connection = *(volatile uint32*)&intf->connection;
> +
> +     if (connection != XENSTORE_CONNECTED)
> +             caml_raise_constant(*caml_named_value("Xb.Reconnect"));
> +
>       xen_mb();
> -     if ( (prod - cons) >= XENSTORE_RING_SIZE )
> -             return 0;
> +     if ( (prod - cons) >= XENSTORE_RING_SIZE ) {
> +             result = 0;
> +             goto exit;
> +     }
>       if (MASK_XENSTORE_IDX(prod) >= MASK_XENSTORE_IDX(cons))
>               can_write = XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(prod);
>       else 
> @@ -87,31 +124,43 @@ static int xs_ring_write(struct mmap_interface 
> *interface,
>       memcpy(intf->rsp + MASK_XENSTORE_IDX(prod), buffer, len);
>       xen_mb();
>       intf->rsp_prod += len;
> -     return len;
> +     result = len;
> +exit:
> +     ml_result = Val_int(result);
> +     CAMLreturn(ml_result);
>  }
>  
> -CAMLprim value ml_interface_read(value interface, value buffer, value len)
> +CAMLprim value ml_interface_set_server_features(value interface, value v)
>  {
> -     CAMLparam3(interface, buffer, len);
> -     CAMLlocal1(result);
> -     int res;
> +     CAMLparam2(interface, v);
> +     struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)->addr;
>  
> -     res = xs_ring_read(GET_C_STRUCT(interface),
> -                        String_val(buffer), Int_val(len));
> -     if (res == -1)
> -             caml_failwith("bad connection");
> -     result = Val_int(res);
> -     CAMLreturn(result);
> +     intf->server_features = Int_val(v);
> +
> +     CAMLreturn(Val_unit);
> +}
> +
> +CAMLprim value ml_interface_get_server_features(value interface)
> +{
> +     CAMLparam1(interface);
> +     struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)->addr;
> +
> +     CAMLreturn(Val_int (intf->server_features));
>  }
>  
> -CAMLprim value ml_interface_write(value interface, value buffer, value len)
> +CAMLprim value ml_interface_close(value interface)
>  {
> -     CAMLparam3(interface, buffer, len);
> -     CAMLlocal1(result);
> -     int res;
> -
> -     res = xs_ring_write(GET_C_STRUCT(interface),
> -                         String_val(buffer), Int_val(len));
> -     result = Val_int(res);
> -     CAMLreturn(result);
> +     CAMLparam1(interface);
> +     struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)->addr;
> +     int i;
> +
> +     intf->req_cons = intf->req_prod = intf->rsp_cons = intf->rsp_prod = 0;
> +     /* Ensure the unused space is full of invalid xenstore packets. */
> +     for (i = 0; i < XENSTORE_RING_SIZE; i++) {
> +             intf->req[i] = 0xff; /* XS_INVALID = 0xffff */
> +             intf->rsp[i] = 0xff;
> +     }
> +     xen_mb ();
> +     intf->connection = XENSTORE_CONNECTED;
> +     CAMLreturn(Val_unit);
>  }
> diff --git a/tools/ocaml/xenstored/connection.ml 
> b/tools/ocaml/xenstored/connection.ml
> index 47695f8..e2fa4e9 100644
> --- a/tools/ocaml/xenstored/connection.ml
> +++ b/tools/ocaml/xenstored/connection.ml
> @@ -45,6 +45,20 @@ let mark_as_bad con =
>       |None -> ()
>       | Some domain -> Domain.mark_as_bad domain
>  
> +let initial_next_tid = 1
> +
> +let reconnect con =
> +     Xenbus.Xb.reconnect con.xb;
> +     (* dom is the same *)
> +     Hashtbl.clear con.transactions;
> +     con.next_tid <- initial_next_tid;
> +     Hashtbl.clear con.watches;
> +     (* anonid is the same *)
> +     con.nb_watches <- 0;
> +     con.stat_nb_ops <- 0;
> +     (* perm is the same *)
> +     ()
> +
>  let get_path con =
>  Printf.sprintf "/local/domain/%i/" (match con.dom with None -> 0 | Some d -> 
> Domain.get_id d)
>  
> @@ -89,7 +103,7 @@ let create xbcon dom =
>       xb = xbcon;
>       dom = dom;
>       transactions = Hashtbl.create 5;
> -     next_tid = 1;
> +     next_tid = initial_next_tid;
>       watches = Hashtbl.create 8;
>       nb_watches = 0;
>       anonid = id;
> diff --git a/tools/ocaml/xenstored/process.ml 
> b/tools/ocaml/xenstored/process.ml
> index 89db56c..0620585 100644
> --- a/tools/ocaml/xenstored/process.ml
> +++ b/tools/ocaml/xenstored/process.ml
> @@ -377,7 +377,12 @@ let do_input store cons doms con =
>       let newpacket =
>               try
>                       Connection.do_input con
> -             with Failure exp ->
> +             with Xenbus.Xb.Reconnect ->
> +                     info "%s requests a reconnect" (Connection.get_domstr 
> con);
> +                     Connection.reconnect con;
> +                     info "%s reconnection complete" (Connection.get_domstr 
> con);
> +                     false
> +             | Failure exp ->
>                       error "caught exception %s" exp;
>                       error "got a bad client %s" (sprintf "%-8s" 
> (Connection.get_domstr con));
>                       Connection.mark_as_bad con;
> @@ -407,6 +412,11 @@ let do_output store cons doms con =
>                                (Xenbus.Xb.Op.to_string ty) (sanitize_data 
> data);*)
>                       write_answer_log ~ty ~tid ~con ~data;
>               );
> -             ignore (Connection.do_output con)
> +             try
> +                     ignore (Connection.do_output con)
> +             with Xenbus.Xb.Reconnect ->
> +                     info "%s requests a reconnect" (Connection.get_domstr 
> con);
> +                     Connection.reconnect con;
> +                     info "%s reconnection complete" (Connection.get_domstr 
> con)
>       )
>  
> diff --git a/xen/include/public/io/xs_wire.h b/xen/include/public/io/xs_wire.h
> index 585f0c8..0a0cdbc 100644
> --- a/xen/include/public/io/xs_wire.h
> +++ b/xen/include/public/io/xs_wire.h
> @@ -49,7 +49,9 @@ enum xsd_sockmsg_type
>      XS_RESUME,
>      XS_SET_TARGET,
>      XS_RESTRICT,
> -    XS_RESET_WATCHES
> +    XS_RESET_WATCHES,
> +
> +    XS_INVALID = 0xffff /* Guaranteed to remain an invalid type */
>  };
>  
>  #define XS_WRITE_NONE "NONE"
> @@ -116,6 +118,8 @@ struct xenstore_domain_interface {
>      char rsp[XENSTORE_RING_SIZE]; /* Replies and async watch events. */
>      XENSTORE_RING_IDX req_cons, req_prod;
>      XENSTORE_RING_IDX rsp_cons, rsp_prod;
> +    uint32_t server_features; /* Bitmap of features supported by the server 
> */
> +    uint32_t connection;
>  };
>  
>  /* Violating this is very bad.  See docs/misc/xenstore.txt. */
> @@ -125,6 +129,13 @@ struct xenstore_domain_interface {
>  #define XENSTORE_ABS_PATH_MAX 3072
>  #define XENSTORE_REL_PATH_MAX 2048
>  
> +/* The ability to reconnect a ring */
> +#define XENSTORE_SERVER_FEATURE_RECONNECTION 1
> +
> +/* Valid values for the connection field */
> +#define XENSTORE_CONNECTED 0 /* the steady-state */
> +#define XENSTORE_RECONNECT 1 /* guest has initiated a reconnect */
> +
>  #endif /* _XS_WIRE_H */
>  
>  /*


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.