[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4] xenstore: extend the xenstore ring with a 'closing' signal



Hi,

On 10 Sep 2014, at 14:35, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:

> On Wed, 2014-09-03 at 17:25 +0100, David Scott wrote:
>> Hvmloader uses the xenstore ring and then tries to reset it back
>> to its initial state before booting the guest. Occasionally xenstored
>> will read the ring while it is being zeroed and conclude it has
>> been corrupted. This prevents PV drivers from loading in the guest.
>> 
>> This patch updates the xenstore ring protocol definition, enabling
>> a server to advertise additional features to the guest. One such feature
>> is defined: the ability to cleanly reset the ring (but not the
>> higher-level protocol, for which we already have RESET_WATCHES).
> 
> Is RESET_WATCHES complete enough to be considered a higher-level
> protocol reset, as opposed to just doing the watch stuff? (maybe there
> is no other state to speak of?)

I believe the only high-level connection state is watches + outstanding 
transactions. The xenstore.txt doc says that RESET_WATCHES should reset both:

RESET_WATCHES           |
        Reset all watches and transactions of the caller.


> 
>> This patch implements the ring reconnection features in oxenstored
>> and hvmloader, fixing the bug.
> 
> I suppose it is worth mentioning here that C xenstored is untouched but
> that the hvmloader changes have been written to work with an arbitrary
> xenstore by using the protocol. (at least I hope it has!)

That’s right — hvmloader is flexible.

> 
>> This patch also defines an 'invalid' xenstore packet type and uses this
>> to poison the ring over a reconnect. This will make diagnosing this
>> bug much easier in future.
>> 
>> Signed-off-by: David Scott <dave.scott@xxxxxxxxxx>
> 
> I've made some comments below but I think it might be worth waiting for
> Ian Jackson's input, since he had comments last time, before rushing to
> make any changes.

Sounds sensible to me.

Thanks,
Dave

> 
>> 
>> ---
>> 
>> Updates since v3
>> * hvmloader: signal the event channel after requesting a reconnect
>>  (thanks to Zheng Li for diagnosing this)
>> * switch to using feature flags/bits instead of version numbers
>> * rename the 'closing' field to 'connection state'
>> * define an invalid packet type (0xffff, since 0x0 was already taken)
>> * on ring connection/reset, fill the input and output buffers with
>>  the invalid packet
>> * ocaml xs_ring_stubs.c: collapse both {read,write} functions into one,
>>  remove #define ERROR_FOO
>> * doc: write from the guest's point of view, weaken guarantees made by
>>  server (e.g. producer = consumer != 0 is valid after reconnect)
>> * doc: relate reconnection to the RESET_WATCHES command in the higher
>>  level protocol
>> * doc: be more specific about who must write what, when
>> 
>> Updates since v2
>> * hvmloader: use volatile for read of closing flag
>> * style improvements
>> * remove xenstore version #defines
>> 
>> Updates since v1
>> * remove unused client version and associated dead code
>> * treat 'closing' as a flag by using "!=0" rather than "==1"
>> * hvmloader: move function up and remove forward decl
>> * document the existing xenstore ring and the extention in misc/
>> ---
>> docs/misc/xenstore-ring.txt         |  112 
>> +++++++++++++++++++++++++++++++++++
>> tools/firmware/hvmloader/xenbus.c   |   48 +++++++++------
>> tools/ocaml/libs/xb/xb.ml           |   27 ++++++++-
>> tools/ocaml/libs/xb/xb.mli          |    1 +
>> tools/ocaml/libs/xb/xs_ring.ml      |   10 ++++
>> tools/ocaml/libs/xb/xs_ring_stubs.c |  109 ++++++++++++++++++++++++----------
>> xen/include/public/io/xs_wire.h     |   13 +++-
>> 7 files changed, 269 insertions(+), 51 deletions(-)
>> create mode 100644 docs/misc/xenstore-ring.txt
>> 
>> diff --git a/docs/misc/xenstore-ring.txt b/docs/misc/xenstore-ring.txt
>> new file mode 100644
>> index 0000000..6d18556
>> --- /dev/null
>> +++ b/docs/misc/xenstore-ring.txt
> 
> You seem to use markdown like syntax, so please name it .markdown and we
> will get some sort of useful (well, ish) HTML out of it at build time.
> 
>> @@ -0,0 +1,112 @@
>> +The xenstore ring is a datastructure stored within a single 4KiB page
>> +shared between the xenstore server and the guest. The ring contains
>> +two queues of bytes -- one in each direction -- and some signalling
>> +information. The [xenstore protocol](xenstore.txt) is layered on top of
>> +the byte streams.
>> +
>> +The xenstore ring datastructure
>> +===============================
>> +
>> +The following table describes the ring structure where
>> +  - offsets and lengths are in bytes;
>> +  - "Input" is used to describe the data sent to the server; and
>> +  - "Output" is used to describe the data sent to the domain.
>> +
>> +Offset  Length  Description
>> +-----------------------------------------------------------------
>> +0       1024    Input data
>> +1024    1024    Output data
>> +2048    4       Input consumer offset
>> +2052    4       Input producer offset
>> +2056    4       Output consumer offset
>> +2060    4       Output producer offset
>> +2064    4       Server feature bitmap
>> +2068    4       Connection state
>> +
>> +The Input data and Output data are circular buffers. Each buffer is
>> +associated with a pair of free-running offsets labelled "consumer" and
>> +"producer".
>> +
>> +A "producer" offset is the offset in the byte stream of the next byte
>> +to be written modulo 2^32. A "consumer" offset is the offset in the byte
>> +stream of the next byte to be read modulo 2^32. Implementations must
>> +take care to handle wraparound properly when performing arithmetic with
>> +these values.
>> +
>> +The byte at offset 'x' in the byte stream will be stored at offset
>> +'x modulo 1024' in the circular buffer.
>> +
>> +Implementations may only overwrite previously-written data if it has
>> +been marked as 'consumed' by the relevant consumer pointer.
>> +
>> +When the guest domain is created, there is no outstanding Input or Output
>> +data. However
>> +
>> +  - guests must not assume that producer or consumer pointers start
>> +    at zero; and
>> +  - guests must not assume that unused bytes in either the Input or
>> +    Output data buffers has any particular value.
>> +
>> +A xenstore ring is always associated with an event channel. Whenever the
>> +ring structure is updated the event channel must be signalled. The
>> +guest and server are free to inspect the contents of the ring at any
>> +time, not only in response to an event channel event. This implies that
>> +updates must be ordered carefully to ensure consistency.
>> +
>> +The xenstore server may decide to advertise some features via the
>> +"Server feature bitmap". The server can start advertising features
>> +at any time by setting bits but it will never stop advertising features
>> +i.e. bits will never be cleared. The guest is not permitted to write to
>> +the server feature bitmap. The server features are offered to the guest;
>> +it is up to the guest whether to use them or not. The guest should ignore
>> +any unknown feature bits.
>> +
>> +The following features are defined:
>> +
>> +Mask    Description
>> +-----------------------------------------------------------------
>> +1       Ring reconnection (see the ring reconnection feature below)
>> +
>> +The "Connection state" field is used to request a ring close and reconnect.
>> +The "Connection state" field only contains valid data if the server has
>> +advertised the ring reconnection feature. If the feature has been advertised
>> +then the "Connection state" may take the following values:
>> +
>> +Value   Description
>> +-----------------------------------------------------------------
>> +0       Ring is connected
>> +1       Ring close and reconnect is in progress (see the "ring
>> +        reconnection feature" described below)
>> +
>> +The ring reconnection feature
>> +=============================
>> +
>> +The ring reconnection feature allows the guest to ask the server to
>> +reset the ring to a valid initial state i.e. where the Input and Output
>> +queues contain no data. The feature operates at the ring-level only;
>> +it does not affect the higher-level protocol state machine at all.
>> +To reset theh higher-level protocol, please read about the 'RESET_WATCHES'
> 
> typo here: theh.
> 
>> +command in the [xenstore protocol](xenstore.txt) specification.
>> +
>> +The ring reconnection feature is only available if the 'Ring reconnection'
>> +feature bit has been set by the server in the "Server feature bitmap".
>> +
>> +Assuming the server has advertised the feature, the guest can initiate
>> +a reconnection by setting the the Connection state to 1 ("Ring close
>> +and reconnect is in progress") and signalling the event channel.
>> +The guest must now ignore all fields except the Connection state and
>> +wait for it to be set to 0 ("Ring is connected")
>> +
>> +The server will guarantee to
>> +
>> +  - drop any partially read or written higher-level
>> +    [xenstore protocol](xenstore.txt) packets it may have;
>> +  - empty the Input and Output queues in the xenstore ring;
>> +  - set the Connection state to 0 ("Ring is connected"); and
>> +  - signal the event channel.
>> +
>> +From the point of view of the guest, the connection has been reset on a
>> +packet boundary.
>> +
>> +Note that only the guest may set the Connection state to 1 and only the
>> +server may set it back to 0.
> 
>> [...]
>> -    /* We zero out the whole ring -- the backend can handle this, and it's 
>> -     * not going to surprise any frontends since it's equivalent to never 
>> -     * having used the rings. */
>> -    memset(rings, 0, sizeof *rings);
>> -
>> +    if (rings->server_features & XENSTORE_SERVER_FEATURE_RECONNECTION) {
>> +        rings->connection = XENSTORE_RECONNECT;
>> +        send.port = event;
>> +        hypercall_event_channel_op(EVTCHNOP_send, &send);
>> +        while (*(volatile uint32_t*)&rings->connection == 
>> XENSTORE_RECONNECT)
> 
> I don't think you need the volatile here, certainly none of the other
> ring accesses seem to use it.
> 
> Apart from that the C side of things looks good. I've not looked at the
> ocaml side of things, do you have someone you can lean on to do some
> review or do I need to blow the cobwebs off that part of my brain?
> 
> Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.