[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] bug in hvmloader xenbus_shutdown logic

> -----Original Message-----
> From: Dave Scott
> Sent: 16 June 2014 14:44
> To: Liuqiming (John)
> Cc: Ian Campbell; Dave Scott; Yanqiangjun; xen-devel@xxxxxxxxxxxxx; Paul
> Durrant
> Subject: Re: [Xen-devel] bug in hvmloader xenbus_shutdown logic
> Hi,
> Sorry for the delay replying!
> [added Paul Durrant to cc:]
> On 6 Dec 2013, at 03:53, Liuqiming (John) <john.liuqiming@xxxxxxxxxx>
> wrote:
> > How about we add a XS_RESET operation to xenstore and the semantic
> should be: reset the connection to initial state
> >
> > The work flow asï
> > 1, Hvmloader send this request to oxenstored and then poll the event
> > 2,Oxenstored clear up IO ring and do some other work(if any) to make sure
> this connect reset to initial state
> > 3,Oxenstored send a event to notify the reset work has finished
> >
> > I think oxenstored should be the "owner" of this IO ring, so all the
> complicated operation should be done by oxenstord and other component
> should just send the request.
> I agree that (o)xenstored should be the âownerâ of this ring and other
> components should just send requests.
> Perhaps instead of sending a new type of request inside the ring, we could
> use some of the free space in the shared page to signal a reconnect? The
> current definition is:
> /*
>  * `incontents 150 xenstore_struct XenStore wire protocol.
>  *
>  * Inter-domain shared memory communications. */
> #define XENSTORE_RING_SIZE 1024
> typedef uint32_t XENSTORE_RING_IDX;
> #define MASK_XENSTORE_IDX(idx) ((idx) & (XENSTORE_RING_SIZE-1))
> struct xenstore_domain_interface {
>     char req[XENSTORE_RING_SIZE]; /* Requests to xenstore daemon. */
>     char rsp[XENSTORE_RING_SIZE]; /* Replies and async watch events. */
>     XENSTORE_RING_IDX req_cons, req_prod;
>     XENSTORE_RING_IDX rsp_cons, rsp_prod;
> };
> which means that 1024 + 1024 + 4 * 4 = 2064 bytes are used, and 2032 bytes
> are still free. I believe the 4k page is allocated at domain build time and 
> the
> unused bytes are guaranteed to be 0 (â is this correct?)
> Perhaps we could allocate 12 bytes for a version number and another pair of
> counters:
> uint32_t ring_protocol_version;
> uint32_t connection_request;
> uint32_t connection_actual;
> A new xenstored could set âring_protocol_versionâ to 1, old xenstoreds
> would leave it at 0.
> If a new xenstore client saw the ring_protocol_version = 1, it would be able
> to request a reconnect by incrementing âconnection_requestâ and waiting
> xenstored to zero the ring and increment âconnection_actualâ.
> We could then modify hvmloader to safely reset the ring before the guest
> boots. If a guest wants to jump into a crash kernel (with undefined xenstore
> ring state) it would also be able to request a reconnect and load PV drivers.
> What do you think?

I think it would be good if 'connection_actual' were an incarnation count that 
is sampled by the frontend to determine whether to trust the counter values 
e.g. in cases where it needs to wait for a response that may never come because 
xenstored has had some problem, or determines that a frontend has corrupted the 
ring. So, xenstored bumps it if it ever resets the ring either by request or 
unilaterally (and it's read-only to the guest). Then perhaps 
'connection_request' should become 'connection_reset' which is written to 1 by 
a guest requesting reset and cleared by xenstored if and when it satisfies the 

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.