Re: [Xen-devel] [PATCH v2 16/18] libxl: introduce libxl_userdata_unlink



On Fri, 2014-08-29 at 11:37 +0100, Wei Liu wrote:
> On Thu, Aug 28, 2014 at 09:44:04PM +0100, Ian Campbell wrote:
> > On Thu, 2014-08-28 at 21:27 +0100, Wei Liu wrote:
> > > On Thu, Aug 28, 2014 at 08:31:39PM +0100, Ian Campbell wrote:
> > > > On Thu, 2014-08-28 at 20:04 +0100, Wei Liu wrote:
> > > > > Application locking is one thing, but we still need to serialise libxl
> > > > > access to those files, don't we? Any access to userdata store via libxl
> > > > > API should be serialised. The reason is stated in previous patch "libxl:
> > > > > properly lock user data store".
> > > > 
> > > > I may be confused here, so please correct me if I'm wrong...
> > > > 
> > > > Any individual userdata store/retrieve is atomic insofar as afterwards
> > > > there will be a consistent copy of the thing there, i.e. if there is a
> > > > race you will get one or the other copy of the data, but never a
> > > > mixture. Locking within the store/retrieve function neither helps nor
> > > > hinders this (since to the loser of the race the result is
> > > > indistinguishable from someone coming along 1s later and updating).
> > > > 
> > > > The locking is there to protect against read-modify-write cycles (e.g.
> > > > updating the config), which necessarily implies taking the lock before
> > > > the read and releasing it after the write -- i.e. at the application
> > > > layer (the libxl-lock being a kind of special in-libxl "application"
> > > > layer). Without the lock two entities racing in the r-m-w cycle can
> > > > result in updates being lost.
> > > 
> > > You're right on the r-m-w analysis. But the lock does more than that. In
> > > this specific API family that manipulates userdata store, it also
> > > ensures files won't disappear until the other thread that holds the lock
> > > finishes its job. Userdata vanishing under our feet is one abnormal state
> > > we would like to avoid; userdata reappearing after we delete it is
> > > another. If we don't hold this lock for this unlink API, we can end up
> > > in either of those two abnormal states. Does this make sense?
> > 
> 
> I think I know where I got lost.
> 
> In my previous patch "libxl: properly lock user data store", I got your
> ack. In that patch, public APIs like libxl_userdata_{store,retrieve} also
> hold the libxl-lock, while for this API you suggest not holding this lock.
> 
> > Yes, it makes sense to lock removal against any r-m-w in another thread.
> > 
> > But I don't think it follows the libxl_userdata_unlink should take any
> > particular lock, including the libxl lock, that would be the
> > responsibility of the caller.
> > 
> 
> Responsibility of the caller -- yes, this is true. One precondition is
> that applications only touch userdata that belongs to them, but in
> reality that doesn't hold. Applications might touch other userdata
> files without even knowing about it.

Can they? How? To do this an application would have to pass some other
entity's userid to a libxl_userdata function, which they would surely
know about.

I think it is up to the entity which defines the userdata "userid" to
also define the required locking when accessing that particular
userdata. By necessity I think that locking would need to be implemented
at the level of that entity, not at a lower level (e.g. libxl), since
the userdata API does not include a read-modify-write interface. A third
party which fiddles with someone else's userdata without following the
defined protocol would be buggy.
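
A minimal sketch of what such entity-level locking might look like,
assuming a POSIX advisory lock taken with flock() and a counter stored
as text in a file. The names entity_lock, entity_unlock and
counter_increment are hypothetical and are not libxl functions; the
point is only that the lock is taken before the read and dropped after
the write, by the entity that owns the data.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical entity-level lock, not part of libxl.
 * Returns the lock fd (>= 0) on success, -1 on error. */
static int entity_lock(const char *lock_path)
{
    int fd = open(lock_path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    while (flock(fd, LOCK_EX) < 0) {
        if (errno != EINTR) {
            close(fd);
            return -1;
        }
    }
    return fd;
}

static void entity_unlock(int fd)
{
    flock(fd, LOCK_UN);
    close(fd);
}

/* Read-modify-write of a counter, with the lock held across the whole
 * cycle. Returns the new value, or -1 on error. */
static long counter_increment(const char *lock_path, const char *data_path)
{
    char tmp[4096];
    long value = 0, result = -1;
    FILE *f;
    int n, lock_fd;

    n = snprintf(tmp, sizeof(tmp), "%s.new", data_path);
    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -1;

    lock_fd = entity_lock(lock_path);       /* lock before the read */
    if (lock_fd < 0)
        return -1;

    f = fopen(data_path, "r");              /* read (absent file == 0) */
    if (f) {
        if (fscanf(f, "%ld", &value) != 1)
            value = 0;
        fclose(f);
    }

    value++;                                /* modify */

    f = fopen(tmp, "w");                    /* write: temp file + rename */
    if (f) {
        int ok = fprintf(f, "%ld\n", value) > 0;
        if (fclose(f) != 0)
            ok = 0;
        if (ok && rename(tmp, data_path) == 0)
            result = value;
        else
            unlink(tmp);
    }

    entity_unlock(lock_fd);                 /* unlock after the write */
    return result;
}
```

Two processes calling counter_increment() concurrently cannot lose an
update here, because neither can read the old value while the other is
between its read and its write. A store function that locked only
around the write itself would not give that guarantee.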

In the case of the new libxl-json userid libxl defines that you must
hold the libxl-lock, but that should be entirely internal to libxl and
not exposed via general userdata API functions, I think. It's somewhat
confusing in this case because libxl is both "the entity which defines
the userdata userid" and the thing which provides the generic userdata
accessor methods.

BTW I'm not sure why we need libxl-json and libxl-lock (I'd failed to
grok that they differed), rather than locking a userdata path with a
different wh parameter, e.g. "l" as I think Ian J suggested before.

> 
> Libxl does unlink files. libxl__userdata_destroyall unlinks every
> userdata file belonging to a particular domain. So consider if there's
> no lock in these APIs:
> 
>       Task 1                                 Task 2
>       Do random stuff to domain              Trying to shut down
>        reads "task1" userdata
>                                               observe domid
>                                               start domain destruction
>                                               delete all userdata
>                                               destroy domain
>        writes "task1" userdata
>        [ forbidden state: userdata leaked ]
> 
> Task1 and task2 need not be the same application.
> 
> > For libxl-json userdata manipulation, you don't have this issue because
> > you aren't using unlink, I don't think. If libxl is using unlink
> > internally then it should arrange to hold the lock while calling unlink,
> > just like for r-m-w.
> > 
> > For xl cfg userdata the lock which you are making libxl_userdata_unlink
> > touch is not used by xl when doing r-m-w operations of the data (if it
> > even does such) so it doesn't protect you against anything.
> > 
> 
> Against file removal by other applications. Otherwise an application
> might try to unlink a file that doesn't exist.

For this to help you would need a lock which could be held from before
the application's check that the userdata is present until after the
unlink, which I don't think can be done inside libxl_userdata_unlink,
can it?
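
As a rough illustration of that point, here is a sketch in which the
caller holds one lock across both the presence check and the unlink.
It assumes a POSIX flock() advisory lock on a separate lock file;
remove_if_present_locked is a hypothetical caller-side helper, not a
libxl function, and it only helps if every other party touching the
file takes the same lock.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical caller-side helper, not part of libxl.
 * Returns 1 if the file was present and has been removed,
 *         0 if it was already absent,
 *        -1 on error. */
static int remove_if_present_locked(const char *lock_path,
                                    const char *data_path)
{
    struct stat st;
    int rc;
    int lock_fd = open(lock_path, O_RDWR | O_CREAT, 0600);

    if (lock_fd < 0)
        return -1;
    while (flock(lock_fd, LOCK_EX) < 0) {   /* lock before the check */
        if (errno != EINTR) {
            close(lock_fd);
            return -1;
        }
    }

    if (stat(data_path, &st) < 0)           /* check */
        rc = (errno == ENOENT) ? 0 : -1;
    else if (unlink(data_path) < 0)         /* unlink, lock still held */
        rc = -1;
    else
        rc = 1;

    flock(lock_fd, LOCK_UN);                /* unlock after the unlink */
    close(lock_fd);
    return rc;
}
```

If the lock were instead taken and released inside the unlink call
alone, the file could still be removed by someone else between the
caller's check and its call to unlink.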

> 
> > If xl is doing r-m-w then it needs its own lock, and it should also
> > arrange to hold that lock over the unlink. This shouldn't be the
> > libxl-lock, it should be some lock which belongs to xl and it needs to
> > be taken at the application level.
> > 
> > 
> > > OK, TBH I don't quite like this API either. If we don't provide a way
> > > for xl to delete xl cfg userdata, what should we do with xl cfg? What do
> > > you suggest to achieve the said behavior of "xl config-update"?
> > 
> > I hope the above makes this clearer, but either xl needs a lock of its
> > own or it doesn't, but in no case is this libxl's business...
> > 
> 
> Yes, I understand what you said. An application should lock a file when
> it sees fit. But this only works when other applications guarantee not
> to step outside their own realm...
> 
> I should have mentioned libxl does unlink files earlier, sorry.

libxl__userdata_destroyall does complicate things somewhat but I'm not
convinced that holding some libxl internal lock actually solves the
problem you are hoping to solve.

libxl__userdata_destroyall is only called during domain destruction. By
this point the application has in some sense already relinquished
control of the domain (and by extension the userdata).

It might be useful if we discuss this face to face, perhaps with a
whiteboard?


