[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] tools/hotplug: fix locking



On Thu, 2012-06-21 at 13:20 +0100, Daniel P. Berrange wrote:
> On Thu, Jun 21, 2012 at 12:49:24PM +0100, Ian Campbell wrote:
> > I wonder if there were any followup fixes or changes, especially wrt to
> > your point 1 above (the lack of a timeout now) requiring xend (or now
> > libxl) changes?
> 
> Removing any kind of timeout was in fact the point of this change.
> We do not want hotplug scripts stealing each others locks, or
> timing out, because the timeouts cause alot of problems when
> launching guests on a highly loaded machine where execution time
> can be alot longer than a timeout setting may expect.

Agreed.

What I meant was if there was any code in xend which had inadvertently
come to rely on the existence of the timeout which needed fixing. I'd
guess not -- it'd be a pretty strange pattern...

> I expect you can't see the original BZ ticket quoted, since it
> has customer information in it.  Here is my description of
> what the problem we weere solving was:
> 
> [quote]
> In the normal case of a single domain booting with one disk, the disk hotplug
> script will fire once. In this case taking out the lock will never cause a 
> sleep
> because there's no other concurrent invocations. If a domain has 4 disks
> configured, then the hotplug script will fire 4 times, all 4 invocations at
> pretty much the same time. If there is even a little load on the system, the
> locking function in the shell script will sleep for a few seconds - as many 
> as 5
> seconds, or potentially even time out & fail completely.
> 
> If say 4 or even more domains each with 4 disks start up at once, that's 16
> invocations of the hotplug script running at once. There will be alot of 
> sleep's
> done & because of the very coarse 1 second granularity the delay can add up
> significantly. 
> 
> The change to using flock() removes the arbitrary 1 second sleep, so the very
> instant once hotplug script finishes, another can start & there is no repeated
> attempts & failures to lock which would add more delay.
> [/quote]

Thanks, this description makes sense. I'd be inclined to use it verbatim
as the commit message.

> Usually it was our policy to send all these kind of patches upstream,
> so I'm not really sure why this was not already merged. Possibly I
> forgot to submit it, or maybe it was rejected - I honestly can't
> remember.
> 
> Below is the original patch I wrote, to which I apply:
> 
>   Signed-off-by: Daniel P. Berrange.

Cheers. By my analysis the result is that the claim_lock(),
release_lock() and _setlockfd() functions (i.e. all the actual code) are
identical to those in the patch which Zhigang sent (the differences are
in the removed code, which I presume has changed since you wrote the
original patch).

Therefore I think it is appropriate to include your S-o-b on the new
patch -- assuming you are OK with that?

Also the patch is:

Acked-by: Ian Campbell <ian.campbell@xxxxxxxxxx>

Ian.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.