Re: [Xen-devel] [Patch] continue_hypercall_on_cpu rework using tasklets
On 16/04/2010 07:42, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:
> Thanks for pointing out the stop_machine_run deadlock issue.
>
> After more consideration and internal discussion, it seems the key point is
> that the tasklet softirq is effectively taking a lock on the current vcpu's
> state (i.e. no one else can change that state until the softirq finishes).
>
> So any blocking action in softirq context, not just vcpu_pause_sync, is
> dangerous and should be avoided (we can't hold a lock and then block, per my
> understanding). This is because a vcpu's state can only be changed by the
> schedule softirq (am I right on this?), while the schedule softirq can't
> preempt other softirqs. So, more generally, anything that is updated in
> softirq context and synchronously/blockingly waited on in Xen's vcpu context
> is in fact an implicit lock held by the softirq.

I think you're at risk of over-analysing the situation. You cannot safely
synchronously pause vcpus from within softirq context. I'm not sure we can
extrapolate further than that.

> As for the tricky bug in stop_machine_run(), I think it is in fact similar
> to the cpu_add_remove_lock. stop_machine_run() is a blocking action, so we
> must make sure no one blocks waiting for a lock that stop_machine_run()
> already holds. Back then, we changed every component that takes the
> cpu_add_remove_lock to use try_lock.
>
> The change introduced by the tasklet is that a new implicit lock is added,
> i.e. the vcpu's state.

Let me be clear: the deadlock I described in stop_machine_run() is *not*
introduced by the c_h_o_c reimplementation. It has been there all along.
Nothing in my description of the deadlock depended on the implementation of
c_h_o_c: it depended only on the behaviour of stop_machine_run itself, which
does not internally use c_h_o_c.

> The straightforward method is like the cpu_add_remove_lock: make everything
> that waits for the vcpu state change process softirqs between the checks.

Possible, though it could have impacts of its own, of course. We couldn't do
SCHEDULE_SOFTIRQ as we would lose the caller's context, and the caller could
not hold locks when pausing others (although I suppose they generally can't
anyway).

> Maybe the cleaner way is your previous suggestion, that is, to put the
> stop_machine_run() work in the idle vcpu; another way is to go back to the
> original method, i.e. do it in schedule_tail.

Argh. That would be annoying!

Another possibility is to... shudder... introduce a timeout. Wait only, say,
1ms for all vcpus to enter the STOPMACHINE_PREPARE state and, if they don't,
cancel the operation and return -EBUSY. There are already a bunch of
opportunities to return 'spurious' -EBUSY in the cpu-offline paths, so tools
already need to know to retry at least a certain number of times. Undoubtedly
this is the dirty solution, but it is easy-ish to implement and would only be
there to avoid an extremely rare deadlock possibility. It would just be
critical to pick a large enough timeout!
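Roughly the shape I have in mind, as a sketch only: the helper name and the
acknowledgement counter below are illustrative (not the actual stop_machine.c
code), and it leans on Xen's usual NOW()/MILLISECS() and atomic helpers.

    /* Sketch only: illustrative helper, not the real stop_machine.c code. */
    /* Assumes CPUs that reach STOPMACHINE_PREPARE bump an atomic counter. */
    static int stopmachine_wait_prepare(atomic_t *prepared,
                                        unsigned int nr_cpus)
    {
        s_time_t deadline = NOW() + MILLISECS(1);  /* must be "large enough" */

        while ( atomic_read(prepared) != nr_cpus )
        {
            if ( NOW() > deadline )
                return -EBUSY;  /* cancel the operation; tools already retry */
            cpu_relax();
        }

        return 0;
    }

On -EBUSY the caller would unwind whatever STOPMACHINE state it has already
set and fail the cpu-offline request, which the tools would treat like the
existing spurious -EBUSY cases.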
> Also, we are not sure why continue_hypercall_on_cpu was changed to use a
> tasklet. What's the benefit of it? At least I think this is quite confusing,
> because per our understanding a hypercall is usually assumed to execute in
> the current context, while this change breaks that assumption. So any
> hypercall that may use this c_h_o_c, and any function called by that
> hypercall, should be aware of this. I'm not sure this is really correct; at
> least it may cause trouble if someone uses it without realising the
> limitation. From Juergen Gross's mail it seems to be for cpupools, but I
> have no idea about cpupools :-(

The implementation is simpler and lets us get rid of the complexity around
vcpu affinity logic. There aren't that many users of c_h_o_c and most are
still trivially correct. I don't think the changes to c_h_o_c have introduced
any more deadlocks, apart from the freeze_domains issue (which I believe I
have now fixed).

 -- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel