Re: [Xen-devel] [PATCH] sched: fix race between sched_move_domain() and vcpu_wake()
On 11/10/2013 09:07, Keir Fraser wrote:
> On 11/10/2013 08:12, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
>
>>>>> On 10.10.13 at 20:27, Keir Fraser <keir.xen@xxxxxxxxx> wrote:
>>> On 10/10/2013 19:01, "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx> wrote:
>>>
>>>>> Just taking the lock for the old processor seemed sufficient to me as
>>>>> anything seeing the new value would lock and unlock using the same new
>>>>> value. But do we need to take the schedule_lock for the new processor
>>>>> as well (in the right order of course)?
>>>> David and I have been discussing this for a while, involving a
>>>> whiteboard, and have not come to a firm conclusion either way.
>>>>
>>>> From my point of view, holding the appropriate vcpu schedule lock
>>>> entitles you to play with vcpu scheduling state, which involves
>>>> following v->sched_priv, which we update outside the critical region later.
>>>>
>>>> Only taking the one lock still leaves a race condition where another cpu
>>>> can follow the new v->processor and obtain the schedule lock, at which
>>>> point we have two threads both working on the internals of a vcpu. The
>>>> change below certainly will fix the current bug of locking one spinlock
>>>> and unlocking another.
>>>>
>>>> My gut feeling is that we do need to take both locks to be safe in terms
>>>> of data access, but we would appreciate advice from someone more
>>>> familiar with the scheduler locking.
>>> If it's that tricky to work out, why not just take the two locks,
>>> appropriately ordered? This isn't a hot path.
>> Shouldn't we rather fix the locking mechanism itself, making
>> vcpu_schedule_lock...() return the lock, such that the unlock
>> will unavoidably use the correct lock?
>>
>> That would at once allow dropping vcpu_schedule_unlock...()
>> altogether, which would be a good thing even if only considering
>> the explicit uses of local_irq_disable() there (instead of using the
>> right spin lock primitives). And if done that way, replacing the
>> explicit uses of local_irq_enable() in the locking paths would also
>> seem rather desirable - after all this defeats the spin lock
>> primitives wanting to re-enable interrupts while waiting for a
>> lock.
> It feels to me like this is separate from Andrew's concern. Also I think
> that holding the schedule_lock should protect you from changes to
> v->processor. But if that's really unreasonable (e.g., inefficient) then
> your suggestion here is perfectly sensible.
>
> Improving the vcpu_schedule_lock_irq implementations to use the provided
> underlying spin_lock_irq functions would also be nice, I guess :)

This is an orthogonal issue which could do with fixing.

Do note that simply changing vcpu_schedule_lock() to return the
appropriate lock is not sufficient to fix this issue, as the race with
changing v->processor can cause two cpus to "successfully" lock the vcpu
schedule lock for a particular vcpu.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
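For reference, below is a minimal, self-contained sketch of how the two
points raised in the thread could be combined; it is not the actual Xen
implementation, and the spinlock type, the schedule_data array, NR_CPUS
and the simplified struct vcpu here are illustrative stand-ins. The lock
routine returns the lock it actually took, so the unlock cannot use a
different one (Jan's suggestion), and it re-checks v->processor after
acquisition, so a concurrent change of v->processor (e.g. by
sched_move_domain()) cannot leave two CPUs each believing they hold "the"
schedule lock for the same vcpu (Andrew's concern).

/*
 * Illustrative sketch only -- simplified stand-ins, not the real Xen code.
 * Real code also needs IRQ-safe variants and proper atomics/barriers
 * around reads of v->processor.
 */
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 8   /* arbitrary size for the sketch */

typedef struct {
    atomic_bool locked;
} spinlock_t;

static void spin_lock(spinlock_t *l)
{
    /* Simple test-and-set spinlock; a stand-in for the real primitive. */
    while ( atomic_exchange_explicit(&l->locked, true, memory_order_acquire) )
        ; /* spin */
}

static void spin_unlock(spinlock_t *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}

struct schedule_data {
    spinlock_t schedule_lock;   /* per-pCPU scheduler lock */
};

static struct schedule_data schedule_data[NR_CPUS];

struct vcpu {
    unsigned int processor;     /* pCPU this vcpu is currently assigned to */
};

/*
 * Take the schedule lock of the pCPU the vcpu is assigned to, and return
 * the lock that was actually taken, so the caller cannot unlock a
 * different one.
 */
static spinlock_t *vcpu_schedule_lock(struct vcpu *v)
{
    for ( ; ; )
    {
        unsigned int cpu = v->processor;
        spinlock_t *lock = &schedule_data[cpu].schedule_lock;

        spin_lock(lock);

        /*
         * If v->processor changed while we were spinning, we now hold the
         * wrong pCPU's lock: drop it and retry.  Without this re-check, a
         * second CPU following the new v->processor could "successfully"
         * lock a different schedule lock for the same vcpu at the same time.
         */
        if ( cpu == v->processor )
            return lock;

        spin_unlock(lock);
    }
}

static void vcpu_schedule_unlock(spinlock_t *lock)
{
    /* Unlock exactly what was locked, regardless of v->processor now. */
    spin_unlock(lock);
}

A caller such as vcpu_wake() would then pair the operations as
lock = vcpu_schedule_lock(v); ... vcpu_schedule_unlock(lock);, so the
unlock always matches the lock that was taken even if v->processor has
since changed.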