
Re: [Xen-devel] [PATCH v7 2/5] xen/rcu: don't use stop_machine_run() for rcu_barrier()



On 26.03.2020 08:24, Jürgen Groß wrote:
> On 26.03.20 07:58, Jan Beulich wrote:
>> On 25.03.2020 17:13, Julien Grall wrote:
>>> On 25/03/2020 10:55, Juergen Gross wrote:
>>>> @@ -143,51 +143,90 @@ static int qhimark = 10000;
>>>>    static int qlowmark = 100;
>>>>    static int rsinterval = 1000;
>>>>    -struct rcu_barrier_data {
>>>> -    struct rcu_head head;
>>>> -    atomic_t *cpu_count;
>>>> -};
>>>> +/*
>>>> + * rcu_barrier() handling:
>>>> + * Two counters are used to synchronize rcu_barrier() work:
>>>> + * - cpu_count holds the number of cpus required to finish barrier handling.
>>>> + *   It is decremented by each cpu when it has performed all pending rcu calls.
>>>> + * - pending_count shows whether any rcu_barrier() activity is running and
>>>> + *   it is used to synchronize leaving rcu_barrier() only after all cpus
>>>> + *   have finished their processing. pending_count is initialized to nr_cpus + 1
>>>> + *   and it is decremented by each cpu when it has seen that cpu_count has
>>>> + *   reached 0. The cpu where rcu_barrier() has been called will wait until
>>>> + *   pending_count has been decremented to 1 (so all cpus have seen cpu_count
>>>> + *   reaching 0) and will then set pending_count to 0 indicating there is no
>>>> + *   rcu_barrier() running.
>>>> + * Cpus are synchronized via softirq mechanism. rcu_barrier() is regarded to
>>>> + * be active if pending_count is not zero. In case rcu_barrier() is called on
>>>> + * multiple cpus it is enough to check for pending_count being not zero on entry
>>>> + * and to call process_pending_softirqs() in a loop until pending_count drops to
>>>> + * zero, before starting the new rcu_barrier() processing.
>>>> + */
>>>> +static atomic_t cpu_count = ATOMIC_INIT(0);
>>>> +static atomic_t pending_count = ATOMIC_INIT(0);
>>>>      static void rcu_barrier_callback(struct rcu_head *head)
>>>>    {
>>>> -    struct rcu_barrier_data *data = container_of(
>>>> -        head, struct rcu_barrier_data, head);
>>>> -    atomic_inc(data->cpu_count);
>>>> +    smp_mb__before_atomic();     /* Make all writes visible to other cpus. */
>>>
>>> smp_mb__before_atomic() will order both reads and writes. However, the
>>> comment suggests only the writes are required to be ordered.
>>>
>>> So either the barrier is too strong or the comment is incorrect. Can
>>> you clarify it?
>>
>> Neither is the case, I guess: There simply is no smp_wmb__before_atomic()
>> in Linux, and if we want to follow their model we shouldn't have one
>> either. I'd rather take the comment to indicate that if one appeared, it
>> could be used here.
> 
> Right. Currently we have the choice of either using
> smp_mb__before_atomic() which is too strong for Arm, or smp_wmb() which
> is too strong for x86.

For x86 smp_wmb() is actually only very slightly too strong - it expands
to just barrier(), after all. So overall perhaps that's the better
choice here (with a suitable comment)?

Jan



 

