[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v6 1/4] xen/rcu: don't use stop_machine_run() for rcu_barrier()
On 16/03/2020 16:01, Jürgen Groß wrote: > On 16.03.20 16:24, Igor Druzhinin wrote: >> On 13/03/2020 13:06, Juergen Gross wrote: >>> Today rcu_barrier() is calling stop_machine_run() to synchronize all >>> physical cpus in order to ensure all pending rcu calls have finished >>> when returning. >>> >>> As stop_machine_run() is using tasklets this requires scheduling of >>> idle vcpus on all cpus imposing the need to call rcu_barrier() on idle >>> cpus only in case of core scheduling being active, as otherwise a >>> scheduling deadlock would occur. >>> >>> There is no need at all to do the syncing of the cpus in tasklets, as >>> rcu activity is started in __do_softirq() called whenever softirq >>> activity is allowed. So rcu_barrier() can easily be modified to use >>> softirq for synchronization of the cpus no longer requiring any >>> scheduling activity. >>> >>> As there already is a rcu softirq reuse that for the synchronization. >>> >>> Remove the barrier element from struct rcu_data as it isn't used. >>> >>> Finally switch rcu_barrier() to return void as it now can never fail. >>> >>> Partially-based-on-patch-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> >>> Signed-off-by: Juergen Gross <jgross@xxxxxxxx> >>> --- >>> V2: >>> - add recursion detection >>> >>> V3: >>> - fix races (Igor Druzhinin) >>> >>> V5: >>> - rename done_count to pending_count (Jan Beulich) >>> - fix race (Jan Beulich) >>> >>> V6: >>> - add barrier (Julien Grall) >>> - add ASSERT() (Julien Grall) >>> - hold cpu_map lock until end of rcu_barrier() (Julien Grall) >>> --- >>> xen/common/rcupdate.c | 95 >>> +++++++++++++++++++++++++++++++++------------- >>> xen/include/xen/rcupdate.h | 2 +- >>> 2 files changed, 69 insertions(+), 28 deletions(-) >>> >>> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c >>> index 03d84764d2..ed9083d2b2 100644 >>> --- a/xen/common/rcupdate.c >>> +++ b/xen/common/rcupdate.c >>> @@ -83,7 +83,6 @@ struct rcu_data { >>> struct rcu_head **donetail; >>> long blimit; /* Upper limit on a >>> processed batch */ >>> int cpu; >>> - struct rcu_head barrier; >>> long last_rs_qlen; /* qlen during the last >>> resched */ >>> /* 3) idle CPUs handling */ >>> @@ -91,6 +90,7 @@ struct rcu_data { >>> bool idle_timer_active; >>> bool process_callbacks; >>> + bool barrier_active; >>> }; >>> /* >>> @@ -143,51 +143,85 @@ static int qhimark = 10000; >>> static int qlowmark = 100; >>> static int rsinterval = 1000; >>> -struct rcu_barrier_data { >>> - struct rcu_head head; >>> - atomic_t *cpu_count; >>> -}; >>> +/* >>> + * rcu_barrier() handling: >>> + * cpu_count holds the number of cpus required to finish barrier >>> handling. >>> + * pending_count is initialized to nr_cpus + 1. >>> + * Cpus are synchronized via softirq mechanism. rcu_barrier() is >>> regarded to >>> + * be active if pending_count is not zero. In case rcu_barrier() is >>> called on >>> + * multiple cpus it is enough to check for pending_count being not >>> zero on entry >>> + * and to call process_pending_softirqs() in a loop until >>> pending_count drops to >>> + * zero, before starting the new rcu_barrier() processing. >>> + * In order to avoid hangs when rcu_barrier() is called multiple >>> times on the >>> + * same cpu in fast sequence and a slave cpu couldn't drop out of the >>> + * barrier handling fast enough a second counter pending_count is >>> needed. >>> + * The rcu_barrier() invoking cpu will wait until pending_count >>> reaches 1 >>> + * (meaning that all cpus have finished processing the barrier) and >>> then will >>> + * reset pending_count to 0 to enable entering rcu_barrier() again. >>> + */ >>> +static atomic_t cpu_count = ATOMIC_INIT(0); >>> +static atomic_t pending_count = ATOMIC_INIT(0); >>> static void rcu_barrier_callback(struct rcu_head *head) >>> { >>> - struct rcu_barrier_data *data = container_of( >>> - head, struct rcu_barrier_data, head); >>> - atomic_inc(data->cpu_count); >>> + smp_wmb(); /* Make all previous writes visible to other >>> cpus. */ >>> + atomic_dec(&cpu_count); >>> } >>> -static int rcu_barrier_action(void *_cpu_count) >>> +static void rcu_barrier_action(void) >>> { >>> - struct rcu_barrier_data data = { .cpu_count = _cpu_count }; >>> - >>> - ASSERT(!local_irq_is_enabled()); >>> - local_irq_enable(); >>> + struct rcu_head head; >>> /* >>> * When callback is executed, all previously-queued RCU work >>> on this CPU >>> - * is completed. When all CPUs have executed their callback, >>> data.cpu_count >>> - * will have been incremented to include every online CPU. >>> + * is completed. When all CPUs have executed their callback, >>> cpu_count >>> + * will have been decremented to 0. >>> */ >>> - call_rcu(&data.head, rcu_barrier_callback); >>> + call_rcu(&head, rcu_barrier_callback); >>> - while ( atomic_read(data.cpu_count) != num_online_cpus() ) >>> + while ( atomic_read(&cpu_count) ) >>> { >>> process_pending_softirqs(); >>> cpu_relax(); >>> } >>> - local_irq_disable(); >>> - >>> - return 0; >>> + atomic_dec(&pending_count); >>> } >>> -/* >>> - * As rcu_barrier() is using stop_machine_run() it is allowed to be >>> used in >>> - * idle context only (see comment for stop_machine_run()). >>> - */ >>> -int rcu_barrier(void) >>> +void rcu_barrier(void) >>> { >>> - atomic_t cpu_count = ATOMIC_INIT(0); >>> - return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS); >>> + unsigned int n_cpus; >>> + >>> + ASSERT(!in_irq() && local_irq_is_enabled()); >>> + >>> + for ( ;; ) >>> + { >>> + if ( !atomic_read(&pending_count) && get_cpu_maps() ) >>> + { >> >> If the whole action is happening while cpu_maps are taken why do you >> need to check pending_count first? I think the logic of this loop >> could be simplified if taken this into account. > > get_cpu_maps() can be successful on multiple cpus (its a read_lock()). > Testing pending_count avoids hammering on the cache lines. I see - the logic was changed recently. I'm currently testing this version of the patch. Igor _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |