
Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling

To: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
From: Jürgen Groß <jgross@xxxxxxxx>
Date: Mon, 2 Mar 2020 15:32:08 +0100
Cc: Kevin Tian <kevin.tian@xxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Wei Liu <wl@xxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
Delivery-date: Mon, 02 Mar 2020 14:32:13 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 02.03.20 15:23, Igor Druzhinin wrote:

On 02/03/2020 14:03, Jürgen Groß wrote:

On 02.03.20 14:25, Igor Druzhinin wrote:

On 28/02/2020 07:10, Jürgen Groß wrote:


I think you are just narrowing the window of the race:

It is still possible to have two cpus entering rcu_barrier() and to
make it into the if ( !initial ) clause.

Rather than introducing another atomic, I believe the following patch,
used in place of yours, should do it:

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index e6add0b120..0d5469a326 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -180,23 +180,17 @@ static void rcu_barrier_action(void)

 void rcu_barrier(void)
 {
-    int initial = atomic_read(&cpu_count);
-
     while ( !get_cpu_maps() )
     {
         process_pending_softirqs();
-        if ( initial && !atomic_read(&cpu_count) )
+        if ( !atomic_read(&cpu_count) )
             return;

         cpu_relax();
-        initial = atomic_read(&cpu_count);
     }

-    if ( !initial )
-    {
-        atomic_set(&cpu_count, num_online_cpus());
+    if ( atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) == 0 )
         cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
-    }

     while ( atomic_read(&cpu_count) )
     {

Could you give that a try, please?
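
For readers following the thread, the idea behind the atomic_cmpxchg() change can be shown in isolation. The following is only a minimal sketch using C11 atomics rather than Xen's atomic_t API; try_start_barrier() and participant_done() are hypothetical helpers, and cpu_count here is a stand-in for the variable in the patch. It shows that when several callers race to start a barrier, exactly one of them moves the counter away from 0 and would therefore be the one to raise the softirq.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for cpu_count: 0 means no barrier is in flight. */
static atomic_int cpu_count;

/*
 * Try to become the initiator of a barrier for n_cpus participants.
 * Only the caller whose compare-and-swap moves the counter from 0 to
 * n_cpus gets true; every other concurrent caller sees a non-zero
 * value and gets false, so it must not signal the participants again.
 */
static bool try_start_barrier(int n_cpus)
{
    int expected = 0;

    assert(n_cpus > 0);
    return atomic_compare_exchange_strong(&cpu_count, &expected, n_cpus);
}

/* Called once by each participant when its share of the work is done. */
static void participant_done(void)
{
    atomic_fetch_sub(&cpu_count, 1);
}
```

A second call to try_start_barrier() fails until every participant has called participant_done(). Note that this covers only the "two initiators" case and says nothing about CPUs that are still polling the counter from a previous round.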


With this patch I cannot disable SMT at all.

The problem that my diff solved was a race between 2 consecutive
rcu_barrier operations on CPU0 (the pattern specific to the SMT on/off
operation): some CPUs have not yet left the cpu_count checking loop
when cpu_count is reinitialized on CPU0, so those CPUs end up stuck
in the loop.


Ah, okay, then I believe a combination of the two patches is needed.

Something like the attached version?


I apologise: my previous test result was from a machine booted in core mode.
I'm now testing it properly and the original patch seems to do the trick, but
I still don't understand how you can avoid the race with only 1 counter:
it's always possible that CPU1 is still in the cpu_count checking loop (even
if cpu_count is currently 0) when cpu_count is reinitialized.
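
One way to close the window described above is to track separately which CPUs are still inside the barrier. The sketch below is an illustration under stated assumptions, not the attached patch: it uses C11 atomics instead of Xen's atomic_t, the name pending_count and all helper functions are hypothetical, and the spin loops are split into single polling steps so the sequence can be followed without threads. The initiator can only claim a new barrier once pending_count is back at 0, which happens only after every participant has left its checking loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Both counters are hypothetical names for this sketch. */
static atomic_int cpu_count;     /* participants that have not arrived yet */
static atomic_int pending_count; /* participants still inside, +1 for the initiator */

/* Initiator: claim the barrier; fails while any CPU is still inside. */
static bool barrier_begin(int n_cpus)
{
    int expected = 0;

    assert(n_cpus > 0);
    if ( !atomic_compare_exchange_strong(&pending_count, &expected,
                                         n_cpus + 1) )
        return false;
    atomic_store(&cpu_count, n_cpus);
    return true;
}

/* Participant: announce arrival at the barrier. */
static void participant_arrive(void)
{
    atomic_fetch_sub(&cpu_count, 1);
}

/* Participant: the condition polled in the checking loop. */
static bool all_arrived(void)
{
    return atomic_load(&cpu_count) == 0;
}

/* Participant: last action after leaving the checking loop. */
static void participant_leave(void)
{
    atomic_fetch_sub(&pending_count, 1);
}

/* Initiator: succeeds only once every participant has left the loop. */
static bool barrier_try_end(void)
{
    if ( atomic_load(&pending_count) != 1 )
        return false;
    atomic_store(&pending_count, 0);
    return true;
}
```

With two participants, a second barrier_begin() keeps failing even after cpu_count has dropped to 0, as long as one participant has not yet called participant_leave(). That is exactly the situation in which a single counter could be reinitialized too early.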


I guess this is very, very unlikely.

I'm looking at your current version now. Was the removal of get_cpu_maps()
and recursion protection intentional? I suspect it would only work on the
latest master so I need to keep those for 4.13 testing.


Oh, sorry, this seems to be an old version.

Here comes the correct one.


Juergen

Attachment: v3-0002-xen-rcu-don-t-use-stop_machine_run-for-rcu_barrie.patch
Description: Text Data
