
Re: [Xen-devel] [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c



On Wed, 15 Nov 2017, Boris Ostrovsky wrote:
> On 11/15/2017 02:09 PM, Stefano Stabellini wrote:
> > On Wed, 15 Nov 2017, Juergen Gross wrote:
> >>>>>         while(mutex_is_locked(&map->active.in_mutex) ||
> >>>>>               mutex_is_locked(&map->active.out_mutex))
> >>>>>                 cpu_relax();
> >>>>>
> >>>>> ?
> >>>> I'm not convinced there isn't a race.
> >>>>
> >>>> In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
> >>>> then in_mutex is taken. What happens if pvcalls_front_release() resets
> >>>> sk_send_head and manages to test the mutex before the mutex is locked?
> >>>>
> >>>> Even if this turns out to be impossible: the whole construct seems
> >>>> to be rather fragile.
> > I agree it looks fragile, and I agree that it might be best to avoid the
> > usage of in_mutex and out_mutex as refcounts. More comments on this
> > below.
> >
> >  
> >>> I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
> >>> not rely on mutex state.
> >> Yes, this would work.
> > Yes, I agree it would work and for the sake of getting something in
> > shape for the merge window I am attaching a patch for it. Please go
> > ahead with it. Let me know if you need anything else immediately, and
> > I'll work on it ASAP.
> >
> >
> >
> > However, I should note that this is a pretty big hammer we are using:
> > the refcount is global, while we only need to wait until it's only us
> > _on this specific socket_.
> 
> Can you explain why socket is important?

Yes, of course: there are going to be many open sockets on a given
pvcalls connection. pvcalls_refcount is global, so waiting on
pvcalls_refcount means waiting until operations on all the other,
unrelated sockets have stopped as well, whereas we only need to wait
until the operations on the one socket we want to close have stopped.
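
For reference, a per-socket wait could look roughly like the sketch
below. This is only an illustration of the idea, not the attached
patch: the sock_refcount field and the do_the_actual_io() placeholder
are hypothetical names.

    /* hypothetical counter, e.g. a new field in the "active" part of
     * struct sock_mapping */
    atomic_t sock_refcount;

    /* sendmsg/recvmsg would bracket their work with it */
    atomic_inc(&map->active.sock_refcount);
    ret = do_the_actual_io(map, msg, len);  /* placeholder for the real work */
    atomic_dec(&map->active.sock_refcount);

    /* and release would only wait for this one socket to become idle */
    while (atomic_read(&map->active.sock_refcount) > 0)
            cpu_relax();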


> >
> > We really need a per socket refcount. If we don't want to use the mutex
> > internal counters, then we need another one.
> >
> > See the appended patch that introduces a per socket refcount. However,
> > for the merge window, also using pvcalls_refcount is fine.
> >
> > The race Juergen is concerned about is only theoretically possible:
> >
> > recvmsg:                                 release:
> >   
> >   test sk_send_head                      clear sk_send_head
> >   <only few instructions>                <prepare a message>
> >   grab in_mutex                          <send a message to the server>
> >                                          <wait for an answer>
> >                                          test in_mutex
> >
> > Without kernel preemption it is not possible for release to clear
> > sk_send_head and test in_mutex after recvmsg tests sk_send_head and
> > before recvmsg grabs in_mutex.
> 
> Sorry, I don't follow --- what does preemption have to do with this? If
> recvmsg and release happen on different processors the order of
> operations can be
> 
> CPU0                                   CPU1
> 
> test sk_send_head
> <interrupt>
>                                         clear sk_send_head
>                                         <send a message to the server>
>                                         <wait for an answer>
>                                         test in_mutex
>                                         free everything
> grab in_mutex
> 
> I actually think RCU should take care of all of this.

Preemption could cause something very similar to happen, but your
example is even better, because it can trigger the issue with
preemption disabled as well. I'll think more about this and submit a
separate patch on top of the simple pvcalls_refcount patch below.
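
For the record, an RCU-based scheme would presumably look something
like the sketch below. This is purely illustrative and not part of the
attached patch: sk_send_head is not __rcu-annotated today, and because
the recv path can sleep it would really need SRCU rather than plain
RCU, so take it only as the shape of the idea.

    /* reader side, e.g. pvcalls_front_recvmsg() -- illustrative only */
    rcu_read_lock();
    map = (struct sock_mapping *)rcu_dereference(sock->sk->sk_send_head);
    if (!map) {
            rcu_read_unlock();
            return -ENOTCONN;
    }
    /* ... perform the receive inside the read-side critical section ... */
    rcu_read_unlock();

    /* writer side, pvcalls_front_release() -- illustrative only */
    rcu_assign_pointer(sock->sk->sk_send_head, NULL);
    synchronize_rcu();    /* wait for in-flight sendmsg/recvmsg to finish */
    pvcalls_front_free_map(bedata, map);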



> But for now I will take your refcount-based patch. However, it also
> needs a comment update.
> 
> How about
> 
> /*
>  * We need to make sure that sendmsg/recvmsg on this socket has not
>  * started before we've cleared sk_send_head here. The easiest (though
>  * not optimal) way to guarantee this is to see that no pvcall (other
>  * than us) is in progress.
>  */

Yes, this is the patch:

---


xen/pvcalls: fix potential endless loop in pvcalls-front.c

mutex_trylock() returns 1 if it takes the lock and 0 if it does not.
Assume we take in_mutex on the first try, but we cannot take out_mutex.
On every following iteration mutex_trylock() on in_mutex fails, because
we already hold it, so the loop never terminates.

Solve the problem by waiting until the global refcount is 1 instead (the
refcount is 1 when the only active pvcalls frontend function is
pvcalls_front_release).

Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
CC: boris.ostrovsky@xxxxxxxxxx
CC: jgross@xxxxxxxx


diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 0c1ec68..54c0fda 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -1043,13 +1043,12 @@ int pvcalls_front_release(struct socket *sock)
                wake_up_interruptible(&map->active.inflight_conn_req);
 
                /*
-                * Wait until there are no more waiters on the mutexes.
-                * We know that no new waiters can be added because sk_send_head
-                * is set to NULL -- we only need to wait for the existing
-                * waiters to return.
-                */
-               while (!mutex_trylock(&map->active.in_mutex) ||
-                          !mutex_trylock(&map->active.out_mutex))
+                * We need to make sure that sendmsg/recvmsg on this socket have
+                * not started before we've cleared sk_send_head here. The
+                * easiest (though not optimal) way to guarantee this is to see
+                * that no pvcall (other than us) is in progress.
+                */
+               while (atomic_read(&pvcalls_refcount) > 1)
                        cpu_relax();
 
                pvcalls_front_free_map(bedata, map);
