[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC PATCH] dpci: Put the dpci back on the list if running on another CPU.



On Mon, Feb 02, 2015 at 03:48:19PM +0000, Jan Beulich wrote:
> >>> On 02.02.15 at 16:31, <konrad.wilk@xxxxxxxxxx> wrote:
> > On Mon, Feb 02, 2015 at 03:19:33PM +0000, Jan Beulich wrote:
> >> >>> On 02.02.15 at 15:29, <konrad.wilk@xxxxxxxxxx> wrote:
> >> > --- a/xen/drivers/passthrough/io.c
> >> > +++ b/xen/drivers/passthrough/io.c
> >> > @@ -63,10 +63,37 @@ enum {
> >> >  static void raise_softirq_for(struct hvm_pirq_dpci *pirq_dpci)
> >> >  {
> >> >      unsigned long flags;
> >> > +    unsigned long old, new, val = pirq_dpci->state;
> >> >  
> >> > -    if ( test_and_set_bit(STATE_SCHED, &pirq_dpci->state) )
> >> > -        return;
> >> > +    /*
> >> > +     * This cmpxch is a more clear version of:
> >> > +     * if ( test_and_set_bit(STATE_SCHED, &pirq_dpci->state) )
> >> > +     *         return;
> >> > +     * since it also checks for STATE_RUN conditions.
> >> > +     */
> >> > +    for ( ;; )
> >> > +    {
> >> > +        new = 1 << STATE_SCHED;
> >> > +        if ( val )
> >> > +            new |= val;
> >> 
> >> Why the if()?
> > 
> > To 'or' the variable new with '1 << STATE_RUN' in case 'val' changed from
> > the first read ('val = pirq_dpci->state') to the moment when
> > we do the cmpxchg.
> 
> But if "val" is zero, the | simply will do nothing.

Correct. Keep in mind that 'new' is set to '1 << STATE_SCHED' at every
loop iteration - so it ends up old = cmpxchg(.., 0, 1 << STATE_SCHED)
(and old == 0, val == 0, so we end up breaking out of the loop).


> 
> >> > +        old = cmpxchg(&pirq_dpci->state, val, new);
> >> > +        switch ( old )
> >> > +        {
> >> > +        case (1 << STATE_SCHED):
> >> > +        case (1 << STATE_RUN) | (1 << STATE_SCHED):
> >> > +            /*
> >> > +             * Whenever STATE_SCHED is set we MUST not schedule it.
> >> > +             */
> >> > +            return;
> >> > +        }
> >> > +        /*
> >> > +         * If the 'state' is 0 or (1 << STATE_RUN) we can schedule it.
> >> > +         */
> >> 
> >> Really? Wasn't the intention to _not_ schedule when STATE_RUN is
> >> set? (Also the above is a only line comment, i.e. wants different style.)
> > 
> > I must confess I must have misread your last review. You said:
> > 
> >     > Here is a patch that does this. I don't yet have an setup to test
> >     > the failing scenario (working on that). I removed the part in
> >     > the softirq because with this patch I cannot see a way it would
> >     > ever get there (in the softirq hitting the BUG).
> > 
> >     Hmm, but how do you now schedule the second instance that needed ...
> > 
> >     > +    case (1 << STATE_RUN):
> > 
> >     ... in this case?
> > 
> > which to me implied you still want to schedule an 'dpci' when STATE_RUN is 
> > set?
> > 
> > My thinking is that we should still schedule an 'dpci' when STATE_RUN is set
> > as that is inline with what the old tasklet code did. It would
> > schedule the tasklet the moment said tasklet was off the global list (but
> > it would not add it in the global list - that would be the job of the
> > tasklet function wrapper to detect and insert said tasklet back on
> > the global list).
> 
> Didn't the original discussion (and issue) revolve around scheduling
> while STATE_RUN was set? Hence the intention to wait for the flag

Yes.
> to clear - but preferably in an explicit rather than implicit (as your
> current and previous patch do) manner.

If we do explicitly we run risk of dead-lock. See below of an draft
(not even compiled tested) of what I think you mean.

The issue is that 'raise_softirq_for' ends up being called from do_IRQ.
And we might have an IRQ coming in just as we are in the dpci_softirq
having set the 1 << STATE_SCHED. Our spin-and-wait in raise_softirq_for
(in this code below) will spin forever.

One way to not get in that quagmire is to well, do what the previous
patches did.

diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index ae050df..706a636 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -63,10 +63,35 @@ enum {
 static void raise_softirq_for(struct hvm_pirq_dpci *pirq_dpci)
 {
     unsigned long flags;
+    unsigned long old, new, val = pirq_dpci->state;
 
-    if ( test_and_set_bit(STATE_SCHED, &pirq_dpci->state) )
-        return;
-
+    for ( ;; )
+    {
+        old = cmpxchg(&pirq_dpci->state, 0, 1 << STATE_SCHED);
+        switch ( old )
+        {
+        case (1 << STATE_SCHED):
+            /*
+             * Whenever STATE_SCHED is set we MUST not schedule it.
+             */
+            return;
+        case (1 << STATE_RUN) | (1 << STATE_SCHED):
+        case (1 << STATE_RUN):
+            /* Getting close to finish. Spin. */
+            continue;
+        }
+        /*
+         * If the 'state' is 0 we can schedule it.
+         */
+        if ( old == 0 )
+            break;
+    }
     get_knownalive_domain(pirq_dpci->dom);
 
     local_irq_save(flags);
> 
> Jan
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.