Re: [Xen-devel] [RFC PATCH] dpci: Put the dpci back on the list if running on another CPU.
On Mon, Jan 12, 2015 at 11:45:30AM -0500, Konrad Rzeszutek Wilk wrote:
> There is a race when we clear STATE_SCHED in the softirq,
> which allows 'raise_softirq_for' to progress and
> schedule another dpci. During that time another CPU could
> receive an interrupt, call 'raise_softirq_for', and put
> the dpci on its per-cpu list. There would then be two 'dpci_softirq'
> running at the same time (on different CPUs) where the
> dpci state is STATE_RUN (and STATE_SCHED is cleared). This
> ends up hitting:
>
> if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
> BUG()
>
> Instead of that put the dpci back on the per-cpu list to deal
> with later.
>
> We can get hit by this when an interrupt's
> affinity is set over multiple CPUs.
>
> Another potential fix would be to add a guard in the raise_softirq_for
> to check for 'STATE_RUN' bit being set and not schedule the
> dpci until that bit has been cleared.
>
> Reported-by: Sander Eikelenboom <linux@xxxxxxxxxxxxxx>
> Reported-by: Malcolm Crossley <malcolm.crossley@xxxxxxxxxx>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> ---
> xen/drivers/passthrough/io.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
> index ae050df..9b77334 100644
> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -804,7 +804,17 @@ static void dpci_softirq(void)
> d = pirq_dpci->dom;
> smp_mb(); /* 'd' MUST be saved before we set/clear the bits. */
> if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
> - BUG();
> + {
> + unsigned long flags;
> +
> + /* Put back on the list and retry. */
> + local_irq_save(flags);
> + list_add_tail(&pirq_dpci->softirq_list, &this_cpu(dpci_list));

I chatted with Jan on IRC about this, and one worry is that if we add
to our per-cpu list a 'dpci' that is running on another CPU, and the
other CPU then runs 'list_del', it will go BOOM.

However, I am not sure I can come up with a scenario in which this
would be triggered, as by the time we get to checking the STATE_RUN
condition the list removal has already been done. So adding it to the
per-cpu list is OK, and since STATE_SCHED is set, it guards against
double-list addition.

CPU1                          CPU2                          CPU3
softirq_dpci:
 list_del() from per-cpu-list
 test-and-set(STATE_RUN)
 test-and-clear(STATE_SCHED)
 ..
                              raise_softirq
                               test-and-set(STATE_SCHED)
                              softirq_dpci
                               list_del from per-cpu-list
                               if-test-and-set(STATE_RUN)
                                fails:
                                 list_add_tail(..)
                                 continue
                                                            raise_softirq
                                                             test-and-set(STATE_SCHED)
                                                             [fails, so no adding]
                              ..
                              softirq_dpci
                               list_del
                               if-test-and-set(STATE_RUN)
                                fails:
                                 list_add_tail(..)
                                 [OK, we did the list_del]
                                 continue..
 clear_bit(STATE_RUN)
                              softirq_dpci
                               list_del()
                               test-and-set(STATE_RUN)
                               test-and-clear(STATE_SCHED)

> + local_irq_restore(flags);
> +
> + raise_softirq(HVM_DPCI_SOFTIRQ);
> + continue;
> + }
> /*
> * The one who clears STATE_SCHED MUST refcount the domain.
> */
> --
> 2.1.0
>
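
To make the sequence above easier to check by hand, here is a rough,
single-threaded model of the two state bits in plain C11. It is only a
sketch and not the Xen code: the struct, the 'model_*' functions, the
bit values and the 'queued_on' field (standing in for per-cpu list
membership) are all made up for illustration, and the bit helpers are
local re-implementations on top of <stdatomic.h>. The real softirq
also walks a list and refcounts the domain, which is left out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers are assumptions; only the names come from the thread. */
enum { STATE_SCHED = 0, STATE_RUN = 1 };

/* Hypothetical stand-in for the dpci state plus its list membership. */
struct dpci_model {
    atomic_uint state;
    int queued_on;          /* CPU whose list holds it, -1 if on no list */
};

enum softirq_result { SOFTIRQ_NOTHING, SOFTIRQ_RAN, SOFTIRQ_REQUEUED };

/* Local helpers: return the previous value of the bit. */
static bool test_and_set_bit(int nr, atomic_uint *word)
{
    unsigned int mask = 1u << nr;
    return (atomic_fetch_or(word, mask) & mask) != 0;
}

static bool test_and_clear_bit(int nr, atomic_uint *word)
{
    unsigned int mask = 1u << nr;
    return (atomic_fetch_and(word, ~mask) & mask) != 0;
}

static void model_init(struct dpci_model *d)
{
    atomic_init(&d->state, 0u);
    d->queued_on = -1;
}

/* Interrupt path: queue on 'cpu' only if STATE_SCHED was not set. */
static bool model_raise_softirq_for(struct dpci_model *d, int cpu)
{
    if (test_and_set_bit(STATE_SCHED, &d->state))
        return false;       /* already scheduled, so no adding */
    d->queued_on = cpu;
    return true;
}

/* Softirq path on 'cpu', up to the point where the handler would run. */
static enum softirq_result model_softirq_begin(struct dpci_model *d, int cpu)
{
    if (d->queued_on != cpu)
        return SOFTIRQ_NOTHING;
    d->queued_on = -1;                          /* list_del() */
    if (test_and_set_bit(STATE_RUN, &d->state)) {
        d->queued_on = cpu;                     /* list_add_tail(), retry */
        return SOFTIRQ_REQUEUED;
    }
    test_and_clear_bit(STATE_SCHED, &d->state); /* allow rescheduling */
    return SOFTIRQ_RAN;
}

/* The running CPU finishes and drops STATE_RUN. */
static void model_softirq_end(struct dpci_model *d)
{
    atomic_fetch_and(&d->state, ~(1u << STATE_RUN));
}
```

Walking the model through the CPU1/CPU2/CPU3 steps, the entry is only
ever on one list at a time: the requeue happens after our own removal,
and STATE_SCHED stays set until some CPU actually wins STATE_RUN.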
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel