Re: [Xen-devel] [PATCH v4 2/2] arm: proper ordering for correct execution of gic_update_one_lr and vgic_store_itargetsr
On Thu, 16 Feb 2017, Julien Grall wrote:
> On 16/02/2017 22:10, Stefano Stabellini wrote:
> > On Thu, 16 Feb 2017, Julien Grall wrote:
> > > Hi Stefano,
> > > 
> > > On 11/02/17 02:05, Stefano Stabellini wrote:
> > > > Concurrent execution of gic_update_one_lr and vgic_store_itargetsr can
> > > > result in the wrong pcpu being set as irq target, see
> > > > http://marc.info/?l=xen-devel&m=148218667104072.
> > > > 
> > > > To solve the issue, add barriers, remove an irq from the inflight
> > > > queue, only after the affinity has been set. On the other end, write the
> > > > new vcpu target, before checking GIC_IRQ_GUEST_MIGRATING and inflight.
> > > > 
> > > > Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > > > ---
> > > >  xen/arch/arm/gic.c     | 3 ++-
> > > >  xen/arch/arm/vgic-v2.c | 4 ++--
> > > >  xen/arch/arm/vgic-v3.c | 4 +++-
> > > >  3 files changed, 7 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> > > > index a5348f2..bb52959 100644
> > > > --- a/xen/arch/arm/gic.c
> > > > +++ b/xen/arch/arm/gic.c
> > > > @@ -503,12 +503,13 @@ static void gic_update_one_lr(struct vcpu *v, int i)
> > > >               !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> > > >              gic_raise_guest_irq(v, irq, p->priority);
> > > >          else {
> > > > -            list_del_init(&p->inflight);
> > > >              if ( test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> > > >              {
> > > >                  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
> > > >                  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
> > > >              }
> > > > +            smp_mb();
> > > > +            list_del_init(&p->inflight);
> > > 
> > > I don't understand why you remove from the inflight list afterwards. If you do
> > > that you introduce that same problem as discussed in
> > > <7a78c859-fa6f-ba10-b574-d8edd46ea934@xxxxxxx>
> > > 
> > > As long as the interrupt is routed to the pCPU running gic_update_one_lr, the
> > > interrupt cannot fire because the interrupts are masked.
> > 
> > This is not accurate: it is possible to receive a second interrupt
> > notification while the first one is still active.
> 
> Can you detail here? Because if you look at how gic_update_one_lr is called
> from gic_clear_lrs, interrupts are masked.
> 
> So it cannot be received by Xen while you are in gic_update_one_lr and before
> irq_set_affinity is called.

Yes, you are right, I meant generally. In this case, it is as you say.

> > > However, as soon as irq_set_affinity is called the interrupt may fire
> > > on the other pCPU.
> > 
> > This is true.
> > 
> > 
> > > However, list_del_init is not atomic and not protected by any lock. So
> > > vgic_vcpu_inject_irq may see a corrupted version of {p,n}->inflight.
> > > 
> > > Did I miss anything?
> > 
> > Moving list_del_init later ensures that there are no conflicts between
> > gic_update_one_lr and vgic_store_itargetsr (more specifically,
> > vgic_migrate_irq). If you look at the implementation of
> > vgic_migrate_irq, all checks depend on list_empty(&p->inflight). If we
> > don't move list_del_init later, given that vgic_migrate_irq can be
> > called with a different vgic lock taken than gic_update_one_lr, the
> > following scenario can happen:
> > 
> > 
> > <GIC_IRQ_GUEST_MIGRATING is set>          <GIC_IRQ_GUEST_MIGRATING is set>
> > CPU0: gic_update_one_lr                   CPU1: vgic_store_itargetsr
> > ----------------------------------------------------------------------
> > remove from inflight
> > clear GIC_IRQ_GUEST_MIGRATING
> > read rank->vcpu (intermediate)
> 
> It is only true if vgic_store_itargetsr is testing GIC_IRQ_GUEST_MIGRATING
> here and it was clear.

Right, that's the scenario, see the right column. If you meant to say
something else, I couldn't understand, sorry.

I have been playing with rearranging these few lines of code in
gic_update_one_lr and vgic_store_itargetsr/vgic_migrate_irq quite a bit,
but I couldn't figure out a way to solve all races. This patch is one of
the best outcomes I found. If you can figure out a way to rearrange this
code to not be racy, but still lockless, let me know!

> >                                           set rank->vcpu (final)
> >                                           vgic_migrate_irq
> >                                           if (!inflight) irq_set_affinity (final)
> > irq_set_affinity (intermediate)
> > 
> > 
> > As a result, the irq affinity is set to the wrong cpu. With this patch,
> > this problem doesn't occur.
> > 
> > However, you are right that both in the case of gic_update_one_lr and
> > vgic_migrate_irq, as well as the case of gic_update_one_lr and
> > vgic_vcpu_inject_irq that you mentioned, list_del_init (from
> > gic_update_one_lr) is potentially run at the same time as list_empty
> > (from vgic_migrate_irq or from vgic_vcpu_inject_irq), and they are not
> > atomic.
> > 
> > Also see this other potential issue:
> > http://marc.info/?l=xen-devel&m=148703220714075
> > 
> > All these concurrent accesses are difficult to understand and to deal
> > with. This is why my original suggestion was to use the old vcpu vgic
> > lock, rather than try to ensure safe concurrent accesses everywhere.
> > That option is still open and would solve both problems. We only need to:
> > 
> > - store the vcpu to which an irq is currently injected
> >   http://marc.info/?l=xen-devel&m=148237295020488
> > - check the new irq->vcpu field, and take the right vgic lock
> >   something like http://marc.info/?l=xen-devel&m=148237295920492&w=2, but
> >   would need improvements
> > 
> > Much simpler, right?
> 
> Wouldn't it be easier to just take the desc->lock to protect the concurrent
> access?

One more lock is almost never easier :) Things are going to get more
entangled. Also, given your accurate observation that vgic_vcpu_inject_irq
wouldn't be a problem if we place list_del_init(&p->inflight) before
irq_set_affinity, I think the problem is rather limited and could be
solved easily with the lock we have.
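For readers following the thread, a rough sketch of the lock-based alternative
Stefano describes above (record which vcpu an irq is currently injected to,
then take that vcpu's vgic lock before checking inflight) might look like the
following. This is not the posted patch and not the code in the linked
threads: the inject_vcpu field, the helper name, and the simplified handling
of the still-inflight case are illustrative assumptions only.

    /*
     * Sketch only, as it might sit in xen/arch/arm/vgic.c next to the
     * existing vgic_migrate_irq. Assumes a hypothetical p->inject_vcpu
     * field in struct pending_irq, recording the vcpu the irq was last
     * injected to; no such field exists in the tree.
     */
    static void vgic_migrate_irq_sketch(struct vcpu *old, struct vcpu *new,
                                        unsigned int irq)
    {
        unsigned long flags;
        struct pending_irq *p = irq_to_pending(old, irq);
        /* Hypothetical field: vcpu currently servicing this irq, if any. */
        struct vcpu *inject_v = p->inject_vcpu ? p->inject_vcpu : old;

        /* Nothing to do for purely virtual interrupts. */
        if ( p->desc == NULL )
            return;

        /*
         * Serialise against gic_update_one_lr running for inject_v by
         * taking that vcpu's vgic lock, instead of relying on the ordering
         * of list_del_init and irq_set_affinity plus barriers.
         */
        spin_lock_irqsave(&inject_v->arch.vgic.lock, flags);

        if ( list_empty(&p->inflight) )
        {
            /* Not inflight: safe to retarget the physical irq right away. */
            irq_set_affinity(p->desc, cpumask_of(new->processor));
        }
        else
        {
            /*
             * Still inflight: defer the affinity change until
             * gic_update_one_lr retires the LR, as the MIGRATING path
             * does today.
             */
            set_bit(GIC_IRQ_GUEST_MIGRATING, &p->status);
        }

        spin_unlock_irqrestore(&inject_v->arch.vgic.lock, flags);
    }

Whether something along these lines ends up simpler than the barrier-based
ordering in the patch above, or than the desc->lock suggestion, is exactly
the trade-off being debated in this thread.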