Re: [Xen-devel] [RFC PATCH 01/10] ARM: vGIC: remove rank lock from IRQ routing functions
On Mon, 8 May 2017, Julien Grall wrote:
> Hi Stefano,
>
> On 08/05/2017 22:53, Stefano Stabellini wrote:
> > On Mon, 8 May 2017, Julien Grall wrote:
> > > Hi Andre,
> > >
> > > On 08/05/17 10:15, Andre Przywara wrote:
> > > > On 04/05/17 16:53, Julien Grall wrote:
> > > > > Hi Andre,
> > > > >
> > > > > On 04/05/17 16:31, Andre Przywara wrote:
> > > > > > gic_route_irq_to_guest() and gic_remove_irq_from_guest() take
> > > > > > the rank lock, but never actually access the rank structure.
> > > > > > Avoid taking the lock in those two functions and remove some
> > > > > > more unneeded code on the way.
> > > > >
> > > > > The rank lock is here to protect p->desc when checking that the
> > > > > virtual interrupt was not yet routed to another physical
> > > > > interrupt.
> > > >
> > > > Really? To me that sounds quite surprising.
> > > > My understanding is that the VGIC VCPU lock protected the
> > > > pending_irq (and thus the desc pointer?) so far, and the desc
> > > > itself has its own lock.
> > > > According to the comment in the struct irq_rank declaration, the
> > > > lock protects the members of this struct only.
> > > >
> > > > Looking briefly at users of pending_irq->desc (for instance
> > > > gicv[23]_update_lr() or gic_update_one_lr()) I can't see any hint
> > > > that they care about the lock.
> > > >
> > > > So should that be fixed or at least documented?
> > >
> > > My understanding is that this rank lock is preventing a race between
> > > two updates of p->desc. This can happen if gic_route_irq_to_guest(...)
> > > is called concurrently with the same vIRQ but different pIRQs.
> > >
> > > If you drop this lock, nothing will protect against that anymore.
> >
> > The desc->lock in route_irq_to_guest is protecting against concurrent
> > changes to the same physical irq.
> >
> > The vgic_lock_rank in gic_route_irq_to_guest is protecting against
> > concurrent changes to the same virtual irq.
> >
> > In other words, the current code does:
> > 1) lock physical irq
> > 2) lock virtual irq
> > 3) establish mapping virtual irq - physical irq
> >
> > Andre, sorry for not seeing this earlier.
> >
> > > > > Without this locking, you can have two concurrent calls of
> > > > > gic_route_irq_to_guest that will update the same virtual
> > > > > interrupt but with different physical interrupts.
> > > > >
> > > > > You would have to replace the rank lock by the per-pending_irq
> > > > > lock to keep the code safe.
> > > >
> > > > That indeed sounds reasonable.
> > >
> > > As you mentioned IRL, the current code may lead to a deadlock due to
> > > locking order.
> > >
> > > Indeed, routing an IRQ (route_irq_to_guest) will take:
> > > 1) desc lock (in route_irq_to_guest)
> > > 2) rank lock (in gic_route_irq_to_guest)
> > >
> > > Whilst the MMIO emulation of ISENABLER_* will take:
> > > 1) rank lock
> > > 2) desc lock (in vgic_enable_irqs)
> >
> > Yes, you are right, but I don't think that is an issue because
> > route_irq_to_guest is expected to be called before the domain is
> > started, which is consistent with what you wrote below that routing
> > should be done *before* running the VM.
>
> I didn't consider it a big issue because of that, and because it
> currently can only happen via a domctl or during dom0 building.
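The inverted orderings above are the classic ABBA deadlock pattern. For
illustration only, here is a minimal standalone sketch (plain pthreads
standing in for Xen's spinlocks; the function bodies are placeholders,
not the actual routing or emulation code):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t desc_lock = PTHREAD_MUTEX_INITIALIZER; /* per-pIRQ */
static pthread_mutex_t rank_lock = PTHREAD_MUTEX_INITIALIZER; /* per-rank */

/* Mirrors the routing path: desc lock first, then rank lock. */
static void *route_path(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&desc_lock);   /* 1) lock the physical irq */
    pthread_mutex_lock(&rank_lock);   /* 2) lock the virtual irq  */
    /* ... establish the vIRQ <-> pIRQ mapping ... */
    pthread_mutex_unlock(&rank_lock);
    pthread_mutex_unlock(&desc_lock);
    return NULL;
}

/* Mirrors the ISENABLER_* emulation path: rank lock, then desc lock. */
static void *emulation_path(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&rank_lock);   /* 1) lock the virtual irq  */
    pthread_mutex_lock(&desc_lock);   /* 2) lock the physical irq */
    /* ... enable the physical interrupt ... */
    pthread_mutex_unlock(&desc_lock);
    pthread_mutex_unlock(&rank_lock);
    return NULL;
}

int main(void)
{
    pthread_t a, b;

    /* If route_path holds desc_lock while emulation_path holds
     * rank_lock, each waits forever for the lock the other holds.
     * Whether it triggers on a given run is timing-dependent. */
    pthread_create(&a, NULL, route_path, NULL);
    pthread_create(&b, NULL, emulation_path, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("finished without deadlocking this time\n");
    return 0;
}

Note that each ordering is fine on its own; it is only the combination
of the two paths that can deadlock.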
> > To be clear: I am not arguing for keeping the code as-is, just trying
> > to understand it.
> >
> > > Using the per-pending_irq lock will not solve the deadlock. I thought
> > > a bit more about the code. I believe that routing SPIs/PPIs after
> > > domain creation time is a recipe for mistakes and a locking
> > > nightmare. Similarly, an interrupt should stay routed for the
> > > duration of the domain's life.
> >
> > It looks like the current routing code was already written with that
> > assumption.
> >
> > > So I would forbid IRQ routing after domain creation (see
> > > d->creation_finished) and only remove the routing when the domain is
> > > dying (I think with d->is_dying). This would avoid any race between
> > > the routing and the rest of the vGIC code.
> >
> > I am fine with that.
> >
> > > However, this would not prevent the routing function from racing
> > > against itself. For that I would take the vgic domain lock; that is
> > > fine because routing is not something we expect to happen often.
> > >
> > > Any opinions?
> >
> > The changes done by gic_route_irq_to_guest are specific to one virtual
> > irq and one physical irq. In fact, they establish the mapping between
> > the two.
> >
> > Given that concurrent changes to a physical irq are protected by the
> > desc->lock in route_irq_to_guest, why do you think that taking the new
> > pending_irq lock in gic_route_irq_to_guest would not be sufficient?
> > (gic_route_irq_to_guest is always called by route_irq_to_guest.)
>
> Because if we just replace the rank lock with the pending_irq lock, the
> deadlock would still be there. I guess we can take the pending_irq lock
> in route_irq_to_guest before the desc lock.

I think that's right, it would work.
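To make that concrete, a minimal sketch of a single global order,
pending_irq lock first and desc lock second, applied on both paths
(hypothetical types and field names, pthreads instead of Xen's
spinlocks; the per-pending_irq lock does not exist in the tree yet):

#include <pthread.h>
#include <stddef.h>

struct irq_desc {
    pthread_mutex_t lock;   /* stands in for the per-pIRQ desc->lock */
};

struct pending_irq {
    pthread_mutex_t lock;   /* proposed per-vIRQ lock (hypothetical) */
    struct irq_desc *desc;  /* the vIRQ -> pIRQ mapping it protects  */
};

/* Routing: take the pending_irq lock first, then the desc lock. */
static void route_irq_sketch(struct pending_irq *p, struct irq_desc *desc)
{
    pthread_mutex_lock(&p->lock);
    pthread_mutex_lock(&desc->lock);
    p->desc = desc;         /* establish the mapping */
    pthread_mutex_unlock(&desc->lock);
    pthread_mutex_unlock(&p->lock);
}

/* Emulation (e.g. ISENABLER_*): same order, so no inversion. */
static void enable_irq_sketch(struct pending_irq *p)
{
    pthread_mutex_lock(&p->lock);
    if (p->desc != NULL) {
        pthread_mutex_lock(&p->desc->lock);
        /* ... enable the physical interrupt ... */
        pthread_mutex_unlock(&p->desc->lock);
    }
    pthread_mutex_unlock(&p->lock);
}

int main(void)
{
    struct irq_desc desc = { .lock = PTHREAD_MUTEX_INITIALIZER };
    struct pending_irq p = { .lock = PTHREAD_MUTEX_INITIALIZER,
                             .desc = NULL };

    route_irq_sketch(&p, &desc);  /* virtual side first, then physical */
    enable_irq_sketch(&p);        /* same global order: no ABBA        */
    return 0;
}

Because every path agrees on the same order, the inversion from the
earlier sketch cannot occur.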
> But I like the idea of hiding the vgic locking in gic_route_irq_to_guest
> to keep irq.c fairly Interrupt Controller agnostic.

Did you suggest taking the vgic domain lock in gic_route_irq_to_guest,
instead of the rank lock? It would work, but it would establish a new
locking order constraint: desc->lock first, then d->arch.vgic.lock, which
is a bit unnatural. For example, the d->arch.vgic.lock is taken by vgic-v2
and vgic-v3 in response to guest MMIO traps. Today, they don't do much,
but it is easy to imagine that in the future you might want to take the
desc->lock after the d->arch.vgic.lock, the same way we take the
desc->lock after the rank lock today.

On the other hand, if we take the pending_irq lock before the desc->lock,
we don't add any new ordering constraints. That's my preference.

But maybe I misunderstood and you meant to suggest taking the
d->arch.vgic.lock before the desc->lock in route_irq_to_guest? I guess
that would work too.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel