[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 4/4] The locking order is: first rank lock, then vgic lock. The order is respected everywhere, except for gic_update_one_lr.



On Mon, 16 Jan 2017, Julien Grall wrote:
> Hi Stefano,
> 
> On 03/01/17 22:51, Stefano Stabellini wrote:
> > On Wed, 28 Dec 2016, Julien Grall wrote:
> > > Hi Stefano,
> > > 
> > > On 22/12/16 02:15, Stefano Stabellini wrote:
> > > > gic_update_one_lr is called with the vgic lock held, but it calls
> > > > vgic_get_target_vcpu, which tries to obtain the rank lock. This can
> > > > cause deadlocks.
> > > > 
> > > > We already have a version of vgic_get_target_vcpu that doesn't take the
> > > > rank lock: __vgic_get_target_vcpu.
> > > > 
> > > > Solve the lock inversion problem, by not taking the rank lock in
> > > > gic_update_one_lr (calling __vgic_get_target_vcpu instead of
> > > > vgic_get_target_vcpu).  This is safe, because vcpu target modifications
> > > > are protected by the same vgic vcpu lock.
> > > 
> > > I maintain what I said on the first version of this patch. All this patch
> > > could be simplified by moving the call to irq_set_affinity in
> > > vgic_irq_migrate.
> > > 
> > > There are no restriction to do that and no need to have any lock taken but
> > > the
> > > rank lock.
> > 
> > All right. I'll submit a patch that does exactly that. It is not
> > perfect, because it drops interrupts in the problematic scenario, but it
> > should be a good place to start for a technical discussion.
> 
> I think I have missed something. What is the problematic scenario you are
> describing? Looking at the v3, I don't see any mention of this problem. Does
> it mean it has been solved?

No, it hasn't been solved. Sorry for not having been clearer here. The
issue is described in the commit message of
http://marc.info/?l=xen-devel&m=148348625720995:

  After migrating an interrupt from vcpu/pcpu 0 to vcpu/pcpu 1, it is
  possible to receive a physical interrupt on pcpu 1, which Xen is
  supposed to inject into vcpu 1, before the LR on pcpu 0 has been
  cleared. In this case the irq is still marked as
  GIC_IRQ_GUEST_MIGRATING, and struct pending_irq is still "inflight" on
  vcpu 0. As the irq cannot be "inflight" on vcpu 0 and vcpu 1
  simultaneously, drop the interrupt.

This is not the right behavior: the interrupt should be injected into
vcpu 1.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.