Re: [patch-4.16] arm/smmuv1,v2: Protect smmu master list with a lock
On Thu, 28 Oct 2021, Julien Grall wrote:
> On Thu, 28 Oct 2021, 00:14 Stefano Stabellini, <sstabellini@xxxxxxxxxx> wrote:
> On Wed, 27 Oct 2021, Julien Grall wrote:
> > > > > > +    return ret;
> > > > > >  }
> > > > > >  static int register_smmu_master(struct arm_smmu_device *smmu,
> > > > > > @@ -2056,7 +2066,10 @@ static int arm_smmu_add_device(struct device *dev)
> > > > > >      } else {
> > > > > >          struct arm_smmu_master *master;
> > > > > > +        spin_lock(&arm_smmu_devices_lock);
> > > > > >          master = find_smmu_master(smmu, dev->of_node);
> > > > > > +        spin_unlock(&arm_smmu_devices_lock);
> > > > >
> > > > > At the moment, unlocking here is fine because we don't remove
> > > > > the device. However, there is a series to support removing a
> > > > > device (see [1]). So I think it would be preferable to unlock
> > > > > after the last use of 'cfg'.
> > > > >
> > > Ok. I will move the unlocking to the end of this else {} block. I
> > > was not aware of the patch you are referring to.
> >
> > I think the end of the else is still too early. This needs to at
> > least be past iommu_group_set_iommudata() because we store cfg.
> >
> > Potentially, the lock wants to also encompass
> > arm_smmu_master_alloc_smes(). So I am wondering whether it would be
> > simpler to hold the lock for the whole duration of
> > arm_smmu_add_device() (I can see value when we will want to
> > interlock with the remove code).
>
> The patch was to protect the smmu master list. From that point of
> view, the unlock right after find_smmu_master would be sufficient,
> right?
>
>
> Yes. However, this does not fix all the problems (see below).
>
>
> We only need to protect cfg if we are worried that the same device
> is added in two different ways at the same time. Did I get it right?
> If so, I would say that case should not be possible. Am I missing
> another potential conflict?
>
>
> It should not be possible to add the same device twice. The problem
> arises when we are going to remove the device. In this case, "master"
> may disappear at any point.
>
> Support for removing a device is not yet implemented in the tree, but
> there is already a patch on the ML. So I think it would be
> shortsighted to move the lock just to solve concurrent access to the
> list.
That makes sense now: the other source of conflict is concurrent add and
remove of the same device. Sorry it wasn't clear to me before.
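To make the hazard concrete, here is a minimal sketch, assuming a
remove path along the lines of the series you mention
(arm_smmu_remove_device below is hypothetical, it is not in the tree):

/* CPU0: arm_smmu_add_device() */
spin_lock(&arm_smmu_devices_lock);
master = find_smmu_master(smmu, dev->of_node);
spin_unlock(&arm_smmu_devices_lock);

/*
 * CPU1: a hypothetical arm_smmu_remove_device() runs here: it takes
 * arm_smmu_devices_lock, unlinks 'master' from the list and frees it.
 */

cfg = &master->cfg;    /* CPU0: use-after-free of 'master' */

Holding arm_smmu_devices_lock until the last use of 'master'/'cfg'
closes this window.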
> I am pointing this out for two reasons:
>
> Protecting the list is different from protecting each element from
> concurrent modification of the element itself. If the latter is a
> potential problem, I wonder if arm_smmu_add_device is the only function
> affected?
>
>
> I had a brief look at the code and couldn't find any other places
> where this may be an issue.
>
>
> The second reason is that extending the lock past
> arm_smmu_master_alloc_smes is a bit worrying because it causes
> &arm_smmu_devices_lock and smmu->stream_map_lock to nest, which wasn't
> the case before.
>
>
> Nested locks are common. I don't believe there would be a problem here with
> this one.
>
>
> I am not saying that it is a bad idea to extend the lock past
> arm_smmu_master_alloc_smes -- it might be the right thing to do.
>
>
> I don't usually suggest locking changes blindly ;).
>
> But I am merely saying that it might be best to think twice about it
> and/or do that after 4.16.
>
>
> To be honest, this patch is not that useful at the moment: the add
> callback (the callback to register a device in the IOMMU subsystem)
> doesn't support PCI devices, so the locking issue is only latent for
> now.
>
> So if you are concerned with my suggested locking, then I am afraid
> the current patch is a no-go on my side for 4.16.
I was totally fine with it aside from the last suggestion of moving
the spin_unlock to the end of the function, because up to that point
the changes were obviously correct to me.
I didn't understand why we needed to extend the locked region; now I
do. Also, lock nesting is one of those things that is relatively
common but should always get a second check to make sure it is
correct. Specifically, we need to check that no function that holds
smmu->stream_map_lock calls a function that takes
&arm_smmu_devices_lock. It is not very difficult, but I haven't done
this check myself.
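To spell out what that check has to rule out: if arm_smmu_add_device()
holds arm_smmu_devices_lock across arm_smmu_master_alloc_smes(), the
established ordering is arm_smmu_devices_lock ->
smmu->stream_map_lock. A sketch of the ABBA deadlock we need to
exclude (path_a/path_b are hypothetical stand-ins for any two real
code paths of this shape, not actual functions in the driver):

static void path_a(struct arm_smmu_device *smmu)
{
    spin_lock(&arm_smmu_devices_lock);
    spin_lock(&smmu->stream_map_lock);  /* A then B, as in add_device */
    /* ... */
    spin_unlock(&smmu->stream_map_lock);
    spin_unlock(&arm_smmu_devices_lock);
}

static void path_b(struct arm_smmu_device *smmu)
{
    spin_lock(&smmu->stream_map_lock);
    spin_lock(&arm_smmu_devices_lock);  /* B then A: deadlocks against
                                           path_a */
    /* ... */
    spin_unlock(&arm_smmu_devices_lock);
    spin_unlock(&smmu->stream_map_lock);
}

The check is then that nothing of path_b's shape exists anywhere in
the driver.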
The other thing that is not clear to me is whether we would also need
to protect the places where we use (not allocate) masters and/or cfg,
e.g. arm_smmu_attach_dev and arm_smmu_domain_add_master.
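Purely as an illustration of what that would look like (assuming the
extended-locking approach, and assuming arm_smmu_domain_add_master
keeps its current shape, taking the domain and the master's cfg), a
reader would need the lock held across the lookup and every use of the
result:

spin_lock(&arm_smmu_devices_lock);
master = find_smmu_master(smmu, dev->of_node);
if ( !master )
    ret = -ENODEV;
else
    ret = arm_smmu_domain_add_master(smmu_domain, &master->cfg);
spin_unlock(&arm_smmu_devices_lock);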
> That said, we can work towards a new locking approach for 4.17.
> However, I would want to have a proposal from your side or at least
> some details on why the suggested locking is not suitable.
The suggested locking approach up until the last suggestion looks
totally fine to me. The last suggestion is harder to judge because the
PCI removal hook is not there yet, so I am having trouble seeing
exactly what needs to be protected.
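For reference, here is roughly the shape I understand you to be
suggesting, with the lock held for the whole duration (heavily
abbreviated; the names are the ones from this thread, but the
structure of the body is an assumption, not the actual code):

static int arm_smmu_add_device(struct device *dev)
{
    struct arm_smmu_master *master;
    int ret;

    /* ... lookup of 'smmu' and the PCI case elided ... */

    spin_lock(&arm_smmu_devices_lock);

    master = find_smmu_master(smmu, dev->of_node);
    /* ... store cfg, iommu_group_set_iommudata(), etc. 'master' and
     * cfg stay valid throughout: a concurrent remove would block on
     * arm_smmu_devices_lock. */

    /* Takes smmu->stream_map_lock, establishing the nesting
     * discussed above. */
    ret = arm_smmu_master_alloc_smes(dev);

    spin_unlock(&arm_smmu_devices_lock);

    return ret;
}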