Re: [patch-4.16] arm/smmuv1,v2: Protect smmu master list with a lock
Hi Julien,
On 27.10.2021 19:02, Julien Grall wrote:
>
>
> On 27/10/2021 11:41, Michal Orzel wrote:
>> Hi Julien,
>
> Hi Michal,
>
>> On 26.10.2021 18:56, Julien Grall wrote:
>>> Hi,
>>>
>>> On 26/10/2021 17:28, Julien Grall wrote:
>>>> On 26/10/2021 13:29, Michal Orzel wrote:
>>>>> If a device is added to SMMUv1/v2 from DT and PCI
>>>>> at the same time, there is concurrent access
>>>>> to the smmu master list. This could lead to a
>>>>> scenario where one is looking into a list that
>>>>> is being modified at the same time. Add a lock
>>>>> to prevent this issue.
>>>>
>>>> Did you intend to say "Hold" rather than "Add"?
>>>>
>> Yes, this is what I meant. I will change it.
>>
>>>>>
>>>>> Reuse the existing spinlock arm_smmu_devices_lock
>>>>> as it is already protecting find_smmu_master.
>>>>>
>>>>> ipmmu-smmu and smmuv3 are not impacted by this
>>>>> issue as there is no access or modification of
>>>>> a global resource during add_device.
>>>>>
>>>>> Signed-off-by: Michal Orzel <michal.orzel@xxxxxxx>
>>>>> ---
>>>>> This patch aims for the 4.16 release.
>>>>> Benefits:
>>>>> Removes a bug that could lead to corruption of the
>>>>> smmu master list, which would be very hard to debug.
>>>>
>>>> From my understanding, this corruption would only happen with
>>>> CONFIG_HAS_PCI. At the moment, this is an experimental feature as it is not
>>>> fully complete.
>>>
>>> Actually, digging through the code, I noticed that iommu_add_device()
>>> assumes that it can only be called with platform/DT devices. Thankfully,
>>> AFAICT, the function would return -ENXIO because the fwspec would not be
>>> allocated for a PCI device.
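>>>
>>> Something like this early in the driver's add_device path (illustration
>>> only, from memory; the exact check in the tree may differ):
>>>
>>>     struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>>>
>>>     /* No fwspec is registered for PCI devices today, so bail out */
>>>     if (fwspec == NULL)
>>>         return -ENXIO;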
>>>
>>> So was this patch tested with an additional patch on top?
>>>
>> The purpose of this patch is to fix the issue that is present and that you
>> thankfully noticed.
>> There are still a lot of patches to be added that aim to support PCI
>> passthrough.
>> The complete PCI series is tested with SMMU and it works,
>> but there is no way to test this patch standalone, as iommu_add_device is
>> not yet in the correct form for PCI.
>
> Ok. I would suggest saying this is a latent bug, so it is clear that the
> patch is more for hardening at the moment.
>
Ok, I will mention it in the commit message.
>>>>> xen/drivers/passthrough/arm/smmu.c | 25 +++++++++++++++++++------
>>>>> 1 file changed, 19 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
>>>>> index c9dfc4caa0..be62a66a28 100644
>>>>> --- a/xen/drivers/passthrough/arm/smmu.c
>>>>> +++ b/xen/drivers/passthrough/arm/smmu.c
>>>>> @@ -820,21 +820,25 @@ static int arm_smmu_dt_add_device_legacy(struct arm_smmu_device *smmu,
>>>>> struct device *dev,
>>>>> struct iommu_fwspec *fwspec)
>>>>> {
>>>>> - int i;
>>>>> + int i, ret;
>>>>> struct arm_smmu_master *master;
>>>>> struct device_node *dev_node = dev_get_dev_node(dev);
>>>>> + spin_lock(&arm_smmu_devices_lock);
>>>>> master = find_smmu_master(smmu, dev_node);
>>>>> if (master) {
>>>>> dev_err(dev,
>>>>> "rejecting multiple registrations for master device %s\n",
>>>>> dev_node->name);
>>>>> - return -EBUSY;
>>>>> + ret = -EBUSY;
>>>>> + goto out_unlock;
>>>>> }
>>>>> master = devm_kzalloc(dev, sizeof(*master), GFP_KERNEL);
>>>>> - if (!master)
>>>>> - return -ENOMEM;
>>>>> + if (!master) {
>>>>> + ret = -ENOMEM;
>>>>> + goto out_unlock;
>>>>> + }
>>>>> master->of_node = dev_node;
>>>>> /* Xen: Let Xen know that the device is protected by an SMMU */
>>>>> @@ -846,11 +850,17 @@ static int arm_smmu_dt_add_device_legacy(struct arm_smmu_device *smmu,
>>>>> dev_err(dev,
>>>>> "stream ID for master device %s greater than maximum
>>>>> allowed (%d)\n",
>>>>> dev_node->name, smmu->num_mapping_groups);
>>>>> - return -ERANGE;
>>>>> + ret = -ERANGE;
>>>>> + goto out_unlock;
>>>>> }
>>>>> master->cfg.smendx[i] = INVALID_SMENDX;
>>>>> }
>>>>> - return insert_smmu_master(smmu, master);
>>>>> +
>>>>> + ret = insert_smmu_master(smmu, master);
>>>>> +
>>>>> +out_unlock:
>>>>> + spin_unlock(&arm_smmu_devices_lock);
>>>>
>>>> Please add a newline here.
>>>>
>> Ok.
>>
>>>>> + return ret;
>>>>> }
>>>>> static int register_smmu_master(struct arm_smmu_device *smmu,
>>>>> @@ -2056,7 +2066,10 @@ static int arm_smmu_add_device(struct device *dev)
>>>>> } else {
>>>>> struct arm_smmu_master *master;
>>>>> + spin_lock(&arm_smmu_devices_lock);
>>>>> master = find_smmu_master(smmu, dev->of_node);
>>>>> + spin_unlock(&arm_smmu_devices_lock);
>>>>
>>>> At the moment, unlocking here is fine because we don't remove the device.
>>>> However, there is a series adding support for removing a device (see [1]).
>>>> So I think it would be preferable to unlock after the last use of 'cfg'.
>>>>
>> Ok. I will move the unlocking to the end of this else {} block. I was not
>> aware of the patch you are referring to.
>
> I think the end of the else is still too early. This needs to be at least
> past iommu_group_set_iommudata(), because we store cfg there.
>
> Potentially, the lock also wants to encompass arm_smmu_master_alloc_smes().
> So I am wondering whether it would be simpler to hold the lock for the whole
> duration of arm_smmu_add_device() (I can see value in this when we want to
> interlock with the remove code).
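>
> As a rough, untested sketch (arm_smmu_do_add_device() is a hypothetical
> helper that would take the current body of the function):
>
>     static int arm_smmu_add_device(struct device *dev)
>     {
>         int ret;
>
>         spin_lock(&arm_smmu_devices_lock);
>         /* current body of arm_smmu_add_device(), moved into a helper */
>         ret = arm_smmu_do_add_device(dev);
>         spin_unlock(&arm_smmu_devices_lock);
>
>         return ret;
>     }
>
> That way the master (and hence 'cfg') cannot disappear under our feet once
> a remove path exists.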
>
>>
>>>> Looking at the code, I also noticed that the error path is not correct
>>>> for at least PCI devices and we would leak memory. We also assume that
>>>> Stream ID == Requester ID. Are both of them on your radar for PCI
>>>> passthrough?
>>>>
>> I agree with you; I also noticed it. However, this is going to be fixed
>> with the next patch series when Rahul gets back.
>
> Good. Do you have a todo list for PCI passthrough that can be shared?
>
As I am not working on the PCI stuff, I do not have such a list, but I can
point you to the branch on which both ARM and EPAM are working:
https://gitlab.com/xen-project/fusa/xen-integration/-/tree/integration/pci-passthrough
> Cheers,
>
Cheers,
Michal