[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [RFC PATCH 01/10] xen: pci: add per-domain pci list lock
Hello Stefano, Stefano Stabellini <sstabellini@xxxxxxxxxx> writes: > On Wed, 31 Aug 2022, Volodymyr Babchuk wrote: >> domain->pdevs_lock protects access to domain->pdev_list. >> As this, it should be used when we are adding, removing on enumerating >> PCI devices assigned to a domain. >> >> This enables more granular locking instead of one huge pcidevs_lock that >> locks entire PCI subsystem. Please note that pcidevs_lock() is still >> used, we are going to remove it in subsequent patches. >> >> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx> > > I reviewed the patch, and made sure to pay extra attention to: Thank you for doing this. > - error paths > - missing locks > - lock ordering > - interruptions > Here is what I found: > > > 1) iommu.c:reassign_device_ownership and pci_amd_iommu.c:reassign_device > Both functions without any pdevs_lock locking do: > list_move(&pdev->domain_list, &target->pdev_list); > > It seems to be it would need pdevs_lock. Maybe we need to change > list_move into list_del (protected by the pdevs_lock of the old domain) > and list_add (protected by the pdev_lock of the new domain). Yes, I did as you suggested. But this leads to another potential issue. I'll describe it below, in deassign_device() part. [...] >> + spin_lock(&d->pdevs_lock); >> list_for_each_entry_safe ( pdev, tmp, &d->pdev_list, domain_list ) >> { >> bus = pdev->bus; >> devfn = pdev->devfn; >> ret = deassign_device(d, pdev->seg, bus, devfn) ?: ret; > > This causes pdevs_lock to be taken twice. deassign_device also takes a > pdevs_lock. Probably we need to change all the > spin_lock(&d->pdevs_lock) into spin_lock_recursive. You are right, I missed that deassign_device() causes iommu*_reassign_device() call. But there lies the issue: with recursive locks, reassign_device() will not be able to unlock source->pdevs_lock, but will try to take target->pdevs_lock also. This potentially might lead to deadlock, if another call to reassign_device() moves some other pdev in the opposite way at the same time. This is why I want to avoid recursive spinlocks if possible. So, I am thinking: why does IOMMU code move a pdev across domains in the first place? We are making IOMMU driver responsible of managing domain's pdevs, which does not look right, as this is the responsibility of pci.c I want to propose another approach: implement deassign_device() function in IOMMU drivers. Then, instead of calling to reassign_device() we might do the following: 1. deassign_device() 2. remove pdev from source->pdev_list 3. add pdef to target->pdev_list 4. assign_device() [...] >> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h >> index 1cf629e7ec..0775228ba9 100644 >> --- a/xen/include/xen/sched.h >> +++ b/xen/include/xen/sched.h >> @@ -457,6 +457,7 @@ struct domain >> >> #ifdef CONFIG_HAS_PCI >> struct list_head pdev_list; >> + spinlock_t pdevs_lock; > > I think it would be better called "pdev_lock" but OK either way As Jan pointed out, we are locking devices as in plural. On other hand, I can rename pdev_list to pdevs_lists to make this consistent. > > >> #endif >> >> #ifdef CONFIG_HAS_PASSTHROUGH >> -- >> 2.36.1 >>
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |