|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v7 01/12] vpci: introduce per-domain lock to protect vpci structure
On 10.07.2023 00:41, Volodymyr Babchuk wrote:
>
> Hi Jan,
>
> Jan Beulich <jbeulich@xxxxxxxx> writes:
>
>> On 04.07.2023 23:03, Volodymyr Babchuk wrote:
>>> I am currently implementing your proposal (along with Jan's
>>> suggestions), but I am facing ABBA deadlock with IOMMU's
>>> reassign_device() call, which has this piece of code:
>>>
>>> list_move(&pdev->domain_list, &target->pdev_list);
>>>
>>> My immediate change was:
>>>
>>> write_lock(&pdev->domain->pci_lock);
>>> list_del(&pdev->domain_list);
>>> write_unlock(&pdev->domain->pci_lock);
>>>
>>> write_lock(&target->pci_lock);
>>> list_add(&pdev->domain_list, &target->pdev_list);
>>> write_unlock(&target->pci_lock);
>>>
>>> But this will not work because reassign_device is called from
>>> pci_release_devices() which iterates over d->pdev_list, so we need to
>>> take a d->pci_lock early.
>>>
>>> Any suggestions on how to fix this? My idea is to remove a device from a
>>> list one at time:
>>>
>>> int pci_release_devices(struct domain *d)
>>> {
>>> struct pci_dev *pdev;
>>> u8 bus, devfn;
>>> int ret;
>>>
>>> pcidevs_lock();
>>> write_lock(&d->pci_lock);
>>> ret = arch_pci_clean_pirqs(d);
>>> if ( ret )
>>> {
>>> pcidevs_unlock();
>>> write_unlock(&d->pci_lock);
>>> return ret;
>>> }
>>>
>>> while ( !list_empty(&d->pdev_list) )
>>> {
>>> pdev = list_entry(&d->pdev_list, struct pci_dev, domain_list);
>>> bus = pdev->bus;
>>> devfn = pdev->devfn;
>>> list_del(&pdev->domain_list);
>>> write_unlock(&d->pci_lock);
>>> ret = deassign_device(d, pdev->seg, bus, devfn) ?: ret;
>>> write_lock(&d->pci_lock);
>>
>> I think it needs doing almost like this, but with two more tweaks and
>> no list_del() right here (first and foremost to avoid needing to
>> figure whether removing early isn't going to subtly break anything;
>> see below for an error case that would end up with changed behavior):
>>
>> while ( !list_empty(&d->pdev_list) )
>> {
>> const struct pci_dev *pdev = list_first_entry(&d->pdev_list, struct
>> pci_dev, domain_list);
>> uint16_t seg = pdev->seg;
>> uint8_t bus = pdev->bus;
>> uint8_t devfn = pdev->devfn;
>>
>> write_unlock(&d->pci_lock);
>> ret = deassign_device(d, seg, bus, devfn) ?: ret;
>> write_lock(&d->pci_lock);
>> }
>>
>> One caveat though: The original list_for_each_entry_safe() guarantees
>> the loop to complete; your use of list_del() would guarantee that too,
>> but then the device wouldn't be on the list anymore if deassign_device()
>> failed. Therefore I guess you will need another local list where you
>> (temporarily) put all the devices which deassign_device() left on the
>> list, and which you would then move back to d->pdev_list after the loop
>> has finished.
>
> Okay, taking into the account all that you wrote, I came with this
> implementation:
Looks plausible at the first glance, but will of course need looking at
in more detail in the context of the full patch. Just one (largely)
cosmetic remark right away:
> int pci_release_devices(struct domain *d)
> {
> int combined_ret;
> LIST_HEAD(failed_pdevs);
>
> pcidevs_lock();
> write_lock(&d->pci_lock);
> combined_ret = arch_pci_clean_pirqs(d);
> if ( combined_ret )
> {
> pcidevs_unlock();
> write_unlock(&d->pci_lock);
> return combined_ret;
> }
>
> while ( !list_empty(&d->pdev_list) )
> {
> struct pci_dev *pdev = list_first_entry(&d->pdev_list,
> struct pci_dev,
> domain_list);
> uint16_t seg = pdev->seg;
> uint8_t bus = pdev->bus;
> uint8_t devfn = pdev->devfn;
> int ret;
>
> write_unlock(&d->pci_lock);
> ret = deassign_device(d, seg, bus, devfn);
> write_lock(&d->pci_lock);
> if ( ret )
> {
> bool still_present = false;
> const struct pci_dev *tmp;
>
> for_each_pdev ( d, tmp )
> {
> if ( tmp == (const struct pci_dev*)pdev )
As I'm sure I expressed earlier, casts should be avoided whenever
possible. I see no reason for one here: Comparing pointer-to-
const-<type> with pointer-to-<type> is permitted by the language.
> {
> still_present = true;
> break;
> }
> }
> if ( still_present )
> list_move(&pdev->domain_list, &failed_pdevs);
> combined_ret = ret;
> }
> }
>
> list_splice(&failed_pdevs, &d->pdev_list);
> write_unlock(&d->pci_lock);
> pcidevs_unlock();
>
> return combined_ret;
> }
>
>
>> (Whether it is sufficient to inspect the list head to
>> determine whether the pdev is still on the list needs careful checking.)
>
> I am not sure that this is enough. We dropped d->pci_lock so we can
> expect that the list was permutated in a random way.
Right, if that cannot be excluded for sure, then better stay on the safe
side for now. A code comment would be nice though ahead of the
for_each_pdev().
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |