|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3] x86/msi: fix locking for SR-IOV devices
On 15.08.2024 03:28, Stewart Hildebrand wrote:
> On 8/13/24 10:01, Jan Beulich wrote:
>> On 12.08.2024 22:39, Stewart Hildebrand wrote:
>>> In commit 4f78438b45e2 ("vpci: use per-domain PCI lock to protect vpci
>>> structure") a lock moved from allocate_and_map_msi_pirq() to the caller
>>> and changed from pcidevs_lock() to read_lock(&d->pci_lock). However, one
>>> call path wasn't updated to reflect the change, leading to a failed
>>> assertion observed under the following conditions:
>>>
>>> * PV dom0
>>> * Debug build (debug=y) of Xen
>>> * There is an SR-IOV device in the system with one or more VFs enabled
>>> * Dom0 has loaded the driver for the VF and enabled MSI-X
>>>
>>> (XEN) Assertion 'd || pcidevs_locked()' failed at
>>> drivers/passthrough/pci.c:535
>>> (XEN) ----[ Xen-4.20-unstable x86_64 debug=y Not tainted ]----
>>> ...
>>> (XEN) Xen call trace:
>>> (XEN) [<ffff82d040284da8>] R pci_get_pdev+0x4c/0xab
>>> (XEN) [<ffff82d040344f5c>] F arch/x86/msi.c#read_pci_mem_bar+0x58/0x272
>>> (XEN) [<ffff82d04034530e>] F
>>> arch/x86/msi.c#msix_capability_init+0x198/0x755
>>> (XEN) [<ffff82d040345dad>] F arch/x86/msi.c#__pci_enable_msix+0x82/0xe8
>>> (XEN) [<ffff82d0403463e5>] F pci_enable_msi+0x3f/0x78
>>> (XEN) [<ffff82d04034be2b>] F map_domain_pirq+0x2a4/0x6dc
>>> (XEN) [<ffff82d04034d4d5>] F allocate_and_map_msi_pirq+0x103/0x262
>>> (XEN) [<ffff82d04035da5d>] F physdev_map_pirq+0x210/0x259
>>> (XEN) [<ffff82d04035e798>] F do_physdev_op+0x9c3/0x1454
>>> (XEN) [<ffff82d040329475>] F pv_hypercall+0x5ac/0x6af
>>> (XEN) [<ffff82d0402012d3>] F lstar_enter+0x143/0x150
>>>
>>> In read_pci_mem_bar(), the VF obtains the struct pci_dev pointer for its
>>> associated PF to access the vf_rlen array. This array is initialized in
>>> pci_add_device() and is only populated in the associated PF's struct
>>> pci_dev.
>>>
>>> Add a link from the VF's struct pci_dev to the associated PF struct
>>> pci_dev, ensuring the PF's struct doesn't get deallocated until all its
>>> VFs have gone away. Access the vf_rlen array via the new link to the PF,
>>> and remove the troublesome call to pci_get_pdev().
>>>
>>> Add a call to pci_get_pdev() inside the pcidevs_lock()-locked section of
>>> pci_add_device() to set up the link from VF to PF. In case the new
>>> pci_add_device() invocation fails to find the associated PF (returning
>>> NULL), we are no worse off than before: read_pci_mem_bar() will still
>>> return 0 in that case.
>>>
>>> Note that currently the only way for Xen to know if a device is a VF is
>>> if the toolstack tells Xen about it. Using PHYSDEVOP_manage_pci_add for
>>> a VF is not a case that Xen handles.
>>
>> How does the toolstack come into play here? It's still the Dom0 kernel to
>> tell Xen, via PHYSDEVOP_pci_device_add (preferred) or
>> PHYSDEVOP_manage_pci_add_ext (kind of deprecated; PHYSDEVOP_manage_pci_add
>> is even more kind of deprecated).
>
> I guess I meant to say Dom0 kernel, not toolstack. I'm actually
> questioning how much value this last portion of the commit description
> is really adding. Maybe it would be better to just remove this bit.
+1
>>> @@ -446,7 +448,27 @@ static void free_pdev(struct pci_seg *pseg, struct
>>> pci_dev *pdev)
>>>
>>> list_del(&pdev->alldevs_list);
>>> pdev_msi_deinit(pdev);
>>> - xfree(pdev);
>>> +
>>> + if ( pdev->info.is_virtfn )
>>> + {
>>> + struct pci_dev *pf_pdev = pdev->virtfn.pf_pdev;
>>> +
>>> + if ( pf_pdev )
>>> + {
>>> + list_del(&pdev->virtfn.next);
>>> + if ( pf_pdev->physfn.dealloc_pending &&
>>> + list_empty(&pf_pdev->physfn.vf_list) )
>>> + xfree(pf_pdev);
>>> + }
>>> + xfree(pdev);
>>> + }
>>> + else
>>> + {
>>> + if ( list_empty(&pdev->physfn.vf_list) )
>>> + xfree(pdev);
>>> + else
>>> + pdev->physfn.dealloc_pending = true;
>>> + }
>>
>> Could I talk you into un-indenting the last if/else by a level, by making
>> the earlier else and "else if()"?
>>
>> One aspect I didn't properly consider when making the suggestion: What if,
>> without all VFs having gone away, the PF is re-added? In that case we
>> would better recycle the existing structure. That's getting a little
>> complicated, so maybe a better approach is to refuse the request (in
>> pci_remove_device()) when the list isn't empty?
>
> Hm. Right. The growing complexity of maintaining these PF<->VF links
> (introduced on a hunch that they may be useful in the future) is making
> me favor the previous approach from v2 of simply copying the vf_len
> array. So unless there are any objections I'll go back to that approach
> for v4.
I continue to consider the earlier approach at least undesirable. The
vf_len[] array shouldn't be copied around, risking for it to be / go
stale. There should be one central place for every bit of information,
unless there are very good reasons to duplicate something.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |