[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v5 2/3] xen/pci: introduce PF<->VF links
On 10/16/24 05:52, Jan Beulich wrote: > On 11.10.2024 17:27, Stewart Hildebrand wrote: >> Add links between a VF's struct pci_dev and its associated PF struct >> pci_dev. Move the calls to pci_get_pdev()/pci_add_device() down to avoid >> dropping and re-acquiring the pcidevs_lock(). >> >> During PF removal, unlink VF from PF and mark the VF broken. As before, >> VFs may exist without a corresponding PF, although now only with >> pdev->broken = true. If the PF is removed and re-added, dom0 is expected >> to also remove and re-add the VFs. > > Right, or else the VF struct instance would remain orphaned the way you've > implemented this. Question is whether it is a reasonable assumption that a > Dom0 which failed to remove the VFs during PF removal might later > "remember" that it still needs to report VFs removed. I for one doubt that. This matches my observations: you're right, it most likely won't. I had just written it in a misleading way in the commit message. Re-adding should not occur until after VFs and PF have been removed in non-buggy scenarios. A PF driver that fails to disable SR-IOV (i.e remove VFs) during PF removal is buggy, and would lead to issues inside dom0 as well due to the orphaned/stale VF left behind in Linux dom0 itself. As mentioned in patch #1, Linux warns about this: [ 100.000000] 0000:01:00.0: driver left SR-IOV enabled after remove With the PF<->VF links in place, we now also have the ability to detect and warn about this potentially buggy condition in Xen. I have so far only observed VF-then-PF removal order in non-buggy scenarios. My only hesitation with adding such a warning in Xen is that I don't know whether removing in PF-then-VF order is legitimate. In the previous rev I had Xen removing the VFs during PF removal. In this rev, I chose to continue to rely on dom0 to remove VFs because: 1. This is the expected behavior as far as I can tell. 2. It more accurately mirrors the state of the VFs and PFs in dom0 and Xen. 3. No need to consider special cases when xsm has different policies for PF and VF removal operations. Perhaps the last sentence would be better written as: "The hardware domain is expected to remove the associated VFs before removing the PF. Print a warning in case a PF is removed with associated VFs still present." Or, not implying any removal order: "If the PF is removed, the hardware domain is expected to also remove the associated VFs." I'm personally leaning toward not implying any removal order (i.e. same situation as before this patch). >> @@ -703,7 +696,44 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, >> * extended function. >> */ >> if ( pdev->info.is_virtfn ) >> - pdev->info.is_extfn = pf_is_extfn; >> + { >> + struct pci_dev *pf_pdev; >> + >> + pf_pdev = pci_get_pdev(NULL, >> + PCI_SBDF(seg, info->physfn.bus, >> + info->physfn.devfn)); >> + >> + if ( !pf_pdev ) >> + { >> + ret = pci_add_device(seg, info->physfn.bus, >> info->physfn.devfn, >> + NULL, node); >> + if ( ret ) >> + { >> + printk(XENLOG_WARNING "Failed to add SR-IOV device PF >> %pp for VF %pp\n", >> + &PCI_SBDF(seg, info->physfn.bus, >> info->physfn.devfn), >> + &pdev->sbdf); >> + free_pdev(pseg, pdev); >> + goto out; >> + } >> + pf_pdev = pci_get_pdev(NULL, >> + PCI_SBDF(seg, info->physfn.bus, >> + info->physfn.devfn)); >> + if ( !pf_pdev ) >> + { >> + ASSERT_UNREACHABLE(); >> + printk(XENLOG_ERR "Failed to find SR-IOV device PF %pp >> for VF %pp\n", >> + &PCI_SBDF(seg, info->physfn.bus, >> info->physfn.devfn), >> + &pdev->sbdf); >> + free_pdev(pseg, pdev); >> + ret = -EILSEQ; >> + goto out; > > Might be helpful to have the printk() ahead of the ASSERT_UNREACHABLE(), in > the > unlikely event that the assertion would actually trigger. Agreed. > Positioning doesn't > make a difference for release builds anyway. > > Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |