Xen project Mailing List

Re: [PATCH v6 2/3] xen/pci: introduce PF<->VF links

To: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>

From: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>

Date: Fri, 8 Nov 2024 11:29:30 -0500

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=citrix.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0)

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=sbZBgVmUGPMICijMrrBrHw0Viq1K+Ua8gb+RrZwVJOU=; b=B6NeWRGo7F3jIsBDEYRIL2Hu/JPqvCkuSaHNm900f+V4ggnA5nyMFZ+HdJdd+zo3LVIto337QGNgngTjm0umN4IE+t0iz/wga+RaID/8BWr4bhG1cxfIyaD+orTISuhhWpnb+QaW+dNGeQWHUbq9Fx5njHYB/YkcpwLXzTr/CTxF+yu7yTpnlmknsYlMs0EO6py17nsHT3NJxzR3m0HyvHptrs0Bs86DJU4xsfunSGDwA+Sb7pZGWjJB4V1qTr4tMHAh1gacAbOSU5CcAWqpJ6ucZnD5UDJUJ0ifYeEkeIoMPKct7UHvZIr55R0Fq/Zo4kFSdD3x6t9jhkqtRrZYUw==

Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=hiTqGOnc0dEWZQYF/lYUmMZ30q2iR39SYU3ut6om+hakN/vJSJSkbkrOKE8b2ETcGXWi0jHvrPTf+l8HHtrZINFNMSCkHrHC0ETqSPHZXtPcOyHZ98E9ho+Cl988x8Or+JZINeKwKO5rHZ35kNj3/f4JtzbnT+lCnF2t8dLkUq6NjhP5jGppbJziMKtg4pA+zB3oTEt3AvbW3URgI1J2+WpSa0NNToGmGVEJ7nOSLf7djqpwubceHRAtvw/+j5jXgoi/3ZgixYwo9N8sh4Sz5Cl93JMgtFBIUfYV9hBdZcZpFq279jKA/S+h5ba/sZhMYCTkZxo3ikel3RZWYAV6CA==

Cc: Alejandro Vallejo <alejandro.vallejo@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Daniel P. Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Fri, 08 Nov 2024 16:30:09 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 11/8/24 10:19, Roger Pau Monné wrote: > On Fri, Nov 08, 2024 at 02:17:40PM +0100, Jan Beulich wrote: >> On 08.11.2024 13:42, Alejandro Vallejo wrote: >>> On Mon Nov 4, 2024 at 7:44 AM GMT, Jan Beulich wrote: >>>> On 01.11.2024 21:16, Stewart Hildebrand wrote: >>>>> +Daniel (XSM mention) >>>>> >>>>> On 10/28/24 13:02, Jan Beulich wrote: >>>>>> On 18.10.2024 22:39, Stewart Hildebrand wrote: >>>>>>> Add links between a VF's struct pci_dev and its associated PF struct >>>>>>> pci_dev. Move the calls to pci_get_pdev()/pci_add_device() down to avoid >>>>>>> dropping and re-acquiring the pcidevs_lock(). >>>>>>> >>>>>>> During PF removal, unlink VF from PF and mark the VF broken. As before, >>>>>>> VFs may exist without a corresponding PF, although now only with >>>>>>> pdev->broken = true. >>>>>>> >>>>>>> The hardware domain is expected to remove the associated VFs before >>>>>>> removing the PF. Print a warning in case a PF is removed with associated >>>>>>> VFs still present. >>>>>>> >>>>>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@xxxxxxx> >>>>>>> --- >>>>>>> Candidate for backport to 4.19 (the next patch depends on this one) >>>>>>> >>>>>>> v5->v6: >>>>>>> * move printk() before ASSERT_UNREACHABLE() >>>>>>> * warn about PF removal with VFs still present >>>>>> >>>>>> Hmm, maybe I didn't make this clear enough when commenting on v5: I >>>>>> wasn't >>>>>> just after an adjustment to the commit message. I'm instead actively >>>>>> concerned of the resulting behavior. Question is whether we can >>>>>> reasonably >>>>>> do something about that. >>>>> >>>>> Right. My suggestion then is to go back to roughly how it was done in >>>>> v4 [0]: >>>>> >>>>> * Remove the VFs right away during PF removal, so that we don't end up >>>>> with stale VFs. Regarding XSM, assume that a domain with permission to >>>>> remove the PF is also allowed to remove the VFs. We should probably also >>>>> return an error from pci_remove_device in the case of removing the PF >>>>> with VFs still present (and still perform the removals despite returning >>>>> an error). Subsequent attempts by a domain to remove the VFs would >>>>> return an error (as they have already been removed), but that's expected >>>>> since we've taken a stance that PF-then-VF removal order is invalid >>>>> anyway. >>>> >>>> Imo going back is not an option. >>>> >>>>> While the above is what I prefer, I just want to mention other options I >>>>> considered for the scenario of PF removal with VFs still present: >>>>> >>>>> * Increase the "scariness" of the warning message added in v6. >>>>> >>>>> * Return an error from pci_remove_device (while still removing only the >>>>> PF). We would be left with stale VFs in Xen. At least this would >>>>> concretely inform dom0 that Xen takes issue with the PF-then-VF removal >>>>> order. Subsequent attempts by a domain to remove VFs, however >>>>> (un)likely, would succeed. >>>> >>>> Returning an error in such a case is a possibility, but comes with the >>>> risk of confusion. Seeing such an error, a caller may itself assume the >>>> device still is there, and retry its (with or without having removed the >>>> VFs) removal at a later point. >>>> >>>>> * Return an error from pci_remove_device and keep the PF and VFs. This >>>>> is IMO the worst option because then we would have a stale PF in >>>>> addition to stale VFs. > > I'm thinking probably this is the least bad option, and just force the > owner of the PF to ensure there are no VFs left when removing the PF. > > What sense does it make anyway to allow removing a PF with VFs still > present? Not sure exactly what the owner of the PF will do before > calling pci_remove_device(), but it would seem to me the device should > be ready for unplug (so SR-IOV disabled). Calling pci_remove_device() > with VFs still active points to an error to do proper cleanup by the > owner of the PF. In normal, correct operation, right. The PF driver is indeed expected to disable SR-IOV (i.e. remove VFs) during its removal, prior to calling PHYSDEVOP_pci_device_remove for the PF. > Returning error from pci_remove_device() and doing nothing would seem > fine to me. There should be no stale PF or VFs in that case, as the > caller has been notified the device has failed to be removed, so > should treat the device as still present. But software has no way to guarantee there won't be a physical device removal. In test scenario #2 described in the first patch [1], the PF (the whole device, actually) has already been physically unplugged, and dom0 invokes PHYSDEVOP_pci_device_remove to inform Xen about it. [1] https://lore.kernel.org/xen-devel/20241018203913.1162962-2-stewart.hildebrand@xxxxxxx/ That said, test scenario #2 would only happen when a buggy PF driver failed to properly clean up the VFs before the PF. But the point is that returning an error does not guarantee there won't be a stale pdev in case of a buggy dom0. I guess as long as we trust the owner of the PF, this approach is fine.

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.