[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v6 05/13] vpci: add hooks for PCI device assign/de-assign
On 08.02.2022 11:22, Oleksandr Andrushchenko wrote: > > > On 08.02.22 12:09, Jan Beulich wrote: >> On 08.02.2022 10:55, Oleksandr Andrushchenko wrote: >>> >>> On 08.02.22 11:44, Jan Beulich wrote: >>>> On 08.02.2022 10:27, Oleksandr Andrushchenko wrote: >>>>> On 08.02.22 11:13, Jan Beulich wrote: >>>>>> On 08.02.2022 09:32, Oleksandr Andrushchenko wrote: >>>>>>> On 07.02.22 18:28, Jan Beulich wrote: >>>>>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote: >>>>>>>>> @@ -1507,6 +1511,8 @@ static int assign_device(struct domain *d, u16 >>>>>>>>> seg, u8 bus, u8 devfn, u32 flag) >>>>>>>>> pci_to_dev(pdev), flag); >>>>>>>>> } >>>>>>>>> >>>>>>>>> + rc = vpci_assign_device(d, pdev); >>>>>>>>> + >>>>>>>>> done: >>>>>>>>> if ( rc ) >>>>>>>>> printk(XENLOG_G_WARNING "%pd: assign (%pp) failed >>>>>>>>> (%d)\n", >>>>>>>> There's no attempt to undo anything in the case of getting back an >>>>>>>> error. ISTR this being deemed okay on the basis that the tool stack >>>>>>>> would then take whatever action, but whatever it is that is supposed >>>>>>>> to deal with errors here wants spelling out in the description. >>>>>>> Why? I don't change the previously expected decision and implementation >>>>>>> of the assign_device function: I use error paths as they were used >>>>>>> before >>>>>>> for the existing code. So, I see no clear reason to stress that the >>>>>>> existing >>>>>>> and new code relies on the toolstack >>>>>> Saying half a sentence on this is helping review. >>>>> Ok >>>>>>>> What's important is that no caller up the call tree may be left with >>>>>>>> the impression that the device is still owned by the original >>>>>>>> domain. With how you have it, the device is going to be owned by the >>>>>>>> new domain, but not really usable. >>>>>>> This is not true: vpci_assign_device will call vpci_deassign_device >>>>>>> internally if it fails. So, the device won't be assigned in this case >>>>>> No. The device is assigned to whatever pdev->domain holds. Calling >>>>>> vpci_deassign_device() there merely makes sure that the device will >>>>>> have _no_ vPCI data and hooks in place, rather than something >>>>>> partial. >>>>> So, this patch is only dealing with vpci assign/de-assign >>>>> And it rolls back what it did in case of a failure >>>>> It also returns rc in assign_device to signal it has failed >>>>> What else is expected from this patch?? >>>> Until now if assign_device() returns an error, this tells the caller >>>> that the device did not change ownership; >>> Not sure this is the case: >>> if ( (rc = iommu_call(hd->platform_ops, assign_device, d, devfn, >>> pci_to_dev(pdev), flag)) ) >>> iommu_call can leave the new ownership even now without >>> vpci_assign_device. >> Did you check the actual hook functions for when exactly the ownership >> change happens. For both VT-d and AMD it is the last thing they do, >> when no error can occur anymore. > This functionality does not exist for Arm yet, so this is up to the > future series to add that. > > WRT to the existing code: > > static int amd_iommu_assign_device(struct domain *d, u8 devfn, > struct pci_dev *pdev, > u32 flag) > { > if ( !rc ) > rc = reassign_device(pdev->domain, d, devfn, pdev); <<<<< this will > set pdev->domain > > if ( rc && !is_hardware_domain(d) ) > { > int ret = amd_iommu_reserve_domain_unity_unmap( > d, ivrs_mappings[req_id].unity_map); > > if ( ret ) > { > printk(XENLOG_ERR "AMD-Vi: " > "unity-unmap for %pd/%04x:%02x:%02x.%u failed (%d)\n", > d, pdev->seg, pdev->bus, > PCI_SLOT(devfn), PCI_FUNC(devfn), ret); > domain_crash(d); > } > So.... > > This is IMO wrong in the first place to let IOMMU code assign pdev->domain. > This is something that needs to be done by the PCI code itself and > not relying on each IOMMU callback implementation >> >> My understanding is that the roll-back is >>> expected to be performed by the toolstack and vpci_assign_device >>> doesn't prevent that by returning rc. Even more, before we discussed >>> that it would be good for vpci_assign_device to try recovering from >>> a possible error early which is done by calling vpci_deassign_device >>> internally. >> Yes, but that's only part of it. It at least needs considering what >> effects have resulted from operations prior to vpci_assign_device(). > Taking into account the code snippet above: what is your expectation > from this patch with this respect? You did note the domain_crash() in there, didn't you? The snippet above still matches the "device not assigned to an alive DomU" criteria (which can be translated to "no exposure of a device to an untrusted entity in case of error"). Such domain_crash() uses aren't nice, and I'd prefer to see them go away, but said property needs to be retained with any alternative solutions. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |