Re: [PATCH v4 02/11] vpci: cancel pending map/unmap on vpci removal
Sorry, I've been quite busy with other stuff.

On Thu, Nov 18, 2021 at 01:48:06PM +0000, Oleksandr Andrushchenko wrote:
>
> On 18.11.21 15:25, Jan Beulich wrote:
> > On 18.11.2021 10:32, Oleksandr Andrushchenko wrote:
> >>
> >> On 18.11.21 11:15, Jan Beulich wrote:
> >>> On 18.11.2021 09:54, Oleksandr Andrushchenko wrote:
> >>>> On 18.11.21 10:36, Jan Beulich wrote:
> >>>>> On 18.11.2021 08:49, Oleksandr Andrushchenko wrote:
> >>>>>> On 17.11.21 10:28, Jan Beulich wrote:
> >>>>>>> On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
> >>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> >>>>>>>>
> >>>>>>>> When a vPCI is removed for a PCI device it is possible that we have
> >>>>>>>> scheduled a delayed work for map/unmap operations for that device.
> >>>>>>>> For example, the following scenario can illustrate the problem:
> >>>>>>>>
> >>>>>>>> pci_physdev_op
> >>>>>>>>     pci_add_device
> >>>>>>>>         init_bars -> modify_bars -> defer_map ->
> >>>>>>>>             raise_softirq(SCHEDULE_SOFTIRQ)
> >>>>>>>>     iommu_add_device <- FAILS
> >>>>>>>>     vpci_remove_device -> xfree(pdev->vpci)
> >>>>>>>>
> >>>>>>>> leave_hypervisor_to_guest
> >>>>>>>>     vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
> >>>>>>>>
> >>>>>>>> For the hardware domain we continue execution, as the worst that
> >>>>>>>> could happen is that MMIO mappings are left in place when the
> >>>>>>>> device has been deassigned.
> >>>>>>>>
> >>>>>>>> For unprivileged domains that get a failure in the middle of a vPCI
> >>>>>>>> {un}map operation we need to destroy them, as we don't know in which
> >>>>>>>> state the p2m is. This can only happen in vpci_process_pending for
> >>>>>>>> DomUs as they won't be allowed to call pci_add_device.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> >>>>>>>
> >>>>>>> Thinking about it some more, I'm not convinced any of this is really
> >>>>>>> needed in the presented form.
> >>>>>> The intention of this patch was to handle error conditions which are
> >>>>>> abnormal, e.g. when iommu_add_device fails and we are in the middle
> >>>>>> of initialization. So, I am trying to cancel all pending work which
> >>>>>> might already be there and not to crash.
> >>>>>
> >>>>> Only Dom0 may be able to prematurely access the device during "add".
> >>>>> Yet unlike for DomU-s we generally expect Dom0 to be well-behaved.
> >>>>> Hence I'm not sure I see the need for dealing with these.
> >>>>
> >>>> Probably I don't follow you here. The issue I am facing is Dom0
> >>>> related, e.g. Xen was not able to initialize during "add" and thus
> >>>> wanted to clean up the leftovers. As a result the already
> >>>> scheduled work crashes, as it was neither canceled nor interrupted
> >>>> in some safe manner. So, this sounds like something we need to take
> >>>> care of, thus this patch.
> >>>
> >>> But my point was the question of why there would be any pending work
> >>> in the first place in this case, when we expect Dom0 to be well-behaved.
> >>
> >> I am not saying Dom0 misbehaves here. This is my real use-case
> >> (as in the commit message):
> >>
> >> pci_physdev_op
> >>     pci_add_device
> >>         init_bars -> modify_bars -> defer_map ->
> >>             raise_softirq(SCHEDULE_SOFTIRQ)
> >>     iommu_add_device <- FAILS
> >>     vpci_remove_device -> xfree(pdev->vpci)
> >>
> >> leave_hypervisor_to_guest
> >>     vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
> >
> > First of all I'm sorry for having lost track of that particular case in
> > the course of the discussion.
>
> No problem, I see on the ML how much you review every day...
>
> > I wonder though whether that's something we really need to take care of.
> > At boot (on x86) modify_bars() wouldn't call defer_map() anyway, but
> > use apply_map() instead.
> > I wonder whether this wouldn't be appropriate
> > generally in the context of init_bars() when used for Dom0 (not sure
> > whether init_bars() would find some form of use for DomU-s as well).
> > This is even more so as it would better be the exception that devices
> > discovered post-boot start out with memory decoding enabled (which is a
> > prereq for modify_bars() to be called in the first place).
>
> Well, first of all init_bars is going to be used for DomUs as well:
> that was discussed previously and is reflected in this series.
>
> But the real use-case for the deferred mapping would be the one
> from PCI_COMMAND register write emulation:
>
> void vpci_cmd_write(const struct pci_dev *pdev, unsigned int reg,
>                     uint32_t cmd, void *data)
> {
>     [snip]
>     modify_bars(pdev, cmd, false);
>
> which in turn calls defer_map.
>
> I believe Roger did that for a good reason: not to stall Xen while
> doing the map/unmap (which may take quite some time), and moved that to
> vpci_process_pending and SOFTIRQ context.

Indeed. In the physdevop failure case this comes from a hypercall
context, so maybe you could do the mapping in place using hypercall
continuations if required. Not sure how complex that would be, compared
to just deferring to the guest entry point and then dealing with the
possible cleanup on failure.

Thanks, Roger.