[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v13 3/3] IOMMU: fix vt-d Device-TLB flush timeout issue



On July 04, 2016 2:16 PM, Tian, Kevin <kevin.tian@xxxxxxxxx> wrote:
> > From: Xu, Quan
> > Sent: Wednesday, June 29, 2016 2:00 PM
> >
> > From: Quan Xu <quan.xu@xxxxxxxxx>
> >
> > If Device-TLB flush timed out, we hide the target ATS device
> > immediately. By hiding the device, we make sure it can't be assigned
> > to any domain any longer (see device_assigned).
> >
> > Signed-off-by: Quan Xu <quan.xu@xxxxxxxxx>
> >
> > CC: Jan Beulich <jbeulich@xxxxxxxx>
> > CC: Kevin Tian <kevin.tian@xxxxxxxxx>
> > CC: Feng Wu <feng.wu@xxxxxxxxx>
> >
> > ---
> > v13:
> >    1. drop domain crash logic, which is added to the vendor
> >       independent layer in patch #2.
> >    2. rename dev_invalidate_iotlb_timeout() to
> iommu_dev_iotlb_flush_timeout()
> >       and move it to the vendor independent layer.
> > ---
> >  xen/drivers/passthrough/iommu.c       | 21 +++++++++++++
> >  xen/drivers/passthrough/pci.c         |  6 ++--
> >  xen/drivers/passthrough/vtd/extern.h  |  5 ++--
> > xen/drivers/passthrough/vtd/qinval.c  | 56
> > +++++++++++++++++++++++++++--------
> >  xen/drivers/passthrough/vtd/x86/ats.c | 11 ++-----
> >  xen/include/xen/iommu.h               |  3 ++
> >  xen/include/xen/pci.h                 |  1 +
> >  7 files changed, 76 insertions(+), 27 deletions(-)
> >
> > diff --git a/xen/drivers/passthrough/iommu.c
> > b/xen/drivers/passthrough/iommu.c index d793f5d..5db8ae6 100644
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -361,6 +361,27 @@ int iommu_iotlb_flush_all(struct domain *d)
> >      return rc;
> >  }
> >
> > +void iommu_dev_iotlb_flush_timeout(struct domain *d,
> > +                                   struct pci_dev *pdev) {
> > +    pcidevs_lock();
> > +
> > +    ASSERT(pdev->domain);
> > +    if ( d != pdev->domain )
> > +        return;
> 
> return w/o releasing the lock!
> 
Yes, I really need releasing the lock before return.

> and is above scenario actually possible (a flush timeout is captured when the
> device doesn't belong to previous domain)? If not, better to move the
> condition into ASSERT.

IMO, this is possible.
  -- not all of call trees of device iotlb flush are under pcidevs_lock, (.i.e  
...--iommu_iotlb_flush()-- xenmem_add_to_physmap()... )
  -- In extreme cases , the domain may has been freed or the device may has 
been detached or even attached to another domain.
That's also why to introduce a domain point here.

> 
> > +
> > +    list_del(&pdev->domain_list);
> > +    pdev->domain = NULL;
> > +    pci_hide_existing_device(pdev);
> > +    if ( !d->is_shutting_down && printk_ratelimit() )
> > +        printk(XENLOG_ERR
> > +               "dom%d: ATS device %04x:%02x:%02x.%u flush failed\n",
> > +               d->domain_id, pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > +               PCI_FUNC(pdev->devfn));
> > +
> > +    pcidevs_unlock();
> 
> please move above warning out of the lock.
> 

I think I'm better leave it as is.

as I use 'pdev' to print information, as similar as pci_release_devices().
If I use seg, bus, devfn variables directly, instead of 'pdev', I agree to move 
out of the lock, as similar as:

iommu_do_pci_domctl()
{
     case XEN_DOMCTL_assign_device... 
}

.... correct me if I am not right.

Quan




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.