[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 2/2] VT-d: Fix vt-d flush timeout issue.
>On 14.12.2015 at 5:28pm, <JBeulich@xxxxxxxx> wrote: > >>> On 12.12.15 at 14:21, <quan.xu@xxxxxxxxx> wrote: > > --- a/xen/drivers/passthrough/vtd/iommu.c > > +++ b/xen/drivers/passthrough/vtd/iommu.c > > @@ -1890,6 +1890,9 @@ static int intel_iommu_add_device(u8 devfn, struct > pci_dev *pdev) > > if ( !pdev->domain ) > > return -EINVAL; > > > > + if ( is_pdev_unassignable(pdev) ) > > + return -EACCES; > > Is this case possible at all (i.e. a newly added device being unassignable)? > IMO, it is possible, and I have checked and printed the invoke flow when Xen init. iommu_setup() / iommu_hwdom_init() are called before pci_add_device() when Xen init. If it flushes error in iommu_setup() / iommu_hwdom_init(), it will mark device as unassignable. So it is possible. right? > > --- a/xen/drivers/passthrough/vtd/qinval.c > > +++ b/xen/drivers/passthrough/vtd/qinval.c > > @@ -27,12 +27,58 @@ > > #include "dmar.h" > > #include "vtd.h" > > #include "extern.h" > > +#include "../ats.h" > > > > static int __read_mostly iommu_qi_timeout_ms = 1; > > integer_param("iommu_qi_timeout_ms", iommu_qi_timeout_ms); > > > > #define IOMMU_QI_TIMEOUT (iommu_qi_timeout_ms * MILLISECS(1)) > > > > +void invalidate_timeout(struct iommu *iommu) { > > + struct domain *d; > > + unsigned long nr_dom, i; > > + struct pci_dev *pdev; > > + > > + nr_dom = cap_ndoms(iommu->cap); > > + i = find_first_bit(iommu->domid_bitmap, nr_dom); > > + while ( i < nr_dom ) { > > + d = rcu_lock_domain_by_id(iommu->domid_map[i]); > > + ASSERT(d); > > + > > + /* Mark the devices as unassignable. */ > > + for_each_pdev(d, pdev) > > + mark_pdev_unassignable(pdev); > > + if ( !is_hardware_domain(d) ) > > + domain_kill(d); > > DYM domain_crash() here? > Agreed. > > +void device_tlb_invalidate_timeout(struct iommu *iommu, u16 did, > > + u16 seg, u8 bus, u8 devfn) { > > + struct domain *d; > > + struct pci_dev *pdev; > > + > > + d = rcu_lock_domain_by_id(iommu->domid_map[did]); > > + ASSERT(d); > > + for_each_pdev(d, pdev) > > + if ( (pdev->seg == seg) && > > + (pdev->bus == bus) && > > + (pdev->devfn == devfn) ) > > + { > > + mark_pdev_unassignable(pdev); > > + break; > > + } > > + > > + if ( !is_hardware_domain(d) ) > > + domain_kill(d); > > + rcu_unlock_domain(d); > > +} > > Except for the variable declarations, indentation is broken for the entire > function. > I will modify it in v4. > > @@ -262,6 +308,14 @@ static int __iommu_flush_iec(struct iommu *iommu, > > u8 granu, u8 im, u16 iidx) > > > > queue_invalidate_iec(iommu, granu, im, iidx); > > ret = invalidate_sync(iommu); > > + > > + if ( ret == -ETIMEDOUT ) > > + { > > + invalidate_timeout(iommu); > > + dprintk(XENLOG_WARNING VTDPREFIX, > > + "IEC flush timeout.\n"); > > + return ret; > > + } > > /* > > Considering the recurring pattern, wouldn't it be better for > invalidate_sync() to invoke invalidate_timeout() (at once making sure no > current > or future caller misses the need to do so)? > Now invalidate_sync() is a common function for iec/iotlb/context/device-tlb invalidation. It is different to deal with the flush error. For device-tlb, it needs some parameters of device, such device seg/bus/devfn. But iec/iotlb/context don't need these parameters. Could I introduce device_tlb_invalidate_sync(struct iommu *iommu, u16 did, u16 seg, u8 bus, u8 devfn) for device-tlb flush? invalidate_sync(struct iommu *iommu) is for iec/iotlb/context. > Also please insert the blank line at the end of your additions, and no > trailing full > stops in log messages please. > I will modify it in v4. > > @@ -88,6 +89,16 @@ struct pci_dev { > > #define for_each_pdev(domain, pdev) \ > > list_for_each_entry(pdev, &(domain->arch.pdev_list), domain_list) > > > > +static inline void mark_pdev_unassignable(struct pci_dev *pdev) { > > + pdev->info.is_unassignable = 1; > > +} > > + > > +static inline bool_t is_pdev_unassignable(const struct pci_dev *pdev) > > +{ > > + return pdev->info.is_unassignable; } > > Are you aware that we already have a mechanism to prevent assignment (via > pci_{ro,hide}_device())? I think at the very least this check should consider > both > variants. Whether fully using the existing mechanism for your purpose is > feasible I can't immediately tell (since the ownership change may be > problematic at the points where you want the flagging to happen). pci_{ro,hide}_device() is dangerous, and it makes xen crash when I tried it. for this case, IMO I think a flag is a better solution. -Quan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |