|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v9 2/3] VT-d: wrap a _sync version for all VT-d flush interfaces
On April 05, 2016 5:35pm, Jan Beulich <JBeulich@xxxxxxxx> wrote:
> >>> On 01.04.16 at 16:47, <quan.xu@xxxxxxxxx> wrote:
> > The dev_invalidate_iotlb() scans ats_devices list to flush ATS
> > devices, and the invalidate_sync() is put after dev_invalidate_iotlb()
> > to synchronize with hardware for flush status. If we assign multiple
> > ATS devices to a domain, the flush status is about all these multiple
> > ATS devices. Once flush timeout expires, we couldn't find out which
> > one is the buggy ATS device.
>
> Is that true? Or is that just a limitation of our implementation?
>
IMO, both.
I hope vt-d maintainers help me double check it.
> > Then, The invalidate_sync() variant (We need to pass down the device's
> > SBDF to hide the ATS device) is put within dev_invalidate_iotlb() to
> > synchronize for the flush status one by one.
>
> I don't think this is stating current state of things. So ...
>
> > If flush timeout expires,
> > we could find out the buggy ATS device and hide it. However, for other
> > VT-d flush interfaces, the invalidate_sync() is still put after at present.
> > This is inconsistent.
>
> ... taken together, what is inconsistent here needs to be described better,
> as well
> as what it is you do to eliminate the inconsistency. Note that you should not
> refer back (or imply knowledge of) the previous discussion on the earlier
> version.
> In any of that discussion is useful here, you need to summarize it instead.
>
I will continue to summarize it and send out later.
> > --- a/xen/drivers/passthrough/vtd/extern.h
> > +++ b/xen/drivers/passthrough/vtd/extern.h
> > @@ -61,6 +61,8 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16
> > did,
> >
> > int qinval_device_iotlb(struct iommu *iommu,
> > u32 max_invs_pend, u16 sid, u16 size, u64
> > addr);
> > +int qinval_device_iotlb_sync(struct iommu *iommu, u32 max_invs_pend,
> > + u16 sid, u16 size, u64 addr);
>
> So are then both functions needed to be externally accessible?
> That would seem contrary to the last paragraph of the patch description.
>
I was aware of this. I'd better make the qinval_device_iotlb() a static one in
next v10.
[...]
> > +static int queue_invalidate_context_sync(struct iommu *iommu,
>
> __must_check?
>
Agreed.
[...]
> > +{
> > + queue_invalidate_context(iommu, did, source_id,
> > + function_mask, granu);
> > +
> > + return invalidate_sync(iommu);
> > +}
>
> Further down you replace the only call to
> queue_invalidate_context() - why keep both functions instead of just making
> the
> existing one do the sync? (That would the likely also apply to
> qinval_device_iotlb() and others below.)
>
It is optional.
I think:
1. in the long term, we may need no _sync version.
2. At least, the current wrap looks good to me. e.g. queue_invalidate_context()
is for context-cache Invalidate Descriptor, and the
invalidate_sync() is for Invalidation Wait Descriptor. It is much clearer.
> > @@ -338,23 +365,24 @@ static int flush_iotlb_qi(
> >
> > if ( qi_ctrl->qinval_maddr != 0 )
> > {
> > - int rc;
> > -
> > /* use queued invalidation */
> > if (cap_write_drain(iommu->cap))
> > dw = 1;
> > if (cap_read_drain(iommu->cap))
> > dr = 1;
> > /* Need to conside the ih bit later */
> > - queue_invalidate_iotlb(iommu,
> > - type >>
> DMA_TLB_FLUSH_GRANU_OFFSET, dr,
> > - dw, did, size_order, 0, addr);
> > + ret = queue_invalidate_iotlb_sync(iommu,
> > + type >> DMA_TLB_FLUSH_GRANU_OFFSET, dr, dw, did,
> > + size_order, 0, addr);
> > +
> > + /* TODO: Timeout error handling to be added later */
>
> As of today queue_invalidate_wait() panics, so this comment is not very
> helpful
> as there is not timeout that could possibly be detected here.
>
Okay, I will drop it.
> > + if ( ret )
> > + return ret;
> > +
> > if ( flush_dev_iotlb )
> > ret = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
> > - rc = invalidate_sync(iommu);
> > - if ( !ret )
> > - ret = rc;
> > }
>
> I think leaving the existing logic as is would be better - best effort
> invalidation
> even when an error has occurred.
>
I have an open:
As vt-d spec(:Queued Invalidation Ordering Considerations) said,
1. If the Fence(FN) flag is 1 in a inv_wait_dsc, hardware must execute
descriptors following the inv_wait_dsc only after wait command is completed.
2. when a Device-TLB invalidation timeout is detected, hardware must not
complete any pending inv_wait_dsc commands.
In current code, the Fence(FN) is always 1.
if a Device-TLB invalidation timeout is detected, this additional inv_wait_dsc
is not completed.
__iiuc__,
the new coming descriptors, in that queue, _might_ be not executed any more,
waiting for this additional inv_wait_dsc which is not completed.
is it true?
> > --- a/xen/drivers/passthrough/vtd/x86/ats.c
> > +++ b/xen/drivers/passthrough/vtd/x86/ats.c
> > @@ -118,7 +118,6 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16
> did,
> > {
> > u16 sid = PCI_BDF2(pdev->bus, pdev->devfn);
> > bool_t sbit;
> > - int rc = 0;
> >
> > /* Only invalidate devices that belong to this IOMMU */
> > if ( pdev->iommu != iommu )
> > @@ -134,8 +133,8 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16
> did,
> > /* invalidate all translations: sbit=1,bit_63=0,bit[62:12]=1 */
> > sbit = 1;
> > addr = (~0UL << PAGE_SHIFT_4K) & 0x7FFFFFFFFFFFFFFF;
> > - rc = qinval_device_iotlb(iommu, pdev->ats_queue_depth,
> > - sid, sbit, addr);
> > + ret = qinval_device_iotlb_sync(iommu,
> pdev->ats_queue_depth,
> > + sid, sbit, addr);
> > break;
> > case DMA_TLB_PSI_FLUSH:
> > if ( !device_in_domain(iommu, pdev, did) ) @@ -154,16
> > +153,13 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
> > addr |= (((u64)1 << (size_order - 1)) - 1) <<
> PAGE_SHIFT_4K;
> > }
> >
> > - rc = qinval_device_iotlb(iommu, pdev->ats_queue_depth,
> > - sid, sbit, addr);
> > + ret = qinval_device_iotlb_sync(iommu,
> pdev->ats_queue_depth,
> > + sid, sbit, addr);
> > break;
> > default:
> > dprintk(XENLOG_WARNING VTDPREFIX, "invalid vt-d flush
> type\n");
> > return -EOPNOTSUPP;
> > }
> > -
> > - if ( !ret )
> > - ret = rc;
> > }
>
> The removal of "rc" (which btw is unrelated to the purpose of your
> patch) means that if an earlier iteration encountering an error is followed by
> later successful ones, no error would get reported to the caller. Hence this
> part
> of the change needs to be undone.
>
Agreed.
I tried to drop rc, as the ret was always zero in previous code.
Quan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |