Re: [Xen-devel] [PATCH 2/2] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error.
On March 18, 2016 6:20pm, <JBeulich@xxxxxxxx> wrote:
> >>> On 17.03.16 at 07:54, <quan.xu@xxxxxxxxx> wrote:
> > --- a/xen/drivers/passthrough/amd/iommu_init.c
> > +++ b/xen/drivers/passthrough/amd/iommu_init.c
> > @@ -1339,12 +1339,14 @@ static void invalidate_all_devices(void)
> > iterate_ivrs_mappings(_invalidate_all_devices);
> > }
> >
> > -void amd_iommu_suspend(void)
> > +int amd_iommu_suspend(void)
> > {
> > struct amd_iommu *iommu;
> >
> > for_each_amd_iommu ( iommu )
> > disable_iommu(iommu);
> > +
> > + return 0;
> > }
> >
> > void amd_iommu_resume(void)
> > @@ -1368,3 +1370,11 @@ void amd_iommu_resume(void)
> > invalidate_all_domain_pages();
> > }
> > }
> > +
> > +void amd_iommu_crash_shutdown(void)
> > +{
> > + struct amd_iommu *iommu;
> > +
> > + for_each_amd_iommu ( iommu )
> > + disable_iommu(iommu);
> > +}
>
> One of the two should clearly call the other - no need to have the same code
> twice.
>
Good idea.
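A minimal sketch of that deduplication, for illustration only. The struct, the iommus[] array, the macro body and disable_iommu() below are hypothetical stand-ins so that the fragment compiles on its own; they are not the real Xen definitions. Which function calls the other is also just one possible choice:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the Xen definitions (assumptions, not real code). */
struct amd_iommu { int enabled; };

static struct amd_iommu iommus[2] = { { 1 }, { 1 } };

#define for_each_amd_iommu(i) \
    for ( (i) = iommus; (i) < iommus + sizeof(iommus) / sizeof(iommus[0]); (i)++ )

static void disable_iommu(struct amd_iommu *iommu)
{
    iommu->enabled = 0;
}

/* The loop lives in exactly one place. */
void amd_iommu_crash_shutdown(void)
{
    struct amd_iommu *iommu;

    for_each_amd_iommu ( iommu )
        disable_iommu(iommu);
}

/* Suspend reuses the shutdown path instead of repeating the loop. */
int amd_iommu_suspend(void)
{
    amd_iommu_crash_shutdown();

    return 0;
}
```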
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -182,7 +182,11 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
> > ((page->u.inuse.type_info & PGT_type_mask)
> > == PGT_writable_page) )
> > mapping |= IOMMUF_writable;
> > - hd->platform_ops->map_page(d, gfn, mfn, mapping);
> > + if ( hd->platform_ops->map_page(d, gfn, mfn, mapping) )
> > + printk(XENLOG_G_ERR
> > + "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) failed.\n",
> > + gfn, mfn);
> > +
>
> Printing one message here is certainly necessary, but what if the failure
> repeats for very many pages?
Yes. To me it is OK, but I am open to your suggestions.
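One possible way to handle repeated failures, sketched under assumptions: print only the first failing page and count the rest. map_page_stub() is a hypothetical stand-in for hd->platform_ops->map_page() that fails on odd gfns, and the message goes into a buffer instead of printk() so the fragment is self-contained:

```c
#include <stdio.h>

/* Hypothetical stand-in for hd->platform_ops->map_page(); fails on odd gfns. */
static int map_page_stub(unsigned long gfn, unsigned long mfn)
{
    (void)mfn;
    return (gfn & 1) ? -1 : 0;
}

/*
 * Map gfns [0, nr) 1:1. Only the first failure is formatted into msg;
 * later failures are just counted. Returns the number of failed pages.
 */
unsigned long map_range_log_once(unsigned long nr, char *msg, size_t len)
{
    unsigned long gfn, failed = 0;

    for ( gfn = 0; gfn < nr; gfn++ )
    {
        if ( !map_page_stub(gfn, gfn) )
            continue;

        if ( !failed++ )
            snprintf(msg, len,
                     "IOMMU: Map page gfn: %#lx (mfn: %#lx) failed.\n",
                     gfn, gfn);
    }

    return failed;
}
```

The caller could then print a single summary with the returned count, rather than one line per page.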
> Also %#lx instead of 0x%lx please, and a blank before the
> opening parenthesis.
>
OK, just to check:
..
"IOMMU: Map page gfn: %#lx (mfn: %#lx) failed.\n"
..
Right?
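For reference, a small sketch of what the '#' flag does in standard C printf-style formatting. It uses snprintf() from the C library; that Xen's own printk() formats this the same way is an assumption here. format_map_failure() is a hypothetical helper name:

```c
#include <stdio.h>

/*
 * Format the message using the '#' flag. In standard C, %#lx prefixes
 * "0x" to non-zero values only, so a value of zero is printed as "0".
 */
int format_map_failure(char *buf, size_t len,
                       unsigned long gfn, unsigned long mfn)
{
    return snprintf(buf, len,
                    "IOMMU: Map page gfn: %#lx (mfn: %#lx) failed.\n",
                    gfn, mfn);
}
```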
> > @@ -554,11 +555,24 @@ static void iommu_flush_all(void)
> > iommu = drhd->iommu;
> > iommu_flush_context_global(iommu, 0);
> > flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
> > - iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
> > + rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
> > +
> > + if ( rc > 0 )
> > + {
> > + iommu_flush_write_buffer(iommu);
>
> Why is this needed all of the sudden?
Because there may be multiple IOMMUs. E.g., there are 2 IOMMUs in my machine, and I
can find the following log messages:
"""
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB.
(XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB.
"""
IIUC, iommu_flush_write_buffer() is per IOMMU, so it should be called to
flush every IOMMU.
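A sketch of the per-IOMMU loop as I read the hunk quoted above. It assumes that a positive return value from the IOTLB flush means "a write buffer flush is still needed" and a negative one means an error. The struct and the *_stub() helpers are hypothetical stand-ins, not the VT-d code:

```c
#include <stddef.h>

/* Hypothetical per-IOMMU state: a canned flush result and a "flushed" flag. */
struct iommu_stub {
    int flush_result;
    int wb_flushed;
};

static int flush_iotlb_global_stub(struct iommu_stub *iommu)
{
    return iommu->flush_result;
}

static void flush_write_buffer_stub(struct iommu_stub *iommu)
{
    iommu->wb_flushed = 1;
}

/* Visit every IOMMU; stop at the first error and return it, else return 0. */
int flush_all(struct iommu_stub *iommus, size_t nr)
{
    size_t i;
    int rc = 0;

    for ( i = 0; i < nr; i++ )
    {
        rc = flush_iotlb_global_stub(&iommus[i]);

        if ( rc > 0 )
        {
            /* Assumed meaning: this unit needs its write buffer flushed. */
            flush_write_buffer_stub(&iommus[i]);
            rc = 0;
        }
        else if ( rc < 0 )
            break;
    }

    return rc;
}
```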
> (Note that if you did a more fine grained split, it might also be easier
> for you to note/explain all the not directly related changes in the
> respective commit messages. Unless of course they fix actual bugs, in which
> case they should be split out anyway; such individual fixes would also
> likely have a much faster route to commit, relieving you earlier from the
> burden of at least some of the changes you have to carry and re-base.)
>
> > + rc = 0;
> > + }
> > + else if ( rc < 0 )
> > + {
> > + printk(XENLOG_G_ERR "IOMMU: IOMMU flush all failed.\n");
> > + break;
> > + }
>
> Is a log message really advisable here?
>
To me, it looks tricky too. I was struggling to make a decision. For scheme B, I
would try to do as below:
if ( iommu_flush_all() )
printk("... nnn ...");
but there are 4 such function calls, and repeating this at each of them looks
redundant to me.
Or could I drop the printout for an iommu_flush_all() failure?
> > -static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
> > +static int __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
>
> While I'm not VT-d maintainer, I think changes like this would be a good
> opportunity to also drop the stray double underscores: You need to touch all
> callers anyway.
>
I think this is optional.
> > @@ -584,37 +599,40 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
> > continue;
> >
> > if ( page_count != 1 || gfn == INVALID_GFN )
> > - {
> > - if ( iommu_flush_iotlb_dsi(iommu, iommu_domid,
> > - 0, flush_dev_iotlb) )
> > - iommu_flush_write_buffer(iommu);
> > - }
> > + rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
> > + 0, flush_dev_iotlb);
> > else
> > + rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
> > + (paddr_t)gfn <<
> PAGE_SHIFT_4K, 0,
> > + !dma_old_pte_present,
> > + flush_dev_iotlb);
> > + if ( rc > 0 )
> > {
> > - if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
> > - (paddr_t)gfn << PAGE_SHIFT_4K,
> PAGE_ORDER_4K,
>
> Note how this used PAGE_ORDER_4K so far?
Sorry, this is a rebasing mistake.
>
> > - !dma_old_pte_present, flush_dev_iotlb) )
> > - iommu_flush_write_buffer(iommu);
> > + iommu_flush_write_buffer(iommu);
>
> Same question again: Why is this all of the sudden needed on both paths?
>
The same answer as for the question above. Let's hold on for now.
> > @@ -622,7 +640,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
> > if ( pg_maddr == 0 )
> > {
> > spin_unlock(&hd->arch.mapping_lock);
> > - return;
> > + return -ENOMEM;
> > }
>
> addr_to_dma_page_maddr() gets called with "alloc" being false, so there can't
> be any memory allocation failure here. There simply is nothing to do in this
> case.
>
I copied it from iommu_map_page().
Good. Then an error from iommu_unmap_page() can only come from the flush (the
crash is at least obvious), so error handling can be lighter weight:
we may return an error, but don't roll back the failed operation.
Right?
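To make sure we mean the same thing, a sketch of that behaviour under assumptions. struct pt_stub and clear_one() are hypothetical, not the VT-d code: a missing page table is "nothing to do" and returns 0, while a flush error is returned to the caller with the PTE left cleared (no rollback):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: one PTE slot (NULL if no page table) and a canned flush result. */
struct pt_stub {
    uint64_t *pte;
    int flush_result;
};

int clear_one(struct pt_stub *pt)
{
    /* No page table present (lookup done without allocation): nothing to do. */
    if ( pt->pte == NULL )
        return 0;

    *pt->pte = 0;

    /* Report a flush failure, but do not restore the old PTE. */
    return pt->flush_result < 0 ? pt->flush_result : 0;
}
```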
> > -void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
> > +int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
> > {
> > u32 id;
> > + int rc = 0;
> >
> > id = pci_conf_read32(0, 0, 0, 0, 0);
> > if ( IS_CTG(id) )
> > {
> > /* quit if ME does not exist */
> > if ( pci_conf_read32(0, 0, 3, 0, 0) == 0xffffffff )
> > - return;
> > + return -ENOENT;
>
> Is this really an error? IOW, do all systems which satisfy IS_CTG() have
> such a device?
>
To be honest, I don't know much about me_wifi_quirk().
So for now, IMO, I don't need to deal with me_wifi_quirk().
Quan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel