Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.
> On January 25, 2016 at 10:09pm, <JBeulich@xxxxxxxx> wrote:
> >>> On 22.01.16 at 16:57, <quan.xu@xxxxxxxxx> wrote:
> >> On January 22, 2016 at 12:31am, <JBeulich@xxxxxxxx> wrote:
> >> >>> On 21.01.16 at 17:16, <quan.xu@xxxxxxxxx> wrote:
> >> >> On January 20, 2016 at 7:29 pm, <JBeulich@xxxxxxxx> wrote:
> >> >> >>> On 20.01.16 at 11:26, <quan.xu@xxxxxxxxx> wrote:
> >> >> >> On January 15, 2016 at 9:10, <JBeulich@xxxxxxxx> wrote:
> >> >> >> >>> On 23.12.15 at 09:25, <quan.xu@xxxxxxxxx> wrote:
> >> >> >> Also note that unused slots hold zero, i.e. there's at least a
> >> >> >> theoretical risk of problems here when you don't also look at
> >> >> >> iommu->domid_bitmap.
> >> >> >>
> >> >> > I am not clear on how to fix this point. Do you have a good idea?
> >> >> > Add a lock to 'iommu->domid_bitmap'?
> >> >>
> >> >> How would a lock help avoid mistaking an unused slot to mean Dom0?
> >> >> As already suggested - I think you simply need to consult the
> >> >> bitmap along with the domain ID array.
> >> >>
> >> > Try to get the domain id with iommu->domid_map[did]?
> >>
> >> ???
> >>
> > +    if ( iommu->domid_map )
> > +        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> >
> > Is it right?
>
> I don't see what this changes. Again - what your code has been lacking
> so far is some mechanism to guarantee that what you read from domid_map[]
> is actually a valid domain ID. I can only once again point your attention
> to domid_bitmap, which afaics gives you exactly that valid/invalid
> indication.
>
Jan,
Sorry, I was confused about this point because I didn't understand
domid_bitmap: I had printed out iommu->domid_bitmap[did], which misled me.
Now I get it. As you mentioned, I simply need to consult the bitmap along
with the domain ID array:

+    if ( test_bit(did, iommu->domid_bitmap) && iommu->domid_map[did] >= 0 )
+        d = rcu_lock_domain_by_id(iommu->domid_map[did]);

Is it right now?
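Put differently, here is a self-contained sketch of the lookup I have in
mind ('did_to_domain' is only an illustrative name, not existing code; the
fields and cap_ndoms() are those of the VT-d struct iommu):

    static struct domain *did_to_domain(const struct iommu *iommu, int did)
    {
        if ( did < 0 || did >= cap_ndoms(iommu->cap) )
            return NULL;

        /* An unused slot holds zero and would be mistaken for Dom0. */
        if ( !test_bit(did, iommu->domid_bitmap) )
            return NULL;

        /* Takes an RCU reference; the caller must rcu_unlock_domain(). */
        return rcu_lock_domain_by_id(iommu->domid_map[did]);
    }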
> >> >> >> > +    {
> >> >> >> > +        list_del(&pdev->domain_list);
> >> >> >>
> >> >> >> This should happen under pcidevs_lock - you need to either
> >> >> >> acquire it or ASSERT() it being held.
> >> >> >>
> >> >> > I should check whether it is under pcidevs_lock -- with
> >> >> > spin_is_locked(&pcidevs_lock). If it is not under pcidevs_lock,
> >> >> > I should acquire it. I think ASSERT() is not a good idea: the
> >> >> > hypervisor acquires this lock and then removes the resource.
> >> >>
> >> >> I don't understand this last sentence.
> >> >>
> >> > For example:
> >> > pci_remove_device()
> >> > {
> >> >     ...
> >> >     spin_lock(&pcidevs_lock);
> >> >     ...
> >> >     iommu_remove_device();
> >> >     ...
> >> >     spin_unlock(&pcidevs_lock);
> >> > }
> >> >
> >> > The Device-TLB flush may fail inside iommu_remove_device(), and at
> >> > that point we are already under pcidevs_lock. So in the error path
> >> > I need to check whether pcidevs_lock is held, and if not, acquire it.
> >>
> >> Ah, okay. But you can't use spin_is_locked() for that purpose.
> >>
> > What if I introduce a new local variable 'lock'?
> > +    int lock = spin_is_locked(&pcidevs_lock);
> >
> > +    if ( !lock )
> > +        spin_lock(&pcidevs_lock);
> >     ...
> > +    if ( !lock )
> > +        spin_unlock(&pcidevs_lock);
> >
> > Is it right? Jan, do you have some better idea?
>
> If indeed different locking state is possible for different call trees,

The locking state is indeed different.
For example, the lock is _not_ held on the following call path (innermost
callee first):
    flush_iotlb_qi()
      -- iommu_flush_iotlb_psi()
      -- __intel_iommu_iotlb_flush()
      -- intel_iommu_iotlb_flush()
      -- iommu_iotlb_flush()
      -- xenmem_add_to_physmap()
      -- do_memory_op()

It _is_ held on the following call path:
    flush_iotlb_qi()
      -- iommu_flush_iotlb_dsi()
      -- domain_context_unmap_one()
      -- domain_context_unmap()
      -- reassign_device_ownership()
      -- deassign_device()
      -- iommu_do_pci_domctl()

> then I'm afraid you won't get around passing down flags to indicate
> whether the needed lock is already being held.
>
Agreed. First of all, I am open to any solution. pcidevs_lock is quite a
big lock, so for this point it looks better to add a new flag and delay
hiding the device, although I am afraid that may raise further security
issues.
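To make sure we mean the same thing, here is a minimal sketch of the
flag-passing idea ('hide_pdev' and 'lock_held' are illustrative names
only): a caller that already holds pcidevs_lock, e.g. on the
iommu_do_pci_domctl() path, passes 1, while the xenmem_add_to_physmap()
path passes 0.

    static void hide_pdev(struct pci_dev *pdev, bool_t lock_held)
    {
        if ( !lock_held )
            spin_lock(&pcidevs_lock);

        /* The lock is held on both call paths from here on. */
        list_del(&pdev->domain_list);

        if ( !lock_held )
            spin_unlock(&pcidevs_lock);
    }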
Jan, thanks!!
-Quan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel