[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.
> On January 27, 2016 10:29pm, <JBeulich@xxxxxxxx> wrote: > >>> On 27.01.16 at 15:13, <quan.xu@xxxxxxxxx> wrote: > >> On January 27, 2016 at 9:15pm, <JBeulich@xxxxxxxx> wrote: > >> >>> On 27.01.16 at 13:38, <quan.xu@xxxxxxxxx> wrote: > >> >> On January 27, 2016 at 7:24pm, <JBeulich@xxxxxxxx> wrote: > >> >> >>> On 27.01.16 at 12:09, <quan.xu@xxxxxxxxx> wrote: > >> >> >> On January 27, 2016 at 6:48am, <Tian, Kevin> wrote: > >> >> >> > From: Jan Beulich [mailto:JBeulich@xxxxxxxx] > >> >> >> > Sent: Tuesday, January 26, 2016 11:53 PM > >> >> > > >> >> > > >> >> >> > Once again: Before getting started, please assess which route > >> >> >> > is going to be the better one. Remember that we had already > >> >> >> > discussed and put aside some form of deferring the hiding of > >> >> >> > devices, so if you come back with a patch doing that again, > >> >> >> > you'll have to be able to explain why the alternative(s) are worse. > >> >> >> > > >> >> >> > >> >> >> Quan, could you list pros/cons of those alternatives based on > >> >> >> discussion so > >> >> far? > >> >> >> Then we can decide which way should be done before you go to > >> >> >> actual > >> >> coding. > >> >> >> Earlier suggestion on hiding device immediately is under the > >> >> >> assumption that all locks have been held. If this part becomes > >> >> >> too complex, and you can explain clearly that deferring the > >> >> >> hiding action doesn't lead to any race condition, then people > >> >> >> can see why you are > >> >> proposing defer again. > >> >> > > >> >> > > >> >> > The following are pros/cons of those alternatives. It is also > >> >> > why I propose defer again. > >> >> > > >> >> > -- -- > >> >> > 1. Hiding the devices immediately > >> >> > Pros: > >> >> > * it makes whatever changes are ASAP after the Device-TLB > >> >> > flush > >> error. > >> >> > > >> >> > Cons: > >> >> > * It may break the code path. > >> >> > * It may lead to any race condition. > >> >> > * Hiding the devices immediately is under the assumption > >> >> > that all > >> > locks > >> >> have been held. > >> >> > Different locking state is possible for different call trees. > >> >> > This > >> > part > >> >> becomes too complex. > >> >> > >> >> So you just repeat what you've already said before. "This part > >> >> becomes too complex" you say without any kind of proof, yet that's > >> >> what we need to understand whether the alternative of doing the > >> >> locking correctly really is > >> > this > >> >> bad (and I continue to not see why it would). > >> > > >> > > >> > Such as pcidevs_lock: > >> > > >> > 1. as I mentioned, it is indeed different locking state is possible > >> > for different call trees of flush Device-TLB. When Device-TLB flush > >> > is error, It is really challenge to judge whether to acquire the > >> > pcidevs_lock or > >> not. > >> > > >> > For example, > >> > *It is _not_under_ lock for the following call tree: > >> > $ flush_iotlb_qi()--- iommu_flush_iotlb_psi() -- > >> > __intel_iommu_iotlb_flush() > >> > --intel_iommu_iotlb_flush() --iommu_iotlb_flush() > >> > --xenmem_add_to_physmap()--do_memory_op() > >> > > >> > *It is _under_ lock for the following call tree: > >> > $flush_iotlb_qi()--iommu_flush_iotlb_dsi()--domain_context_unmap_on > >> > e() > >> > --domain_con > >> > text_unmap()--reassign_device_ownership()--deassign_device()-iommu_ > >> > do_ > >> > pci_domctl() > >> > > >> > 2. if I try to acquire the pcidevs_lock for some _not_under_ lock > >> > call tree, it makes things worse. As the pcidevs_lock is a big lock, then > >> > Frequent memory modification may block the pci-device assign due > >> > to the pcidevs_lock. if I try to split the pcidevs_lock into small locks. > >> > It may takes a great deal of time to make it stable. > >> > >> I don't understand this, namely in the context of my suggestion to > >> simply pass down a flag indicating whether the lock is being held > >> (and hence acquiring it only in the most narrow scope if not already owning > it). > >> > > > > This is also an idea. > > BTW, Does the lock refer to pcidevs_lock? > > Yes, for now I assume that only that lock actually needs special attention. > Thanks. I think so too. -Quan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |