[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.
> On January 27, 2016 at 6:48am, <Tian, Kevin> wrote: > > From: Jan Beulich [mailto:JBeulich@xxxxxxxx] > > Sent: Tuesday, January 26, 2016 11:53 PM > > Once again: Before getting started, please assess which route is going > > to be the better one. Remember that we had already discussed and put > > aside some form of deferring the hiding of devices, so if you come > > back with a patch doing that again, you'll have to be able to explain > > why the alternative(s) are worse. > > > > Quan, could you list pros/cons of those alternatives based on discussion so > far? > Then we can decide which way should be done before you go to actual coding. > Earlier suggestion on hiding device immediately is under the assumption that > all > locks have been held. If this part becomes too complex, and you can explain > clearly that deferring the hiding action doesn't lead to any race condition, > then > people can see why you are proposing defer again. The following are pros/cons of those alternatives. It is also why I propose defer again. -- -- 1. Hiding the devices immediately Pros: * it makes whatever changes are ASAP after the Device-TLB flush error. Cons: * It may break the code path. * It may lead to any race condition. * Hiding the devices immediately is under the assumption that all locks have been held. Different locking state is possible for different call trees. This part becomes too complex. 2. deferring the hiding of devices Pros: *It doesn't lead to any race condition. *It makes existing calling chains with all required error handling completed. Cons: * It needs to maintain a flag. -- -- ----- How to defer the hiding of devices: When Device-TLB flush is error, it is to set a new flag and then continue existing calling chains with all required error handling completed. Only at safe place we can safely invoke pci_hide_device() variant to hide devices. We do a lazy hide until next time when it's assigned to another guest while the new flag is recognized. Furthermore explanation: A new flag: In my previous email, I introduced a "pci_dev->info.is_unassignable" flag. When Device-TLB flush is error, I think it is not a good idea to modify the pci_dev, As it is uncertain whether pcidevs_lock is acquired. I'd better introduce a new iommu bitmap array to register the device with BDF. It may be quite similar to 'iommu->domid_bitmap' Safe place: I think the beginning of reassign_device_ownership() is a safe place to hide the device which is registered with the new flag. reassign_device_ownership() is under lock. pci_hide_device() variant: Remove the pcidevs_lock, as the pcidevs_lock has been acquired at call tree of reassign_device_ownership(). Thanks!!! -Quan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |