[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
>On 22.12.2015 at 4:01pm <JBeulich@xxxxxxxx> wrote: > >>> On 22.12.15 at 08:40, <feng.wu@xxxxxxxxx> wrote: > > Maybe, there are still some misunderstanding about your expectation. > > Let me summarize it here. > > > > After Quan's patch-set, there are two types of error code: > > - -EOPNOTSUPP > > Now we only support and use software way to synchronize the > > invalidation, if someone calls queue_invalidate_wait() and passes sw > > with 0, then -EOPNOTSUPP is returned (Though this cannot happen in > > real world, since > > queue_invalidate_wait() is called only in one place and 1 is passed in to > > 'sw'). > > So I am not sure what should we do for this return value, if we really > > get that return value, it means the flush is not actually executed, so > > the iommu state is incorrect, the data is inconsistent. Do you think > > what should we do for this case? > > Since seeing this error would be a software bug, BUG() or ASSERT() are fine to > handle this specific case, if need be. > > > - -ETIMEDOUT > > For this case, Quan has elaborate a lot, IIUIC, the main gap between > > you and Quan is you think the error code should be propagated to the > > up caller, while in Quan's implementation, he deals with this error in > > invalidate_timeout() > > and device_tlb_invalidate_timeout(), hence no need to propagated it to > > up called, since the handling policy will crash the domain, so we > > don't think propagated error code is needed. Even we propagate it, the > > up caller still doesn't need to do anything for it. > > "Handling" an error by e.g. domain_crash() doesn't mean you don't need to also > modify (or at the very least inspect) callers: They may continue doing things > _assuming_ success. Of course you don't need to domain_crash() at all layers. > But errors from lower layers should, at least in most ordinary cases, lead to > higher layers bailing instead of continuing. > For Device-TLB flush error, I think we need to propagated error code. For IEC/iotlb/context flush error, if panic is acceptable, we can ignore the propagated error code. BTW, it is very challenge / tricky to handle all Of error, and some error is unrecoverable. As mentioned, it looks like rewriting Xen hypervisor. For -EOPNOTSUPP, we can print warning message. If it supports interrupt method, we can return 0 in queue_invalidate_wait(). Feng, thanks for your update! -Quan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |