Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
>On 12.12.2015 at 9:22pm, <quan.xu@xxxxxxxxx> wrote:
> These patches are based on Kevin Tian's previous discussion 'Revisit
> VT-d asynchronous flush issue'.
> Fix current timeout concern and also allow limited ATS support in a light way:
> 2. Fix vt-d flush timeout issue.
>
>    If an IOTLB/Context/IEC flush times out, we should assume that all
>    devices under this IOMMU cannot function correctly.
>    So for each device under this IOMMU we'll mark it as unassignable
>    and kill the domain owning the device.
>

Hi,

After some research and investigation: when an IEC/IOTLB/Context flush
fails (i.e. VT-d is buggy), IMO a panic is unavoidable. Here are some reasons:

1. Below is the general platform topology, as illustrated in the VT-d spec.
   VT-d is a key component of the platform infrastructure in virtualization
   usage, providing DMA/Intr remapping capabilities. If such a key component
   is buggy, it can no longer reliably record and report to the VMM the
   DMA/Intr errors that may otherwise corrupt memory or impact VM isolation.

        Processor  ...  Processor
        ---------       ---------
                  ^
                  |
             North Bridge
            --------------    <--->  DRAM
          DMA/Intr Remapping
                 ^^^^
                 ||||
                 vvvv
             PCIe Devices

2. If VT-d is buggy, can the hardware_domain keep working properly with
   PCIe devices / DRAM in the presence of DMA remapping errors? I think
   not. Furthermore, I think the VMM can NOT run even a normal HVM domain
   without device passthrough.

3. There are many triggers for IEC/IOTLB/Context flushes, e.g. MSI/EPT...
   updates. They are spread across the VMM source code, so it is a
   challenge to make sure callers actually honor errors and check them all
   the way up the call trees. It looks like rewriting the VMM source code.

4. In more detail, some flush errors are very tricky, e.g. how should we
   deal with an MSI free that hits an IEC flush error: restore it or
   ignore it?

Your comments are welcome; please correct me if I am wrong.

Thanks.
-Quan

>    If a Device-TLB flush times out, we'll mark the target ATS device
>    as unassignable and kill the domain owning
>    this device.
>
>    If the impacted domain is the hardware domain, just throw out a warning.
>    It is an open question here whether we want to kill the
>    hardware domain (or directly panic the hypervisor). Comments are welcome.
>
>    A device marked as unassignable can no longer be
>    assigned to any domain.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
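For illustration only, the policy described in the quoted patch summary could be sketched in C roughly as below. This is not code from the patch series: every type and function name here (struct iommu, handle_flush_timeout, quarantine_device, can_assign) is hypothetical, and the hardware-domain branch assumes the "warning only" option, which the thread explicitly leaves open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified model; none of these names are Xen's. */
enum flush_kind { FLUSH_IOTLB, FLUSH_CONTEXT, FLUSH_IEC, FLUSH_DEVICE_TLB };

struct domain {
    int id;
    bool is_hardware_domain;
    bool killed;
};

struct device {
    const char *name;
    struct domain *owner;   /* NULL if the device is not assigned */
    bool unassignable;
};

struct iommu {
    struct device **devs;   /* devices behind this IOMMU */
    size_t ndevs;
};

/* Mark one device unassignable and kill its owning domain.
 * Assumption: the hardware domain only gets a warning.
 * Returns true if a domain was newly killed. */
bool quarantine_device(struct device *dev)
{
    struct domain *d = dev->owner;

    dev->unassignable = true;
    if (d == NULL || d->killed)
        return false;
    if (d->is_hardware_domain) {
        fprintf(stderr, "warning: flush timeout affects hardware domain (%s)\n",
                dev->name);
        return false;
    }
    d->killed = true;
    return true;
}

/* Apply the proposed policy after a flush timeout.
 * Device-TLB timeout: only the target ATS device is affected.
 * IOTLB/Context/IEC timeout: every device under the IOMMU is affected.
 * Returns the number of domains newly killed. */
int handle_flush_timeout(struct iommu *iommu, enum flush_kind kind,
                         struct device *target)
{
    int killed = 0;

    assert(iommu != NULL);
    if (kind == FLUSH_DEVICE_TLB) {
        assert(target != NULL);
        return quarantine_device(target) ? 1 : 0;
    }
    for (size_t i = 0; i < iommu->ndevs; i++)
        killed += quarantine_device(iommu->devs[i]) ? 1 : 0;
    return killed;
}

/* A device marked unassignable must be refused for any later assignment. */
bool can_assign(const struct device *dev)
{
    return !dev->unassignable;
}
```

In this sketch, handle_flush_timeout(&iommu, FLUSH_IOTLB, NULL) quarantines every device under that IOMMU, while FLUSH_DEVICE_TLB with a target touches only that one ATS device. Quan's reply argues for a different choice on IEC/IOTLB/Context flush errors, namely panicking the hypervisor, which would replace the loop above.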