Re: [Xen-devel] VT-d async invalidation for Device-TLB.
>>> On 03.06.15 at 09:49, <quan.xu@xxxxxxxxx> wrote:
> Design Overview
> =============
> This design implements a non-spinning model for Device-TLB invalidation,
> using an interrupt based mechanism. Each domain maintains an invalidation
> table, and the hypervisor has an entry of invalidation tables. The
> invalidation table keeps the count of in-flight Device-TLB invalidation
> queues, and also provides the same polling parameter for multiple
> in-flight Device-TLB invalidation queues of each domain.
Which "same polling parameter"? I.e. I'm not sure what this is about
in the first place.
> When a domain issues a request to the Device-TLB invalidation queue, we
> update the invalidation table's count of in-flight Device-TLB invalidation
> queues and assign the Status Data of the wait descriptor of the
> invalidation queue. An interrupt is sent to the hypervisor once a
> Device-TLB invalidation request is done. In the interrupt handler, we
> schedule a soft-irq to do the following check:
>
>     if invalidation table's count of in-flight Device-TLB invalidation
>        queues == polling parameter:
>         This domain has no in-flight invalidation requests.
>     else:
>         This domain has in-flight invalidation requests.
>
> The domain is put into the "blocked" state if it has in-flight Device-TLB
> invalidation requests, and woken up when all the requests are done. A
> fault event will be generated if an invalidation fails. We can either
> crash the domain or crash Xen.
Crashing Xen can't really be considered an option except when you
can't contain the failed invalidation to a particular VM (which, from
what was written above, should never happen).
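As an aside, to make sure I'm reading the check above correctly, here is a
minimal sketch of what I understand it to be (all identifiers are mine, not
taken from your design):

    struct dtlb_invl_table {
        uint32_t inflight_cnt; /* in-flight Device-TLB invalidations */
        uint32_t poll_param;   /* the "polling parameter" above */
    };

    /* true <=> the domain has no in-flight invalidation requests */
    static bool domain_invl_done(const struct dtlb_invl_table *t)
    {
        return t->inflight_cnt == t->poll_param;
    }

If that's not it, then the overview needs quite a bit more detail.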
> For Context Invalidation and IOTLB invalidation without Device-TLB
> invalidation, the Invalidation Queue is flushed synchronously as before
> (this is a tradeoff, as the interrupt itself is overhead).
DMAR_OPERATION_TIMEOUT being 1s, are you saying that you're
not intending to replace the current spinning for the non-ATS case?
Considering that these loops panic() when they time out, I would
expect these to become asynchronous _and_ contained to the
affected VM alongside the ATS-induced change in behavior. You
talk of overhead - can you quantify it?
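To be clear about which spinning I mean - today's flush paths effectively
do the following (a simplified sketch of the existing wait loop, not the
literal code; qinval_slot_done() is a made-up accessor):

    s_time_t start = NOW();

    /* Spin on the status slot written by the wait descriptor. */
    while ( !qinval_slot_done(slot) )
    {
        if ( NOW() > start + DMAR_OPERATION_TIMEOUT )
            panic("queue invalidate wait descriptor timed out\n");
        cpu_relax();
    }

I.e. a failure here currently takes down the host, which is exactly what
the async model should avoid for the non-ATS flushes too.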
> More details:
>
> 1. invalidation table. We define an iommu_invl structure per domain.
>
>     struct iommu_invl {
>         volatile u64 iommu_invl_poll_slot:62;
>         domid_t dom_id;
>         u64 iommu_invl_status_data:32;
>     } __attribute__ ((aligned (64)));
>
> iommu_invl_poll_slot: Set it equal to the status address of the wait
> descriptor when the invalidation queue is used with Device-TLB
> invalidation.
> dom_id: Keep the id of the domain.
> iommu_invl_status_data: Keep the count of in-flight queued Device-TLB
> invalidations.
Without further explanation above/below I don't think I really
understand the purpose of this structure, nor its organization: Is
this something imposed by the VT-d specification? If so, a reference
to the respective section in the spec would be useful. If not, I can't
see why the structure is laid out the (odd) way it is.
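If it is not imposed by the spec, I'd expect something without the odd
bitfield widths, e.g. (just a sketch of a more natural layout carrying the
same three fields):

    struct iommu_invl {
        u64 poll_slot;      /* status address of the wait descriptor */
        u32 status_data;    /* count of in-flight Device-TLB flushes */
        domid_t dom_id;     /* owning domain */
    } __attribute__((aligned(64)));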
> 2. Modification to Device IOTLB invalidation:
>     - Enable interrupt notification when hardware completes the
>       invalidations: Set the FN, IF and SW bits in the Invalidation
>       Wait Descriptor. The reason
A good design document would either give a (short) explanation of
these bits, or at the very least a precise reference to where in the
spec they're being defined. The way the VT-d spec is organized I
generally find it quite hard to locate the definition of specific fields
when I have only a vague reference in hand. Yet reading the doc
here shouldn't require the reader to spend meaningful extra amounts
of time hunting down the corresponding pieces of the spec.
> why we also set the SW bit is that the notification interrupt is global,
> not per-domain. So in the interrupt handler we still need to poll the
> status address to know which domain's flush request has completed.
With the above taken care of, I would then hope to also be able to
understand this (kind of an) explanation.
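For reference, my current reading of the Invalidation Wait Descriptor
(see the "Queued Invalidation Interface" part of the spec; the bit
positions below are from there, while the QINVAL_* names are made up):

    #define QINVAL_TYPE_WAIT 5ULL        /* descriptor type 05h */
    #define QINVAL_WAIT_IF   (1ULL << 4) /* interrupt when complete */
    #define QINVAL_WAIT_SW   (1ULL << 5) /* write Status Data to Status Address */
    #define QINVAL_WAIT_FN   (1ULL << 6) /* fence against later descriptors */

    static void make_wait_dsc(uint64_t dsc[2], uint32_t sdata, uint64_t saddr)
    {
        dsc[0] = QINVAL_TYPE_WAIT | QINVAL_WAIT_IF | QINVAL_WAIT_SW |
                 QINVAL_WAIT_FN | ((uint64_t)sdata << 32);
        dsc[1] = saddr & ~3ULL;          /* Status Address, bits 63:2 */
    }

Something along these lines belongs in the document, so the reader
doesn't have to reconstruct it.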
> - A new per-domain flag (iommu_pending_flush) is used to track the flush
> status of IOTLB invalidation with Device-TLB invalidation:
> iommu_pending_flush will be set before flushing the Device-TLB
> invalidation.
What is "flushing an invalidation" supposed to mean? I think there's
some problem with the wording here...
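If what's meant is merely that the flag gets set before submitting the
Device-TLB invalidation and cleared once it completes, then spelling that
out would help, i.e. something like (everything except iommu_pending_flush
is a made-up name):

    /* submission side */
    d->iommu_pending_flush = 1;
    queue_dev_iotlb_inval(d); /* queue the Device-TLB flush descriptor */
    /* ... the domain is then blocked until completion ... */

    /* completion side (in the soft-irq of item 4 below) */
    d->iommu_pending_flush = 0;
    wakeup_domain(d);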
> 4. New interrupt handler for invalidation completion:
>     - When hardware completes the invalidations with Device IOTLB, it
>       generates an interrupt to notify the hypervisor.
>     - In the interrupt handler, we schedule a soft-irq to handle the
>       finished invalidations.
>     - soft-irq to handle the finished invalidations:
>         Scan the pending flush list;
>         for each entry in the list, check the values of
>         iommu_invl_poll_slot and iommu_invl_status_data in that domain's
>         invalidation table;
>         if they indicate completion, clear iommu_pending_flush and the
>         invalidation table, then wake up the domain.
Did you put some consideration into how long this list may get, and
hence how long it may take you to iterate through the entire list?
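Spelled out, the scan appears to be (a sketch; the list hookup and the
accessors are my invention, and the completion check is my guess at what
"check the values" means):

    static void dtlb_flush_softirq(void)
    {
        struct iommu_invl *t, *tmp;

        /* O(n) in the number of domains with pending flushes. */
        list_for_each_entry_safe ( t, tmp, &pending_flush_list, list )
        {
            /* Completed <=> hardware wrote Status Data to the slot. */
            if ( read_status(t->poll_slot) != t->status_data )
                continue;
            list_del(&t->list);
            clear_invl_table(t);
            wakeup_domain(t->dom_id);
        }
    }

With many passthrough domains this runs for quite a while in soft-irq
context - that wants an upper bound.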
Jan