[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
.. snip ..
> You can move the comments to a kerneldoc section for proper
> documentation.
>
> /**
>  * struct swiotlb_engine - short desc...
>  * @name: Name of the engine...
> etc

<nods>

.. snip ..
> > +	char *end;
>
> Isn't this still global to swiotlb, not specific to the backend impl.?

Yes and no. Without the start/end, "is_swiotlb_buffer" would not be able
to determine whether the passed-in address is within the SWIOTLB buffer.

> > +	/*
> > +	 * The number of IO TLB blocks (in groups of 64) between start and
> > +	 * end. This is command-line adjustable via setup_io_tlb_npages.
> > +	 */
> > +	unsigned long nslabs;
>
> Same here.

That one can be put back (make it part of lib/swiotlb.c).

> > +
> > +	/*
> > +	 * When the IOMMU overflows we return a fallback buffer.
> > +	 * This sets the size.
> > +	 */
> > +	unsigned long overflow;
> > +
> > +	void *overflow_buffer;
>
> And here.

Ditto.

.. snip ..
> > +	 * Is the DMA (Bus) address within our bounce buffer (start and end).
> > +	 */
> > +	int (*is_swiotlb_buffer)(struct swiotlb_engine *, dma_addr_t dev_addr,
> > +				 phys_addr_t phys);
> > +
>
> Why is this implementation specific?

In the current implementation, both backends use the physical address
and do a simple check:

	return paddr >= virt_to_phys(io_tlb_start) &&
	       paddr < virt_to_phys(io_tlb_end);

That would work for virtualized environments where a PCI device is
passed in, too. Unfortunately the problem arises when we provide a
channel of communication with another domain and end up doing DMA on
behalf of another guest.

The short description of the problem: a page of memory is shared with
another domain. While the page is shared, the mapping in our domain is
correct in one direction (bus->physical), but the other direction
(virt->physical->bus) is incorrect for the duration of the sharing.
Hence we need to verify that the page is local to our domain, and for
that we need the bus address, so that we can verify that

	addr == physical->bus(bus->physical(addr))

where addr is the bus address (dma_addr_t).
If it is not local (shared with another domain) we MUST NOT consider it
a SWIOTLB buffer, as that can lead to panics and possible corruption.
The trick here is that the phys->virt address can fall within the
SWIOTLB buffer for pages that are shared with another domain, so we need
the DMA address to do an extra check.

The long description of the problem:

You are the domain doing some DMA on behalf of another domain. The
simple example is that you are servicing a block device for the other
guests. One way to implement this is to present a one-page ring buffer
where both domains move the producer and consumer indexes around. Once
you get a request (READ/WRITE), you use the virtualized channels to
"share" that page into your domain. For this you have a buffer (2MB or
bigger) wherein, for pages that are shared in to you, you over-write the
phys->bus mapping. That means the phys->bus translation is altered for
the duration of this request being outstanding. Once it is completed,
the phys->bus translation is restored.

Here is a little diagram of what happens when a page is shared (and
let's assume a situation where virt #1 == virt #2, which means that
phys #1 == phys #2):

  (domain 2) virt#1 -> phys#1 ---\
                                  +-> bus #1
  (domain 3) virt#2 -> phys#2 ---/

(phys#1 points to bus #1, and phys#2 points to bus #1 too).

The converse of the above picture is not true:

           /---> phys #1 -> virt #1  (domain 2)
  bus#1 --+
           \---> phys #2 -> virt #2  (domain 3)

phys #1 != phys #2, and hence virt #1 != virt #2.

When a page is not shared:

  (domain 2) virt #1 -> phys #1 --> bus #1
  (domain 3) virt #2 -> phys #2 --> bus #2

  bus #1 -> phys #1 -> virt #1  (domain 2)
  bus #2 -> phys #2 -> virt #2  (domain 3)

Here bus #1 != bus #2, but phys #1 could be the same as phys #2 (since
these are just PFNs), and virt #1 == virt #2. The reason is that each
domain has its own memory layout where the memory starts at pfn 0, not
at some higher number.
So each domain sees the physical addresses identically, but the bus
addresses MUST point to different areas (except when sharing), otherwise
one domain would over-write another domain, ouch. Furthermore, when a
domain is allocated, the pages for the domain are not guaranteed to be
linearly contiguous, so we can't guarantee that phys == bus.

So to guard against the situation in which phys #1 -> virt comes out
with an address that looks to be within our SWIOTLB buffer, we need to
do the extra check:

	addr == physical->bus(bus->physical(addr))

where addr is the bus address. For scenarios where this does not hold
(the page belongs to another domain), that page is not in the SWIOTLB
(even though the virtual and physical addresses point into it).

> > +	/*
> > +	 * Is the DMA (Bus) address reachable by the PCI device?
> > +	 */
> > +	bool (*dma_capable)(struct device *, dma_addr_t, phys_addr_t, size_t);

I mentioned in the previous explanation that when a domain is allocated,
the pages are not guaranteed to be linearly contiguous. For bare metal
that is not a concern, and 'dma_capable' just checks the device DMA mask
against the bus address. For a virtualized environment we do need to
check that the pages are linearly contiguous for the size of the
request. For that we need the physical address, so we can iterate over
the pages, doing the phys->bus#1 translation and checking that the
(phys+1)->bus#2 translation satisfies bus#2 == bus#1 + 1.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel