[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
* Konrad Rzeszutek Wilk (konrad.wilk@xxxxxxxxxx) wrote:
> > > +	 * Is the DMA (Bus) address within our bounce buffer (start and end).
> > > +	 */
> > > +	int (*is_swiotlb_buffer)(struct swiotlb_engine *, dma_addr_t dev_addr,
> > > +				 phys_addr_t phys);
> > > +
> >
> > Why is this implementation specific?
>
> In the current implementation, they both use the physical address and
> do a simple check:
>
> 	return paddr >= virt_to_phys(io_tlb_start) &&
> 	       paddr < virt_to_phys(io_tlb_end);
>
> That would also work for virtualized environments where a PCI device
> is passed in.
>
> Unfortunately, the problem arises when we provide a channel of
> communication with another domain and end up doing DMA on behalf of
> another guest.
>
> The short description of the problem: a page of memory is shared with
> another domain, and while the mapping in our domain is correct in one
> direction (bus->physical), the other direction (virt->physical->bus)
> is incorrect for the duration of the page being shared. Hence we need
> to verify that the page is local to our domain, and for that we need
> the bus address, so that we can check that
>
> 	addr == physical->bus(bus->physical(addr))
>
> where addr is the bus address (dma_addr_t). If the page is not local
> (it is shared with another domain) we MUST NOT consider it a SWIOTLB
> buffer, as that can lead to panics and possible corruption. The trick
> here is that the phys->virt address can fall within the SWIOTLB buffer
> for pages that are shared with another domain, so we need the DMA
> address to do an extra check.
>
> The long description of the problem:
>
> You are the domain doing some DMA on behalf of another domain. The
> simple example is that you are servicing a block device for the other
> guests. One way to implement this is to present a one-page ring buffer
> in which both domains move the producer and consumer indexes around.
> Once you get a request (READ/WRITE), you use the virtualized channels
> to "share" that page into your domain. For this you have a buffer (2MB
> or bigger) wherein, for pages that are shared in to you, you over-write
> the phys->bus mapping. That means the phys->bus translation is altered
> for the duration of the request being outstanding. Once it is
> completed, the phys->bus translation is restored.
>
> Here is a little diagram of what happens when a page is shared (and
> let's assume a situation where virt #1 == virt #2, which means that
> phys #1 == phys #2):
>
>   (domain 2) virt #1 -> phys #1 ---\
>                                     +--> bus #1
>   (domain 3) virt #2 -> phys #2 ---/
>
> (phys #1 points to bus #1, and phys #2 points to bus #1 too).
>
> The converse of the above picture is not true:
>
>          /--> phys #1 -> virt #1 (domain 2)
>   bus #1 +
>          \--> phys #2 -> virt #2 (domain 3)
>
> Here phys #1 != phys #2, and hence virt #1 != virt #2.
>
> When a page is not shared:
>
>   (domain 2) virt #1 -> phys #1 --> bus #1
>   (domain 3) virt #2 -> phys #2 --> bus #2
>
>   bus #1 -> phys #1 -> virt #1 (domain 2)
>   bus #2 -> phys #2 -> virt #2 (domain 3)
>
> Now bus #1 != bus #2, but phys #1 could be the same as phys #2 (they
> are just PFNs), and virt #1 == virt #2.
>
> The reason for all this is that each domain has its own memory layout,
> where memory starts at pfn 0, not at some higher number. So each
> domain sees its physical addresses identically, but the bus addresses
> MUST point to different areas (except when sharing); otherwise one
> domain would over-write another domain. Ouch.
>
> Furthermore, when a domain is allocated, its pages are not guaranteed
> to be linearly contiguous, so we can't guarantee that phys == bus.
>
> So to guard against the situation in which phys #1 -> virt comes out
> with an address that looks to be within our SWIOTLB buffer, we need to
> do the extra check
>
> 	addr == physical->bus(bus->physical(addr))
>
> where addr is the bus address. For scenarios where this does not hold
> (the page belongs to another domain), that page is not in the SWIOTLB,
> even though the virtual and physical addresses point into it.
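As a concrete illustration, the round-trip check described above might look
roughly like the sketch below. It assumes the Xen-style pfn_to_mfn() /
mfn_to_pfn() helpers (asm/xen/page.h) for the phys->bus and bus->phys
translations; the function name and the io_tlb_start/io_tlb_end range check
mirror the wording above and are illustrative, not the actual patch code.

	/*
	 * Sketch only: a bus address belongs to our bounce buffer iff it
	 * survives the bus->phys->bus round trip (i.e. it is a local page,
	 * not one shared in from another domain) AND its physical address
	 * falls inside the bounce-buffer range.
	 */
	static int xen_is_swiotlb_buffer(struct swiotlb_engine *engine,
					 dma_addr_t dev_addr, phys_addr_t paddr)
	{
		unsigned long mfn = PFN_DOWN(dev_addr);

		/* addr == physical->bus(bus->physical(addr))? */
		if (pfn_to_mfn(mfn_to_pfn(mfn)) != mfn)
			return 0;	/* foreign page: never ours */

		return paddr >= virt_to_phys(io_tlb_start) &&
		       paddr < virt_to_phys(io_tlb_end);
	}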
> > > +	/*
> > > +	 * Is the DMA (Bus) address reachable by the PCI device?
> > > +	 */
> > > +	bool (*dma_capable)(struct device *, dma_addr_t, phys_addr_t, size_t);
>
> I mentioned in the previous explanation that when a domain is
> allocated, its pages are not guaranteed to be linearly contiguous.
>
> For bare-metal that is not a concern, and 'dma_capable' just checks
> the device's DMA mask against the bus address.
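For reference, that bare-metal check is essentially the generic helper of
the era, something along these lines (a sketch, not quoted from the patch):

	/* Sketch: a bus range is usable iff it fits under the DMA mask. */
	static inline bool dma_capable(struct device *dev, dma_addr_t addr,
				       size_t size)
	{
		return addr + size <= *dev->dma_mask;
	}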
> For virtualized environments we do need to check that the pages are
> linearly contiguous for the size of the request.
>
> For that we need the physical address, so we can iterate over the
> pages, doing the phys -> bus#1 translation and checking that
> (phys+1) -> bus#2 satisfies bus#2 == bus#1 + 1.
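That walk might look roughly like the following sketch (again assuming the
Xen pfn_to_mfn() phys->bus translation plus PFN_DOWN/PFN_UP/offset_in_page
from the usual kernel headers; the function names are illustrative):

	/*
	 * Sketch only: the range is DMA-able if it fits under the device's
	 * mask and every successive pseudo-physical page maps to the next
	 * machine page, i.e. (phys + 1) -> bus equals (phys -> bus) + 1.
	 */
	static bool xen_dma_capable(struct device *dev, dma_addr_t dev_addr,
				    phys_addr_t paddr, size_t size)
	{
		unsigned long pfn = PFN_DOWN(paddr);
		unsigned long mfn = pfn_to_mfn(pfn);
		int i, nr_pages = PFN_UP(offset_in_page(paddr) + size);

		if (!dev->dma_mask || dev_addr + size > *dev->dma_mask)
			return false;

		for (i = 1; i < nr_pages; i++)
			if (pfn_to_mfn(++pfn) != ++mfn)
				return false;

		return true;
	}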
Right, for both of those cases I was thinking you could make that the
base logic, and the existing helpers to do addr translation would be
enough. But that makes more sense when compiling for a specific arch
(i.e. the checks would be no-ops and compile away when !xen) as opposed
to a dynamic setup like this.

thanks,
-chris

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel