Re: [Xen-devel] swiotlb=force in Konrad's xen-pcifront-0.8.2 pvops domU kernel with PCI passthrough
On Fri, Nov 12, 2010 at 2:33 PM, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
>>> That sounds like the tachyon device is updating the wrong memory location.
>>> How are you programming the memory location where the tachyon device is
>>> supposed to touch? Are you using the value from pci_map_page or are you
>>> using virt_to_phys? The virt_to_phys should be different from the
>>> pci_map_page.. unless you allocated a coherent DMA pool using
>>> pci_alloc_coherent in which case the virt_to_phys() values for that pool
>>> should be the right MFNs.
>>
>> Our driver uses pci_map_single to get the physical addr to program the chip.
>
> OK. Good.
>>
>> One way you can figure this out is doing something like this to make sure you
>> got the right MFN:
>>
>> add these two:
>> #include <xen/page.h>
>> #include <asm/xen/page.h>
>>
>>   phys_addr_t phys = page_to_phys(mem->pages[i]);
>> + if (xen_pv_domain()) {
>> +     phys_addr_t xen_phys = PFN_PHYS(pfn_to_mfn(
>> +                     page_to_pfn(mem->pages[i])));
>> +     if (phys != xen_phys) {
>> +         printk(KERN_ERR "Fixing up: (0x%lx->0x%lx)." \
>> +                " CODE UNTESTED!\n",
>> +                (unsigned long)phys,
>> +                (unsigned long)xen_phys);
>> +         WARN_ON_ONCE(phys != xen_phys);
>> +         phys = xen_phys;
>> +     }
>> + }
>> and using the 'phys' value from now on.
>>
>> If this sounds like black magic, here is a short writeup
>> http://wiki.xensource.com/xenwiki/XenPVOPSDRM
>>
>> look at "Why those patches" section.
>>
>> Lastly, are you using unsigned long or the phys_addr_t typedefs?
>>
>> The driver uses dma_addr_t for physical address.
>
> Excellent.
>>
>> The more I think about your problem the more it sounds like a truncating
>> issue. You said that it works just right (albeit slow) if you use
>> 'swiotlb=force'. The slowness could be due to not using the pci_sync_* APIs
>> to sync the DMA buffers.. But regardless, using bounce buffers will slow
>> the DMA operations down.
>>
>> The driver does use pci_dma_sync_single_for_cpu or
>> pci_dma_sync_single_for_device to sync the DMA buffers. Without these syncs,
>> the driver would not work at all.
>
> <nods> That makes sense.
>>
>> Using the bounce buffers limits the DMA operations to under 32-bit. So could
>> it be that you are using some casting macro that casts a PFN to unsigned
>> long or vice-versa and we end up truncating it to 32-bit? (I've seen this
>> issue actually with InfiniBand drivers back in RHEL5 days..). Lastly, do you
>> set your DMA mask on the device to 32BIT?
>>
>> The tachyon chip supports both 32-bit & 45-bit dma. Some features need to
>> set 32-bit physical addr to chip. Others need to set 45-bit physical addr to
>> chip.
>
> Oh boy. That complicates it.
>
>> The driver doesn't set DMA mask on the device to 32 bit.
>
> Is it set then to 45bit?
>

We were not explicitly setting the DMA mask. pci_alloc_coherent was always
returning 32-bit addresses, but pci_map_single was returning a 34-bit address,
which we truncated by casting it to a uint32_t since the Tachyon's HBA register
is only 32 bits wide. With swiotlb=force, both returned 32-bit addresses
without explicitly setting the DMA mask.

Once we set the mask to 32 bits using pci_set_dma_mask, the NMIs stopped.
However, with iommu=soft (and no more swiotlb=force), we're still stuck with
the abysmal I/O performance (same as when we had swiotlb=force).

In pvops domU (xen-pcifront-0.8.2), what does iommu=soft do? What's the
default if we don't specify it? Without it, we get no I/Os (it seems the
interrupts and/or DMA don't work).

Are there any profiling tools you can suggest for domU? I was able to apply
Dulloor's xenoprofile patch to our dom0 kernel (2.6.32.25-pvops) but not to
xen-pcifront-0.8.2.

- Dante

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
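To illustrate the truncation described in the message above, here is a minimal user-space C sketch. It is not the Tachyon driver's code and does not call any kernel API. The helper names (dma_bit_mask, addr_fits_mask, truncate_to_reg32) are hypothetical, and the 34-bit address used in the usage note is made up. dma_bit_mask is modeled on the kernel's DMA_BIT_MASK(n) macro.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper modeled on the kernel's DMA_BIT_MASK(n):
 * returns a mask covering the low `bits` address bits. */
static uint64_t dma_bit_mask(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/* Returns 1 if every set bit of bus_addr lies inside mask, else 0. */
static int addr_fits_mask(uint64_t bus_addr, uint64_t mask)
{
    return (bus_addr & ~mask) == 0;
}

/* What casting a wider bus address to a 32-bit register value does:
 * the upper bits are silently dropped. */
static uint32_t truncate_to_reg32(uint64_t bus_addr)
{
    return (uint32_t)bus_addr;
}
```

For a made-up 34-bit address such as 0x2ABCD1000, addr_fits_mask() fails against dma_bit_mask(32) but passes against dma_bit_mask(45), and truncate_to_reg32() yields 0xABCD1000, which points at a different page. That is consistent with the device writing to the wrong memory until a 32-bit mask was set, after which the mapping call would be expected to return only addresses that survive the cast.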