Re: [Xen-devel] RE: Intel GPU pass-through with > 3G
I've done a patch like this, where I set a flag, d->iommu_dont_flush, at the
beginning of my batched function and then make an explicit call to flush the
IOTLB at the end. It's not a very nice way of solving this problem; maybe it
would be better to have a range/batch interface at the p2m_set_entry level.

Jean

On 12 November 2010 14:18, Daniel De Graaf <dgdegra@xxxxxxxxxxxxx> wrote:
> I have also noticed this issue (9ms IOMMU flush), although not during
> domain creation. The path in which I observed it is page remapping when
> using map_grant_ref. I haven't tested a DomU with over 3G of memory,
> however; the delay may also be present in that case on my platform.
>
> I have done some work to try to add an 'order' parameter to iommu_map_page,
> but it isn't stable yet; if this is the only way to get around the slow
> flush, I will look at finishing it.
>
> Would it be possible to add a flag to delay IOMMU flushing until after a
> batch update is finished? A single flush at the end, even if expensive,
> would be faster than 10ms per page on mappings of a significant size. This
> is also likely to be a less intrusive patch.
>
> In case you're interested, my platform is a Dell Optiplex 755, 4G RAM:
>
> # lspci
> 00:00.0 Host bridge: Intel Corporation 82Q35 Express DRAM Controller (rev 02)
> 00:01.0 PCI bridge: Intel Corporation 82Q35 Express PCI Express Root Port (rev 02)
> 00:02.0 VGA compatible controller: Intel Corporation 82Q35 Express Integrated Graphics Controller (rev 02)
> 00:02.1 Display controller: Intel Corporation 82Q35 Express Integrated Graphics Controller (rev 02)
> 00:19.0 Ethernet controller: Intel Corporation 82566DM-2 Gigabit Network Connection (rev 02)
> 00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
> 00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
> 00:1a.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 02)
> 00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 02)
> 00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 02)
> 00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
> 00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
> 00:1d.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 02)
> 00:1d.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
> 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
> 00:1f.0 ISA bridge: Intel Corporation 82801IO (ICH9DO) LPC Interface Controller (rev 02)
> 00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA AHCI Controller (rev 02)
> 00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
>
> --
> Daniel De Graaf
> National Security Agency
>
> On 11/10/2010 07:04 PM, Kay, Allen M wrote:
>> Jean,
>>
>> Do you see any boot time difference between passing through integrated
>> graphics for the very first time and the subsequent times? Which platform
>> are you using?
>>
>> Allen
>>
>> -----Original Message-----
>> From: Jean Guyader [mailto:jean.guyader@xxxxxxxxx]
>> Sent: Wednesday, November 10, 2010 1:50 PM
>> To: xen-devel@xxxxxxxxxxxxxxxxxxx
>> Cc: Kay, Allen M
>> Subject: Intel GPU pass-through with > 3G
>>
>> Hello,
>>
>> I'm passing through a graphics card to a guest that has more than 3G of
>> RAM (4G to be precise in my case).
>>
>> What happens is that VM creation gets stuck, so I put some tracing in
>> the Xen code to see what was taking the time. I discovered that the guest
>> was stuck in hvmloader inside this loop:
>>
>>     while ( (pci_mem_start >> PAGE_SHIFT) < hvm_info->low_mem_pgend )
>>     {
>>         struct xen_add_to_physmap xatp;
>>         if ( hvm_info->high_mem_pgend == 0 )
>>             hvm_info->high_mem_pgend = 1ull << (32 - PAGE_SHIFT);
>>         xatp.domid = DOMID_SELF;
>>         xatp.space = XENMAPSPACE_gmfn;
>>         xatp.idx = --hvm_info->low_mem_pgend;
>>         xatp.gpfn = hvm_info->high_mem_pgend++;
>>         if ( hypercall_memory_op(XENMEM_add_to_physmap, &xatp) != 0 )
>>             BUG();
>>     }
>>
>> This loop relocates the RAM at the top to leave some space for the PCI
>> BARs. It loops over each page, so in my case it's quite a big loop
>> because the GPU has a BAR of 256M.
>>
>> The interesting thing is that the function add_to_physmap takes most of
>> the time. I believe that what takes most of it is the IOMMU IOTLB flush
>> that comes with iommu_map_page or iommu_unmap_page, which are called
>> when we manipulate the p2m table.
>>
>> In my case the IOMMU flush takes a very long time (because of the Intel
>> GPU?), about 10 milliseconds. So if I'm patient enough my domain will
>> start, after about 10 minutes.
>>
>> A way forward would be to create a range interface to iommu_map_page and
>> iommu_unmap_page, since IOMMU flushes are so expensive.
>> Then some work needs to be done to add a range interface to all the
>> functions between add_to_physmap and p2m_set_entry, which would be a big
>> patch. I hope there is another way out of this problem.
>>
>> Thanks,
>> Jean
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
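
The trade-off discussed in this thread (one IOTLB flush per remapped page
versus a single deferred flush at the end of a batch) can be illustrated with
a small toy model. This is only a sketch under stated assumptions, not Xen
code and not the patch mentioned above: struct toy_domain and every toy_*
function are hypothetical names invented for illustration, the flush is
simulated by a counter, pages are assumed to be 4 KiB, and one flush per
relocated page is assumed. Only the flag name iommu_dont_flush is taken from
the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a domain. Hypothetical; NOT Xen's struct domain. */
struct toy_domain {
    bool iommu_dont_flush;  /* when set, map calls skip the flush */
    uint64_t flush_count;   /* number of simulated IOTLB flushes */
    uint64_t mapped_pages;  /* number of pages mapped so far */
};

/* Simulated IOTLB flush: it only counts, no hardware is touched. */
static void toy_iotlb_flush(struct toy_domain *d)
{
    d->flush_count++;
}

/* Simulated single-page map: flushes unless the flag is set. */
static void toy_iommu_map_page(struct toy_domain *d, uint64_t gfn)
{
    (void)gfn;  /* the toy model does not track individual frames */
    d->mapped_pages++;
    if (!d->iommu_dont_flush)
        toy_iotlb_flush(d);
}

/* Remap npages one at a time, flushing after every page. */
static void toy_remap_per_page(struct toy_domain *d, uint64_t first_gfn,
                               uint64_t npages)
{
    for (uint64_t i = 0; i < npages; i++)
        toy_iommu_map_page(d, first_gfn + i);
}

/* Remap npages with flushing deferred, then flush once at the end. */
static void toy_remap_batched(struct toy_domain *d, uint64_t first_gfn,
                              uint64_t npages)
{
    assert(!d->iommu_dont_flush);  /* batches do not nest in this model */
    d->iommu_dont_flush = true;
    for (uint64_t i = 0; i < npages; i++)
        toy_iommu_map_page(d, first_gfn + i);
    d->iommu_dont_flush = false;
    toy_iotlb_flush(d);            /* the single explicit flush */
}

/* Number of pages in a region of the given size, assuming 4 KiB pages. */
static uint64_t toy_pages_in(uint64_t bytes)
{
    return bytes >> 12;
}

/* Total flush time in milliseconds for a given per-flush cost. */
static uint64_t toy_flush_cost_ms(uint64_t flushes, uint64_t ms_per_flush)
{
    return flushes * ms_per_flush;
}
```

Under these assumptions, a 256M region is 65536 pages, so
toy_remap_per_page performs 65536 flushes; at 10 ms each that is 655360 ms,
roughly 11 minutes, which is in line with the "about 10 minutes" reported
above. toy_remap_batched maps the same pages with a single flush.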