[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Intel-gfx] [PATCH 3/6] drm/i915: use vmap in shmem_pin_map
On 22/09/2020 17:33, Christoph Hellwig wrote: On Tue, Sep 22, 2020 at 05:13:45PM +0100, Tvrtko Ursulin wrote:void *shmem_pin_map(struct file *file) { - const size_t n_pte = shmem_npte(file); - pte_t *stack[32], **ptes, **mem;Chris can comment how much he'd miss the 32 page stack shortcut.I'd like to see a profile that claim that kmalloc matters in a path that does a vmap and reads pages through the page cache. Especially when the kmalloc saves doing another page cache lookup on the free side. Only reason I can come up with now is if mapping side is on a latency sensitive path, while un-mapping is lazy/delayed so can be more costly. Then fast map and extra cost on unmap may make sense. It more applies to the other i915 patch, which implements a much more used API, but whether or not we can demonstrate any difference in the perf profiles I couldn't tell you without trying to collect some. Is there something in vmap() preventing us from freeing the pages array here? I can't spot anything that is holding on to the pointer. Or it was just a sketch before you realized we could walk the vm_area? Also, I may be totally misunderstanding something, but I think you need to assign area->pages manually so shmem_unpin_map can access it below.We need area->pages to hold the pages for the free side. That being said the patch I posted is broken because it never assigned to that. As said it was a sketch. This is the patch I just rebooted into on my Laptop: http://git.infradead.org/users/hch/misc.git/commitdiff/048522dfa26b6667adfb0371ff530dc263abe829 it needs extra prep patches from the series: http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/alloc_vm_areamapping_clear_unevictable(file->f_mapping); - __shmem_unpin_map(file, ptr, shmem_npte(file)); + for (i = 0; i < shmem_npages(file); i++) + put_page(area->pages[i]); + kvfree(area->pages); + vunmap(ptr);Is the verdict from mm experts that we can't use vfree due __free_pages vs put_page differences?Switched to vfree now.Could we get from ptes to pages, so that we don't have to keep the area->pages array allocated for the duration of the pin?We could do vmalloc_to_page, but that is fairly expensive (not as bad as reading from the page cache..). Are you really worried about the allocation? Not so much given how we don't even use shmem_pin_map outside selftests.If we start using it I expect it will be for tiny objects anyway. Only if they end up being pinned for the lifetime of the driver, it may be a pointless waste of memory compared to the downsides of vmalloc_to_page. But we can revisit this particular edge case optimization if the need arises. I'll look at your other i915 patch tomorrow. Regards, Tvrtko
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |