Re: [PATCH v4 bpf-next 2/2] mm: Introduce VM_SPARSE kind and vm_area_[un]map_pages().
> > This interface and in general VM_SPARSE would be useful for
> > dynamically grown kernel stacks [1]. However, the might_sleep() here
> > would be a problem. We would need to be able to handle
> > vm_area_map_pages() from interrupt disabled context therefore no
> > sleeping. The caller would need to guarantee that the page tables are
> > pre-allocated before the mapping.
>
> Sounds like we'd need to differentiate two kinds of sparse regions.
> One that is really sparse where page tables are not populated (bpf use case)
> and another where only the pte level might be empty.
> Only the latter one will be usable for such auto-grow stacks.
>
> Months back I played with this idea:
> https://git.kernel.org/pub/scm/linux/kernel/git/ast/bpf.git/commit/?&id=ce63949a879f2f26c1c1834303e6dfbfb79d1fbd
> that
> "Make vmap_pages_range() allocate page tables down to the last (PTE) level."
> Essentially pass NULL instead of 'pages' into vmap_pages_range()
> and it will populate all levels except the last.

Yes, this is what is needed, however, it can be a little simpler with
kernel stacks: given that the first page in the vm_area is mapped when
stack is first allocated, and that the VA range is aligned to 16K, we
actually are guaranteed to have all page table levels down to pte
pre-allocated during that initial mapping. Therefore, we do not need to
worry about allocating them later during PFs.

> Then the page fault handler can service a fault in auto-growing stack
> area if it has a page stashed in some per-cpu free list.
> I suspect this is something you might need for
> "16k stack that is populated on fault",
> plus a free list of 3 pages per-cpu,
> and set_pte_at() in pf handler.

Yes, what you described is exactly what I am working on: using 3-pages
per-cpu to handle kstack page faults. The only thing that is missing is
that I would like to have the ability to call a non-sleeping version of
vm_area_map_pages().

Pasha
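
The two points argued above can be modelled in user space. The sketch below is not kernel code and is not taken from the patch series. It assumes 4 KiB pages and 512 entries per last-level page table (as on x86-64 with 4-level paging), and every sketch_* name is hypothetical. It checks that a 16K-aligned, 16K-sized stack always falls under a single last-level table, and it models a fault path that only pops a page from a pre-filled 3-page per-CPU pool, so it never allocates or sleeps:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions (not from the thread): 4 KiB pages and 512 PTEs per
 * last-level table, as on x86-64 with 4-level paging. */
#define SKETCH_PAGE_SHIFT   12
#define SKETCH_PTRS_PER_PTE 512
#define SKETCH_STACK_SIZE   (16u * 1024u)
#define SKETCH_STACK_PAGES  (SKETCH_STACK_SIZE >> SKETCH_PAGE_SHIFT)
/* One page is mapped at allocation time, so at most 3 can fault later. */
#define SKETCH_POOL_PAGES   (SKETCH_STACK_PAGES - 1)

/* Index of the last-level page table that covers va. */
static uint64_t sketch_pte_table_index(uint64_t va)
{
	return va / ((uint64_t)SKETCH_PTRS_PER_PTE << SKETCH_PAGE_SHIFT);
}

/* True if the first and last byte of a stack starting at base are
 * covered by the same last-level table. */
static bool sketch_stack_shares_pte_table(uint64_t base)
{
	uint64_t last = base + SKETCH_STACK_SIZE - 1;

	return sketch_pte_table_index(base) == sketch_pte_table_index(last);
}

/* Hypothetical per-CPU stash of pages, filled outside the fault path. */
struct sketch_cpu_pool {
	void *pages[SKETCH_POOL_PAGES];
	size_t count;
};

/* Stand-in for the stack's PTE slots; NULL means "not mapped". */
struct sketch_stack {
	void *pte[SKETCH_STACK_PAGES];
};

/* Simulated fault path: take a stashed page and install it in the empty
 * slot. It neither allocates nor blocks; it fails if the slot is already
 * mapped or the pool is empty. */
static bool sketch_handle_stack_fault(struct sketch_stack *stack,
				      struct sketch_cpu_pool *pool,
				      size_t page_idx)
{
	if (page_idx >= SKETCH_STACK_PAGES || stack->pte[page_idx] != NULL)
		return false;
	if (pool->count == 0)
		return false;

	stack->pte[page_idx] = pool->pages[--pool->count];
	return true;
}
```

In this model the pool would be refilled from a context that is allowed to sleep, which is the part a non-sleeping vm_area_map_pages() variant would leave to the caller.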