[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections

To: Kevin Brodsky <kevin.brodsky@xxxxxxx>
From: Alexander Gordeev <agordeev@xxxxxxxxxxxxx>
Date: Tue, 9 Sep 2025 16:38:46 +0200
Cc: David Hildenbrand <david@xxxxxxxxxx>, linux-mm@xxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, Andreas Larsson <andreas@xxxxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, Borislav Petkov <bp@xxxxxxxxx>, Catalin Marinas <catalin.marinas@xxxxxxx>, Christophe Leroy <christophe.leroy@xxxxxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>, "David S. Miller" <davem@xxxxxxxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Jann Horn <jannh@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, "Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx>, Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>, Madhavan Srinivasan <maddy@xxxxxxxxxxxxx>, Michael Ellerman <mpe@xxxxxxxxxxxxxx>, Michal Hocko <mhocko@xxxxxxxx>, Mike Rapoport <rppt@xxxxxxxxxx>, Nicholas Piggin <npiggin@xxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>, Ryan Roberts <ryan.roberts@xxxxxxx>, Suren Baghdasaryan <surenb@xxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Vlastimil Babka <vbabka@xxxxxxx>, Will Deacon <will@xxxxxxxxxx>, Yeoreum Yun <yeoreum.yun@xxxxxxx>, linux-arm-kernel@xxxxxxxxxxxxxxxxxxx, linuxppc-dev@xxxxxxxxxxxxxxxx, sparclinux@xxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx
Delivery-date: Tue, 09 Sep 2025 14:39:57 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Sep 09, 2025 at 03:49:46PM +0200, Kevin Brodsky wrote:
> On 09/09/2025 13:54, David Hildenbrand wrote:
> > On 09.09.25 13:45, Alexander Gordeev wrote:
> >> On Tue, Sep 09, 2025 at 12:09:48PM +0200, David Hildenbrand wrote:
> >>> On 09.09.25 11:40, Alexander Gordeev wrote:
> >>>> On Tue, Sep 09, 2025 at 11:07:36AM +0200, David Hildenbrand wrote:
> >>>>> On 08.09.25 09:39, Kevin Brodsky wrote:
> >>>>>> arch_{enter,leave}_lazy_mmu_mode() currently have a stateless API
> >>>>>> (taking and returning no value). This is proving problematic in
> >>>>>> situations where leave() needs to restore some context back to its
> >>>>>> original state (before enter() was called). In particular, this
> >>>>>> makes it difficult to support the nesting of lazy_mmu sections -
> >>>>>> leave() does not know whether the matching enter() call occurred
> >>>>>> while lazy_mmu was already enabled, and whether to disable it or
> >>>>>> not.
> >>>>>>
> >>>>>> This patch gives all architectures the chance to store local state
> >>>>>> while inside a lazy_mmu section by making enter() return some value,
> >>>>>> storing it in a local variable, and having leave() take that value.
> >>>>>> That value is typed lazy_mmu_state_t - each architecture defining
> >>>>>> __HAVE_ARCH_ENTER_LAZY_MMU_MODE is free to define it as it sees fit.
> >>>>>> For now we define it as int everywhere, which is sufficient to
> >>>>>> support nesting.
> >>>> ...
> >>>>>> {
> >>>>>> + lazy_mmu_state_t lazy_mmu_state;
> >>>>>> ...
> >>>>>> - arch_enter_lazy_mmu_mode();
> >>>>>> + lazy_mmu_state = arch_enter_lazy_mmu_mode();
> >>>>>> ...
> >>>>>> - arch_leave_lazy_mmu_mode();
> >>>>>> + arch_leave_lazy_mmu_mode(lazy_mmu_state);
> >>>>>> ...
> >>>>>> }
> >>>>>>
> >>>>>> * In a few cases (e.g. xen_flush_lazy_mmu()), a function knows that
> >>>>>>      lazy_mmu is already enabled, and it temporarily disables it by
> >>>>>>      calling leave() and then enter() again. Here we want to ensure
> >>>>>>      that any operation between the leave() and enter() calls is
> >>>>>>      completed immediately; for that reason we pass
> >>>>>> LAZY_MMU_DEFAULT to
> >>>>>>      leave() to fully disable lazy_mmu. enter() will then
> >>>>>> re-enable it
> >>>>>>      - this achieves the expected behaviour, whether nesting
> >>>>>> occurred
> >>>>>>      before that function was called or not.
> >>>>>>
> >>>>>> Note: it is difficult to provide a default definition of
> >>>>>> lazy_mmu_state_t for architectures implementing lazy_mmu, because
> >>>>>> that definition would need to be available in
> >>>>>> arch/x86/include/asm/paravirt_types.h and adding a new generic
> >>>>>>     #include there is very tricky due to the existing header soup.
> >>>>>
> >>>>> Yeah, I was wondering about exactly that.
> >>>>>
> >>>>> In particular because LAZY_MMU_DEFAULT etc resides somewehere
> >>>>> compeltely
> >>>>> different.
> >>>>>
> >>>>> Which raises the question: is using a new type really of any
> >>>>> benefit here?
> >>>>>
> >>>>> Can't we just use an "enum lazy_mmu_state" and call it a day?
> >>>>
> >>>> I could envision something completely different for this type on s390,
> >>>> e.g. a pointer to a per-cpu structure. So I would really ask to stick
> >>>> with the current approach.
> 
> This is indeed the motivation - let every arch do whatever it sees fit.
> lazy_mmu_state_t is basically an opaque type as far as generic code is
> concerned, which also means that this API change is the first and last
> one we need (famous last words, I know). 
> 
> I mentioned in the cover letter that the pkeys-based page table
> protection series [1] would have an immediate use for lazy_mmu_state_t.
> In that proposal, any helper writing to pgtables needs to modify the
> pkey register and then restore it. To reduce the overhead, lazy_mmu is
> used to set the pkey register only once in enter(), and then restore it
> in leave() [2]. This currently relies on storing the original pkey
> register value in thread_struct, which is suboptimal and most
> importantly doesn't work if lazy_mmu sections nest. With this series, we
> could instead store the pkey register value in lazy_mmu_state_t
> (enlarging it to 64 bits or more).
> 
> I also considered going further and making lazy_mmu_state_t a pointer as
> Alexander suggested - more complex to manage, but also a lot more flexible.
> 
> >>> Would that integrate well with LAZY_MMU_DEFAULT etc?
> >>
> >> Hmm... I though the idea is to use LAZY_MMU_* by architectures that
> >> want to use it - at least that is how I read the description above.
> >>
> >> It is only kasan_populate|depopulate_vmalloc_pte() in generic code
> >> that do not follow this pattern, and it looks as a problem to me.
> 
> This discussion also made me realise that this is problematic, as the
> LAZY_MMU_{DEFAULT,NESTED} macros were meant only for architectures'
> convenience, not for generic code (where lazy_mmu_state_t should ideally
> be an opaque type as mentioned above). It almost feels like the kasan
> case deserves a different API, because this is not how enter() and
> leave() are meant to be used. This would mean quite a bit of churn
> though, so maybe just introduce another arch-defined value to pass to
> leave() for such a situation - for instance,
> arch_leave_lazy_mmu_mode(LAZY_MMU_FLUSH)?

What about to adjust the semantics of apply_to_page_range() instead?

It currently assumes any caller is fine with apply_to_pte_range() to
enter the lazy mode. By contrast, kasan_(de)populate_vmalloc_pte() are
not fine at all and must leave the lazy mode. That literally suggests
the original assumption is incorrect.

We could change int apply_to_pte_range(..., bool create, ...) to e.g.
apply_to_pte_range(..., unsigned int flags, ...) and introduce a flag
that simply skips entering the lazy mmu mode.

Thanks!

Follow-Ups:
- Re: [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
  - From: Kevin Brodsky

References:
- [PATCH v2 0/7] Nesting support for lazy MMU mode
  - From: Kevin Brodsky
- [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
  - From: Kevin Brodsky
- Re: [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
  - From: David Hildenbrand
- Re: [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
  - From: Alexander Gordeev
- Re: [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
  - From: David Hildenbrand
- Re: [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
  - From: Alexander Gordeev
- Re: [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
  - From: David Hildenbrand
- Re: [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
  - From: Kevin Brodsky

Prev by Date: Re: [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
Next by Date: [PATCH v5 1/2] efi: Add a function to check if Secure Boot mode is enabled
Previous by thread: Re: [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
Next by thread: Re: [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.