
Re: [Xen-devel] Regression: x86/mm: new _PTE_SWP_SOFT_DIRTY bit conflicts with existing use

On 08/22/2013 01:32 PM, David Vrabel wrote:
> On 22/08/13 00:04, Linus Torvalds wrote:
>> On Wed, Aug 21, 2013 at 12:03 PM, Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
>>> I personally don't see bug here because
>>>  - this swapped page soft dirty bit is set for non-present entries only,
>>>    never for present ones, just at moment we form swap pte entry
>>>  - i don't find any code which would test for this bit directly without
>>>    is_swap_pte call
>> Ok, having gone through the places that use swp_*soft_dirty(), I have
>> to agree. Afaik, it's only ever used on a swap-entry that has (by
>> definition) the P bit clear. So with or without Xen, I don't see how
>> it can make any difference.
>> David/Konrad - did you actually see any issues, or was this just from
>> (mis)reading the code?
> There are no Xen related bugs in the code, we were misreading it.
> It was my call to raise this as a regression without a repro and clearly
> this was the wrong decision.
> However, having looked at the soft dirty implementation and specifically
> the userspace ABI I think that it is far to closely coupled to the
> current implementation.  I think this will constrain future development
> of the feature should userspace require a more efficient ABI than
> scanning all of /proc/<pid>/pagemaps.
> Minimal downtime during 'live' checkpointing of a running task needs the
> checkpointer to find and write out dirty pages faster than the task can
> dirty them.

Absolutely, but in "find and write" the "write" component is likely to take
the majority of the time -- we can scan the PTEs of a mapping MUCH faster than
we can transmit those pages over even a 10Gbit link.
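For context, the scan itself is just sequential reads of /proc/<pid>/pagemap: one 64-bit entry per page, with soft-dirty reported in bit 55, and the bits reset by writing "4" to /proc/<pid>/clear_refs. A minimal sketch of such a scan (function names are mine, not CRIU's; assumes a kernel built with CONFIG_MEM_SOFT_DIRTY):

```python
import struct

PAGE_SIZE = 4096
PM_ENTRY = 8  # bytes per pagemap entry

# Bit layout of a pagemap entry (Documentation/vm/pagemap.txt):
#   bit 63 - page present
#   bit 62 - page swapped
#   bit 55 - soft-dirty
PM_PRESENT = 1 << 63
PM_SWAP = 1 << 62
PM_SOFT_DIRTY = 1 << 55

def is_soft_dirty(entry):
    """Decode one 64-bit pagemap entry."""
    return bool(entry & PM_SOFT_DIRTY)

def scan_soft_dirty(pid, start, end):
    """Return addresses of soft-dirty pages in [start, end) of a task's VM."""
    dirty = []
    with open("/proc/%d/pagemap" % pid, "rb") as pm:
        pm.seek(start // PAGE_SIZE * PM_ENTRY)
        for i in range((end - start) // PAGE_SIZE):
            (entry,) = struct.unpack("<Q", pm.read(PM_ENTRY))
            if is_soft_dirty(entry):
                dirty.append(start + i * PAGE_SIZE)
    return dirty

def clear_soft_dirty(pid):
    """Reset all soft-dirty bits, starting a new tracking interval."""
    with open("/proc/%d/clear_refs" % pid, "w") as f:
        f.write("4")
```

On each migration iteration the checkpointer calls clear_soft_dirty(), lets the task run, then scan_soft_dirty() tells it exactly which pages to retransmit.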

We actually see this IRL -- in CRIU there's an atomic test that checks that
mappings get dumped and restored properly. One of its sub-tests uses a single
512Mb mapping. With it, the total dump time _minus_ the memory dump time
(which includes not only the maps/pagemap scan, but also files, registers,
the process tree, sessions, etc.) is a fraction of one second, while the
memory dump part alone takes several seconds.

That said, a super-fast API for getting "what has changed" is less tempting
to have than a faster network/disk.

What is _more_ time consuming in iterative migration in our case is the need
to re-scan the whole /proc tree to find which processes have died or appeared,
mess with /proc/pid/fd to find out which files were (re-)opened/closed/changed,
talk to the sock_diag subsystem for socket information, and the like. We
haven't done a careful analysis of what the slowest part is, but the pagemap
scan is unlikely to be it.

> This seems less likely to be possible if every iteration
> all PTEs have to be scanned by the checkpointer instead of (e.g.,)
> accessing a separate list of dirtied pages.

But we don't scan the PTEs of the whole x86-64 virtual address space; instead,
we first analyze /proc/pid/maps and scan only the PTEs sitting in private
mappings.
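That pre-filtering is just a parse of /proc/<pid>/maps -- the fourth permission character is 'p' for private mappings and 's' for shared ones. A hypothetical helper (names are mine, not CRIU's) that keeps only the private ranges:

```python
def private_mappings(maps_text):
    """Parse /proc/<pid>/maps content; return (start, end) of private mappings.

    Each line looks like:
      00400000-0040c000 r-xp 00000000 08:01 1234  /bin/cat
    where the 4th perm character is 'p' (MAP_PRIVATE) or 's' (MAP_SHARED).
    """
    out = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        addrs, perms = fields[0], fields[1]
        if perms[3] != 'p':
            continue  # skip shared mappings entirely
        start, end = (int(x, 16) for x in addrs.split('-'))
        out.append((start, end))
    return out
```

Only the returned ranges are then fed to the pagemap scan, so the cost is proportional to the task's mapped memory, not the whole address space.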

> David
> .


Xen-devel mailing list


