Re: [Xen-devel] [PATCH 5/5] mm: Don't hold heap lock in alloc_heap_pages() longer than necessary
On 30/08/17 13:59, Boris Ostrovsky wrote:
>>> This patch has been applied to staging, but it's got problems.  The
>>> following crash is rather trivial to provoke:
>>>
>>> ~Andrew
>>>
>>> (d19) Test result: SUCCESS
>>> (XEN) ----[ Xen-4.10-unstable  x86_64  debug=y  Tainted:  H ]----
>>> (XEN) CPU:    5
>>> (XEN) RIP:    e008:[<ffff82d0802252fc>] page_alloc.c#free_heap_pages+0x786/0x7a1
>>> ...
>>> (XEN) Pagetable walk from ffff82ffffffffe4:
>>> (XEN)  L4[0x105] = 00000000abe5b063 ffffffffffffffff
>>> (XEN)  L3[0x1ff] = 0000000000000000 ffffffffffffffff
>> Some negative offset into somewhere, it seems.  Upon second look I
>> think the patch is simply wrong in its current shape: free_heap_pages()
>> looks for page_state_is(..., free) when trying to merge chunks, while
>> alloc_heap_pages() now sets PGC_state_inuse outside of the locked area.
>> I'll revert it right away.
>
> Yes, so we do need to update the page state under the heap lock.  I'll
> then move only the scrubbing (and checking) outside the lock.
>
> I am curious, though: what was the test that triggered this?  I ran
> about 100 parallel reboots under memory pressure and never hit it.

# git clone git://xenbits.xen.org/xtf.git
# cd xtf
# make -j4 -s
# ./xtf-runner -qa

Purposefully, ./xtf-runner doesn't synchronously wait for VMs to be fully
destroyed before starting the next test.  (There is an ~800ms added delay
for synchronously destroying HVM guests, compared to PV, which I expect is
down to an interaction with qemu.  I got sufficiently annoyed that I coded
around the issue.)

As a result, destruction of one domain will still be in progress while
construction of the next one starts.

~Andrew
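A toy model may help make the race concrete for anyone following along.
This is not Xen code: every identifier below (toy_heap_lock, toy_page,
toy_alloc, ...) is invented for illustration, and the real logic lives in
xen/common/page_alloc.c.  The point is only the ordering Boris describes:
the page-state transition has to happen before the heap lock is dropped,
because the free path's merge decision reads that state under the same
lock; the scrubbing, which touches memory nobody else can reach any more,
is safe to run unlocked.

/* Toy model of the locking order discussed above -- not Xen code.
 * All names here are made up; compile with: gcc -pthread toy.c
 */
#include <pthread.h>
#include <string.h>
#include <stdio.h>

enum toy_state { PAGE_FREE, PAGE_INUSE };

struct toy_page {
    enum toy_state state;
    char data[4096];
};

static pthread_mutex_t toy_heap_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_page pages[16];          /* pretend free heap */

/* Allocation path with the ordering Boris proposes. */
static struct toy_page *toy_alloc(unsigned int idx)
{
    struct toy_page *pg;

    pthread_mutex_lock(&toy_heap_lock);

    pg = &pages[idx];
    if ( pg->state != PAGE_FREE )
    {
        pthread_mutex_unlock(&toy_heap_lock);
        return NULL;
    }

    /*
     * The state must change *before* the lock is dropped: the free path
     * below decides whether to merge neighbours by reading ->state under
     * the same lock, so a stale PAGE_FREE here would let it merge a page
     * we have already claimed (the free_heap_pages() crash above).
     */
    pg->state = PAGE_INUSE;

    pthread_mutex_unlock(&toy_heap_lock);

    /* Scrubbing only touches our own page, so it can run unlocked. */
    memset(pg->data, 0, sizeof(pg->data));

    return pg;
}

/* Free path: the merge decision is made under the lock. */
static void toy_free(struct toy_page *pg)
{
    pthread_mutex_lock(&toy_heap_lock);
    pg->state = PAGE_FREE;
    /* A real allocator would now merge with PAGE_FREE neighbours. */
    pthread_mutex_unlock(&toy_heap_lock);
}

int main(void)
{
    struct toy_page *pg = toy_alloc(0);

    if ( pg )
        toy_free(pg);

    printf("page 0 state: %s\n",
           pages[0].state == PAGE_FREE ? "free" : "inuse");
    return 0;
}

In the ordering Jan reverted, the equivalent of the pg->state = PAGE_INUSE
assignment sat after the unlock, which is exactly the window in which a
concurrent free of a neighbouring chunk could still observe PAGE_FREE and
merge a chunk already handed out.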