Xen project Mailing List

Re: [PATCH] x86: enable interrupts around dump_execstate()

To: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Wed, 14 Sep 2022 12:13:49 +0200

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=OHynH6+WXjduIMM1wGlno8nKFnn1GiKfGrZapeQUmzE=; b=NS9k5mzaMr1DHKpk2TSMfSTBbZE4x1IgmcuTMHMFunOhSIOZxRKGuXddXM2EaUhY4WixPRb0Pq87/75xoC+M/CpSS/zKHvzpvEc6yYu3s3k3Yb0oHofzeh1il91c5Ms2yisWOCuH5pxj7ROpF7SHw9SqShGxnlwvaeYCoYOsu0yyo9aQK+JC9cjPwXQWWzUQK8pkHVv9DZBSsGHHY+qW5l6jnZguNK4dhsYQV9pA00v25uoh07uGeBAO1UhQBkeCw95n6BNBW8UuZxSC5wzfy0Z7nMVu2DsMHDsArwoHQOWzCIDFlUVvrpdnrpkxo5JzyP7L9wDs5w6iYRIDEHCT2w==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=cLFJJxH6iLNOhBfuiBf3GgjmJvt5qKD1cATBrNJ5YPaqmhbZXEPgT2QC980d5x51xMLUeelOEm8iW44Qk+48QgfcvQxACWNlL1fNBtr73jvSCD0UQqakwhq8U7l8BdRTNocrDZLcuuj05KPxmjlw97S5+uj9a212kKoX6WtatukZPJQE3C49PqqZSXeotmDbLyHw5gbRdD8XhKXlm3vFMMH49EPaRj1JCJUz/nXnpuwWc6/ACt3/5nlHF+hfycMub5dgS9H7gNRn3cSjdVTFlauVjpF8mujx6LUDMZAwvNAhH43GPdnPZWxqNy0RuWfsPXZYcuwItm7UC5XndWyVHw==

Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>

Delivery-date: Wed, 14 Sep 2022 10:14:00 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 14.09.2022 11:13, Roger Pau Monné wrote: > On Wed, Sep 14, 2022 at 10:31:34AM +0200, Jan Beulich wrote: >> On 14.09.2022 10:14, Jan Beulich wrote: >>> On 13.09.2022 16:50, Roger Pau Monné wrote: >>>> On Mon, Dec 13, 2021 at 04:12:55PM +0100, Jan Beulich wrote: >>>>> show_hvm_stack() requires interrupts to be enabled to avoids triggering >>>>> the consistency check in check_lock() for the p2m lock. To do so in >>>>> spurious_interrupt() requires adding reentrancy protection / handling >>>>> there. >>>> >>>> There's also an ASSERT(!in_irq()) in _percpu_write_lock() that will >>>> trigger when trying to acquire the p2m lock from spurious_interrupt() >>>> context, as p2m_lock() -> mm_write_lock() -> _mm_write_lock -> >>>> percpu_write_lock(). >>> >>> s/will/may/ since spurious_interrupt() doesn't itself use irq_enter(), > > do_IRQ() does call irq_enter(), and that's the caller of > spurious_interrupt() AFAICT. Hmm, you're right. I was mislead by smp_call_function_interrupt() explicitly using irq_{enter,exit}(). I guess that should have been removed in b57458c1d02b ("x86: All vectored interrupts go through do_IRQ()"). I guess I need to either open-code the variant of in_irq() I'd need, or (perhaps better for overall state) explicitly irq_exit() before the check and irq_enter() after the call. Thoughts? >>> but yes - we could nest inside a lower priority interrupt. I'll make >>> local_irq_enable() depend on !in_irq(). >> >> Upon further thought I guess more precautions are necessary: We might >> have interrupted code holding the P2M lock already, and we might also >> have interrupted code holding another MM lock precluding acquiring of >> the P2M lock. All of this probably plays into Andrew's concerns, yet >> still I don't view it as a viable route to omit the stack dump for HVM >> domains, and in particular for PVH Dom0. Sadly I can't think of any >> better approach ... > > Yes, I also had those concerns. The mm locks are recursive, but > spurious_interrupt() hitting in the middle of code already holding any > mm lock is likely to end up triggering the mm lock order checker. Guarding against this is possible, while ... > One (likely very risky option ATM) is to introduce a per pCPU flag > that when set will turn all mm locks into noops, and use it here in > order to avoid any locking issues. This could introduce two issues at > least: first one is how resilient page walking routines are against > page tables changing under their feet. The second one is that any > page table walker p2m helper should avoid doing modifications to the > p2m, so no P2M_ALLOC or P2M_UNSHARE flags could be used. ... personally I view this as too risky. Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.