[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 1/1] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32
- To: Dongli Zhang <dongli.zhang@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, x86@xxxxxxxxxx
- From: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
- Date: Mon, 1 Nov 2021 13:34:03 -0400
- Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none
- Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=0OR+bU3BY5aWZwZxbbc9vQd3Y31c9pbGp03dLAaITo8=; b=TzAFovZ7x7JLvpYhM+YdP7eS0JYrptPCkDYyiFiMVCg572oA3oWlFxQbdkbN/F1Dd3AecwXTtKpJkqXfkA3c+LkxAPVMf3lE5oczylI8J8/IrzqGbGkbN9PmDcP+eVRHvuIYjIRZGgcSvROeUa09cNiu6oIp2e6RtmQ68GfqejdgYfXV9MpxFGQA/Rro6oXgw7RlPE0lIhqgTKmS+VW1aixweJXFg6dIeyJ15w4FMMgoEi8hOgKn4gioDelEti9TOaLVkLndFU8w1AALlKWAhoYuHCstb9ihTJ/DDViqs+cU4TZyKKGscbuOL1oghh5j4n8h9tbdHBD5YPytB94kJA==
- Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Nh2si2MOSizqwEuIk6yH0fL37ZO/Es6/eBDlG8d4qN1AjAIuSliXstiNQF1v3y9sfrZht8U0VpdW/6RL9G0niim/PzR6eXBFZSkKm3Xv2lN+74c6CFIXLVXQNZTN/lO6e71Pyv4Glet/cWFaWSwHiphCKerLMqkMWtltRXiPuE1tAwjeL8r/Eo62pI7DSIfkLhSSUiQfKPee3W0LGXAT50PWNHTu1y1hKaO5kQ7XTBjv7/PL4MfSfr/dWfCUvDNonycVYpv9ZFM07H4V4QcDKo1r0ZLRF1DCOjIfAh/g3UouKjxhv5yeKVBk+jgsDoomwNVt0rUYPONdRl9Y4bTKjw==
- Authentication-results: oracle.com; dkim=none (message not signed) header.d=none;oracle.com; dmarc=none action=none header.from=oracle.com;
- Cc: jgross@xxxxxxxx, sstabellini@xxxxxxxxxx, tglx@xxxxxxxxxxxxx, mingo@xxxxxxxxxx, bp@xxxxxxxxx, dave.hansen@xxxxxxxxxxxxxxx, hpa@xxxxxxxxx, joe.jin@xxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
- Delivery-date: Mon, 01 Nov 2021 17:34:47 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
On 10/27/21 9:25 PM, Dongli Zhang wrote:
The sched_clock() can be used very early since
commit 857baa87b642 ("sched/clock: Enable sched clock early"). In addition,
with commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock time from
0"), kdump kernel in Xen HVM guest may panic at very early stage when
accessing &__this_cpu_read(xen_vcpu)->time as in below:
setup_arch()
-> init_hypervisor_platform()
-> x86_init.hyper.init_platform = xen_hvm_guest_init()
-> xen_hvm_init_time_ops()
-> xen_clocksource_read()
-> src = &__this_cpu_read(xen_vcpu)->time;
This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info'
embedded inside 'shared_info' during early stage until xen_vcpu_setup() is
used to allocate/relocate 'vcpu_info' for boot cpu at arbitrary address.
However, when Xen HVM guest panic on vcpu >= 32, since
xen_vcpu_info_reset(0) would set per_cpu(xen_vcpu, cpu) = NULL when
vcpu >= 32, xen_clocksource_read() on vcpu >= 32 would panic.
This patch delays xen_hvm_init_time_ops() to later in
xen_hvm_smp_prepare_boot_cpu() after the 'vcpu_info' for boot vcpu is
registered when the boot vcpu is >= 32.
Another option is to always delay xen_hvm_init_time_ops() for any vcpus
(including vcpu=0). Since to delay xen_hvm_init_time_ops() may lead to
clock backward issue,
This is referring to
https://lists.xenproject.org/archives/html/xen-devel/2021-10/msg01516.html I
assume?
it is preferred to avoid that for regular boot (The
pv_sched_clock=native_sched_clock() is used at the very beginning until
xen_sched_clock() is registered). That requires to adjust
xen_sched_clock_offset. That's why we only delay xen_hvm_init_time_ops()
for vcpu>=32.
We delay only on VCPU>=32 because we want to avoid the clock going backwards
due to hypervisor problem pointed to be the link above, not because we need to
adjust xen_sched_clock_offset (which we could if we wanted).
This issue can be reproduced on purpose via below command at the guest
side when kdump/kexec is enabled:
"taskset -c 33 echo c > /proc/sysrq-trigger"
Reference:
https://lists.xenproject.org/archives/html/xen-devel/2021-10/msg00571.html
Cc: Joe Jin <joe.jin@xxxxxxxxxx>
Signed-off-by: Dongli Zhang <dongli.zhang@xxxxxxxxxx>
---
Changed since v1:
- Add commit message to explain why xen_hvm_init_time_ops() is delayed
for any vcpus. (Suggested by Boris Ostrovsky)
- Add a comment in xen_hvm_smp_prepare_boot_cpu() referencing the related
code in xen_hvm_guest_init(). (suggested by Juergen Gross)
arch/x86/xen/enlighten_hvm.c | 20 +++++++++++++++++++-
arch/x86/xen/smp_hvm.c | 8 ++++++++
2 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index e68ea5f4ad1c..7734dec52794 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -216,7 +216,25 @@ static void __init xen_hvm_guest_init(void)
WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_hvm, xen_cpu_dead_hvm));
xen_unplug_emulated_devices();
x86_init.irqs.intr_init = xen_init_IRQ;
- xen_hvm_init_time_ops();
+
+ /*
+ * Only MAX_VIRT_CPUS 'vcpu_info' are embedded inside 'shared_info'
+ * and the VM would use them until xen_vcpu_setup() is used to
+ * allocate/relocate them at arbitrary address.
+ *
+ * However, when Xen HVM guest panic on vcpu >= MAX_VIRT_CPUS,
+ * per_cpu(xen_vcpu, cpu) is still NULL at this stage. To access
+ * per_cpu(xen_vcpu, cpu) via xen_clocksource_read() would panic.
+ *
+ * Therefore we delay xen_hvm_init_time_ops() to
+ * xen_hvm_smp_prepare_boot_cpu() when boot vcpu is >= MAX_VIRT_CPUS.
+ */
+ if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
+ pr_info("Delay xen_hvm_init_time_ops() as kernel is running on
vcpu=%d\n",
+ xen_vcpu_nr(0));
+ else
+ xen_hvm_init_time_ops();
+
xen_hvm_init_mmu_ops();
#ifdef CONFIG_KEXEC_CORE
diff --git a/arch/x86/xen/smp_hvm.c b/arch/x86/xen/smp_hvm.c
index 6ff3c887e0b9..f99043df8bb5 100644
--- a/arch/x86/xen/smp_hvm.c
+++ b/arch/x86/xen/smp_hvm.c
@@ -19,6 +19,14 @@ static void __init xen_hvm_smp_prepare_boot_cpu(void)
*/
xen_vcpu_setup(0);
+ /*
+ * The xen_hvm_init_time_ops() is delayed from
+ * xen_hvm_guest_init() to here to avoid panic when the kernel
+ * boots from vcpu>=MAX_VIRT_CPUS (32).
+ */
How about
/* Deferred call to xen_hvm_init_time_ops(). See comment in
xen_hvm_guest_init() */
-boris
+ if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
+ xen_hvm_init_time_ops();
+
/*
* The alternative logic (which patches the unlock/lock) runs before
* the smp bootup up code is activated. Hence we need to set this up
|