[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v4 01/17] perf: Protect perf_guest_cbs with RCU
- To: Sean Christopherson <seanjc@xxxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>, Will Deacon <will@xxxxxxxxxx>, Mark Rutland <mark.rutland@xxxxxxx>, Russell King <linux@xxxxxxxxxxxxxxx>, Marc Zyngier <maz@xxxxxxxxxx>, Catalin Marinas <catalin.marinas@xxxxxxx>, Guo Ren <guoren@xxxxxxxxxx>, Nick Hu <nickhu@xxxxxxxxxxxxx>, Greentime Hu <green.hu@xxxxxxxxx>, Vincent Chen <deanbo422@xxxxxxxxx>, Paul Walmsley <paul.walmsley@xxxxxxxxxx>, Palmer Dabbelt <palmer@xxxxxxxxxxx>, Albert Ou <aou@xxxxxxxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Borislav Petkov <bp@xxxxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>, x86@xxxxxxxxxx, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>
- From: Paolo Bonzini <pbonzini@xxxxxxxxxx>
- Date: Thu, 11 Nov 2021 08:26:58 +0100
- Authentication-results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=pbonzini@xxxxxxxxxx
- Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>, Jiri Olsa <jolsa@xxxxxxxxxx>, Namhyung Kim <namhyung@xxxxxxxxxx>, James Morse <james.morse@xxxxxxx>, Alexandru Elisei <alexandru.elisei@xxxxxxx>, Suzuki K Poulose <suzuki.poulose@xxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>, Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>, Wanpeng Li <wanpengli@xxxxxxxxxxx>, Jim Mattson <jmattson@xxxxxxxxxx>, Joerg Roedel <joro@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, linux-perf-users@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, linux-arm-kernel@xxxxxxxxxxxxxxxxxxx, kvmarm@xxxxxxxxxxxxxxxxxxxxx, linux-csky@xxxxxxxxxxxxxxx, linux-riscv@xxxxxxxxxxxxxxxxxxx, kvm@xxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, Artem Kashkanov <artem.kashkanov@xxxxxxxxx>, Like Xu <like.xu.linux@xxxxxxxxx>, Like Xu <like.xu@xxxxxxxxxxxxxxx>, Zhu Lingshan <lingshan.zhu@xxxxxxxxx>
- Delivery-date: Thu, 11 Nov 2021 07:27:35 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
On 11/11/21 03:07, Sean Christopherson wrote:
Protect perf_guest_cbs with RCU to fix multiple possible errors. Luckily,
all paths that read perf_guest_cbs already require RCU protection, e.g. to
protect the callback chains, so only the direct perf_guest_cbs touchpoints
need to be modified.
Bug #1 is a simple lack of WRITE_ONCE/READ_ONCE behavior to ensure
perf_guest_cbs isn't reloaded between a !NULL check and a dereference.
Fixed via the READ_ONCE() in rcu_dereference().
Bug #2 is that on weakly-ordered architectures, updates to the callbacks
themselves are not guaranteed to be visible before the pointer is made
visible to readers. Fixed by the smp_store_release() in
rcu_assign_pointer() when the new pointer is non-NULL.
Bug #3 is that, because the callbacks are global, it's possible for
readers to run in parallel with an unregisters, and thus a module
implementing the callbacks can be unloaded while readers are in flight,
resulting in a use-after-free. Fixed by a synchronize_rcu() call when
unregistering callbacks.
Bug #1 escaped notice because it's extremely unlikely a compiler will
reload perf_guest_cbs in this sequence. perf_guest_cbs does get reloaded
for future derefs, e.g. for ->is_user_mode(), but the ->is_in_guest()
guard all but guarantees the consumer will win the race, e.g. to nullify
perf_guest_cbs, KVM has to completely exit the guest and teardown down
all VMs before KVM start its module unload / unregister sequence. This
also makes it all but impossible to encounter bug #3.
Bug #2 has not been a problem because all architectures that register
callbacks are strongly ordered and/or have a static set of callbacks.
But with help, unloading kvm_intel can trigger bug #1 e.g. wrapping
perf_guest_cbs with READ_ONCE in perf_misc_flags() while spamming
kvm_intel module load/unload leads to:
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 6 PID: 1825 Comm: stress Not tainted 5.14.0-rc2+ #459
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:perf_misc_flags+0x1c/0x70
Call Trace:
perf_prepare_sample+0x53/0x6b0
perf_event_output_forward+0x67/0x160
__perf_event_overflow+0x52/0xf0
handle_pmi_common+0x207/0x300
intel_pmu_handle_irq+0xcf/0x410
perf_event_nmi_handler+0x28/0x50
nmi_handle+0xc7/0x260
default_do_nmi+0x6b/0x170
exc_nmi+0x103/0x130
asm_exc_nmi+0x76/0xbf
Fixes: 39447b386c84 ("perf: Enhance perf to allow for guest statistic collection
from host")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
Reviewed-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
One nit:
EXPORT_SYMBOL_GPL(perf_register_guest_info_callbacks);
int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
{
- perf_guest_cbs = NULL;
+ if (WARN_ON_ONCE(rcu_access_pointer(perf_guest_cbs) != cbs))
+ return -EINVAL;
+
+ rcu_assign_pointer(perf_guest_cbs, NULL);
+ synchronize_rcu();
This technically could be RCU_INIT_POINTER but it's not worth a respin.
There are dozens of other occurrences, and if somebody wanted they
could use Coccinelle to fix all of them.
Paolo
|