Re: NULL scheduler DoS
On 09/08/2021 17:19, Ahmed, Daniele wrote:
> Hi all,

Hi Daniele,

Thank you for the report!

> The NULL scheduler is affected by an issue that triggers an assertion and reboots the hypervisor.

Just to make it clear for the others in the thread: per SUPPORT.md, the NULL scheduler is not security supported, which is why this was sent to xen-devel directly.

Also, for completeness, debug builds are not security supported either. On a production build, the ASSERT() would be turned into a NOP, which could result in potentially more interesting issues. Anyway, that's not a problem here. :)

I am not quite sure where the problem lies yet, but I am adding some more information from the debugging we discussed together.

The ASSERT() is triggered because the pCPU was already assigned to one of the dom0 vCPUs. This problem happens regardless of whether there is a free pCPU.

I have added some debugging in sched_set_res():

diff --git a/xen/common/sched/private.h b/xen/common/sched/private.h
index a870320146ef..2355f531dc13 100644
--- a/xen/common/sched/private.h
+++ b/xen/common/sched/private.h
@@ -150,6 +150,10 @@ static inline void sched_set_res(struct sched_unit *unit,
     unsigned int cpu = cpumask_first(res->cpus);
     struct vcpu *v;

+    printk("%s: res->master_cpu %u unit %p %pd %pv\n", __func__,
+           res->master_cpu, unit, unit->domain, unit->vcpu_list);
+    WARN();
+
     for_each_sched_unit_vcpu ( unit, v )
     {
         ASSERT(cpu < nr_cpu_ids);

This traced the problem to null_unit_migrate():
(XEN) sched_set_res: res->master_cpu 0 unit ffff830200887f00 d1 d1v0
(XEN) Xen WARN at private.h:155
(XEN) ----[ Xen-4.16-unstable x86_64 debug=y Tainted: C ]----
(XEN) CPU: 1
(XEN) RIP: e008:[<ffff82d04023fd9f>] core.c#sched_set_res+0x5b/0xc6
(XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor (d0v1)
(XEN) rax: ffff83027bf55038 rbx: 0000000000000000 rcx: 0000000000000000
(XEN) rdx: ffff83027bf4ffff rsi: 000000000000000a rdi: ffff82d0404944b8
(XEN) rbp: ffff83027bf4fc70 rsp: ffff83027bf4fc40 r8: 0000000000000004
(XEN) r9: 0000000000000030 r10: ffff83027bf4fcf8 r11: 00000000fffffffd
(XEN) r12: ffff830275e83000 r13: ffff830275e8d000 r14: ffff830200887f00
(XEN) r15: ffff83027bf850a0 cr0: 0000000080050033 cr4: 00000000003526e0
(XEN) cr3: 00000001f1e3d000 cr2: 0000563f71516088
(XEN) fsb: 00007f6561cda780 gsb: ffff88817fe80000 gss: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen code around <ffff82d04023fd9f> (core.c#sched_set_res+0x5b/0xc6):
(XEN) 14 18 00 e8 e7 7f 00 00 <0f> 0b 4d 8b 66 08 4d 85 e4 75 28 4d 89 7e 20 48
(XEN) Xen stack trace from rsp=ffff83027bf4fc40:
(XEN) ffff83027bf85118 ffff830200887f00 ffff830275e83000 ffff830275e8d000
(XEN) 0000000000000000 ffff83027bf552a0 ffff83027bf4fce0 ffff82d040241614
(XEN) ffff82d040226393 0000000000000286 ffff83027bf822e8 0000000175e8d000
(XEN) ffff830200887f00 ffff83027bf552a0 ffff830275e83000 ffff83027bf4fcf8
(XEN) ffff830275e8d000 ffff830275e8d000 0000000000000001 0000000000000000
(XEN) ffff83027bf4fd40 ffff82d04020527d ffff82d04020527d 0000000000000000
(XEN) 0000000000000000 ffff83027bf4fd30 0000000000000000 0000000000000000
(XEN) ffff830275e8d000 00007f65620f6010 0000000000000001 ffff82d040238319
(XEN) ffff83027bf4fe58 ffff82d040238dd9 00000000001f1eae 0000000000000004
(XEN) ffff83027bee4001 8000000000000000 ffff83027bf4fdc0 ffff82d04032e6df
(XEN) 000000044032e6df 0000000000000000 ffff82e003e3e120 000000140000000f
(XEN) 00007f6561d90001 0000559a00000001 0000000000000014 0000559ad9c303e0
(XEN) 0000000000000008 0000559ad9c303e0 0000559ad9c31170 0000559ad9c303c0
(XEN) 0000000000000000 00007ffd4ed54b60 0000559ad9c309a0 00007ffd4ed54c50
(XEN) 0000000000000000 0000559ad9c38240 0000559ad9c32570 00007ffd4ed54f00
(XEN) 0000559ad9c31170 ffff83027bf4fef8 0000000000000000 0000000000000001
(XEN) deadbeefdeadf00d ffff83027bec0000 ffff82d040238319 ffff83027bf4fee8
(XEN) ffff82d04030d8bc 00007f65620f6010 deadbeefdeadf00d deadbeefdeadf00d
(XEN) deadbeefdeadf00d deadbeefdeadf00d ffff82d04038821c ffff82d040388228
(XEN) ffff82d04038821c ffff82d040388228 ffff82d04038821c ffff82d040388228
(XEN) Xen call trace:
(XEN) [<ffff82d04023fd9f>] R core.c#sched_set_res+0x5b/0xc6
(XEN) [<ffff82d040241614>] F sched_init_vcpu+0x3dc/0x5d7
(XEN) [<ffff82d04020527d>] F vcpu_create+0xfb/0x37a
(XEN) [<ffff82d040238dd9>] F do_domctl+0xac0/0x184a
(XEN) [<ffff82d04030d8bc>] F pv_hypercall+0x10d/0x2b8
(XEN) [<ffff82d04038829d>] F lstar_enter+0x12d/0x140
(XEN)
The end of the function contains an interesting comment:
    /*
     * Whatever all the above, we always at least override v->processor.
     * This is especially important for shutdown or suspend/resume paths,
     * when it is important to let our caller (cpu_disable_scheduler())
     * know that the migration did happen, to the best of our possibilities,
     * at least. In case of suspend, any temporary inconsistency caused
     * by this, will be fixed-up during resume.
     */
This implies that a pCPU may temporarily be assigned to two vCPUs and that
we expect this to be fixed up afterwards. However, a domain may be destroyed
before that happens.
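
To make the scenario above concrete, here is a minimal toy model in C. It is not Xen code: every name in it (toy_unit, toy_pcpu, toy_assign, toy_migrate, toy_is_consistent) is hypothetical, and it only assumes what the comment describes, namely that a migration always overrides the unit's processor field even when the target pCPU is already taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 2   /* hypothetical: size of the toy system */

/* Hypothetical stand-in for a scheduling unit (one vCPU). */
struct toy_unit {
    int id;
    unsigned int processor;  /* pCPU the unit believes it belongs to */
    bool in_waitq;           /* parked while waiting for a free pCPU */
};

/* Hypothetical stand-in for the per-pCPU scheduler data. */
struct toy_pcpu {
    struct toy_unit *unit;   /* unit the scheduler assigned to this pCPU */
};

struct toy_pcpu toy_pcpus[TOY_NR_CPUS];

/* Assign a unit to a pCPU; both views agree afterwards. */
void toy_assign(struct toy_unit *u, unsigned int cpu)
{
    assert(cpu < TOY_NR_CPUS);
    toy_pcpus[cpu].unit = u;
    u->processor = cpu;
    u->in_waitq = false;
}

/*
 * Migrate a unit. If the target pCPU is busy, the unit is parked in the
 * wait queue, but its processor field is overridden anyway. This is the
 * assumption taken from the comment quoted above.
 */
void toy_migrate(struct toy_unit *u, unsigned int new_cpu)
{
    assert(new_cpu < TOY_NR_CPUS);

    /* Release the old pCPU if the unit really owned it. */
    if ( toy_pcpus[u->processor].unit == u )
        toy_pcpus[u->processor].unit = NULL;

    if ( toy_pcpus[new_cpu].unit == NULL )
        toy_assign(u, new_cpu);
    else
    {
        u->in_waitq = true;
        u->processor = new_cpu;  /* unit's view now disagrees with the pCPU's */
    }
}

/* True when the unit's view and the pCPU's view agree. */
bool toy_is_consistent(const struct toy_unit *u)
{
    return toy_pcpus[u->processor].unit == u;
}
```

In this model, assigning unit A to pCPU 0 and unit B to pCPU 1, then migrating B to pCPU 0, leaves B with processor == 0 while pCPU 0 still points at A. Any teardown path that assumes toy_is_consistent() holds for B would trip over exactly this state.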

So it looks like unit_deassign() is not able to cope with this case.

From a brief look, I think we may want to check if the pCPU is in the wait list. If it is, then we should bail out.

Dario, Stefano, what do you think?

Cheers,

--
Julien Grall