Re: [Xen-devel] [PATCH] xen: credit1: fix tickling when it happens from a remote pCPU
On 09/24/2015 06:31 AM, Dario Faggioli wrote:
> especially if that is also from a different cpupool than the processor
> of the vCPU that triggered the tickling.
>
> In fact, it is possible that we get as far as calling vcpu_unblock()-->
> vcpu_wake()-->csched_vcpu_wake()-->__runq_tickle() for the vCPU 'vc',
> but all while running on a pCPU that is different from 'vc->processor'.
>
> For instance, this can happen when an HVM domain runs in a cpupool,
> with a different scheduler than the default one, and issues IOREQs
> to Dom0, running in Pool-0 with the default scheduler.
> In fact, right in this case, the following crash can be observed:
>
> (XEN) ----[ Xen-4.7-unstable x86_64 debug=y Tainted: C ]----
> (XEN) CPU: 7
> (XEN) RIP: e008:[<ffff82d0801230de>] __runq_tickle+0x18f/0x430
> (XEN) RFLAGS: 0000000000010086 CONTEXT: hypervisor (d1v0)
> (XEN) rax: 0000000000000001 rbx: ffff8303184fee00 rcx: 0000000000000000
> (XEN) ... ... ...
> (XEN) Xen stack trace from rsp=ffff83031fa57a08:
> (XEN) ffff82d0801fe664 ffff82d08033c820 0000000100000002 0000000a00000001
> (XEN) 0000000000006831 0000000000000000 0000000000000000 0000000000000000
> (XEN) ... ... ...
> (XEN) Xen call trace:
> (XEN) [<ffff82d0801230de>] __runq_tickle+0x18f/0x430
> (XEN) [<ffff82d08012348a>] csched_vcpu_wake+0x10b/0x110
> (XEN) [<ffff82d08012b421>] vcpu_wake+0x20a/0x3ce
> (XEN) [<ffff82d08012b91c>] vcpu_unblock+0x4b/0x4e
> (XEN) [<ffff82d080167bd0>] vcpu_kick+0x17/0x61
> (XEN) [<ffff82d080167c46>] vcpu_mark_events_pending+0x2c/0x2f
> (XEN) [<ffff82d08010ac35>] evtchn_fifo_set_pending+0x381/0x3f6
> (XEN) [<ffff82d08010a0f6>] notify_via_xen_event_channel+0xc9/0xd6
> (XEN) [<ffff82d0801c29ed>] hvm_send_ioreq+0x3e9/0x441
> (XEN) [<ffff82d0801bba7d>] hvmemul_do_io+0x23f/0x2d2
> (XEN) [<ffff82d0801bbb43>] hvmemul_do_io_buffer+0x33/0x64
> (XEN) [<ffff82d0801bc92b>] hvmemul_do_pio_buffer+0x35/0x37
> (XEN) [<ffff82d0801cc49f>] handle_pio+0x58/0x14c
> (XEN) [<ffff82d0801eabcb>] vmx_vmexit_handler+0x16b3/0x1bea
> (XEN) [<ffff82d0801efd21>] vmx_asm_vmexit_handler+0x41/0xc0
>
> In this case, pCPU 7 is not in Pool-0, while the (Dom0's) vCPU being
> woken is. pCPU 7's pool has a different scheduler than credit, but it
> is, however, right from pCPU 7 that we are waking the Dom0's vCPUs.
> Therefore, the current code tries to access csched_balance_mask for
> pCPU 7, but that is not defined, and hence the Oops.
>
> (Note that, in case the two pools run the same scheduler, we see no
> Oops, but things are still conceptually wrong.)
>
> Cure things by providing a second macro that allows fetching the
> scratch mask of a specific pCPU (instead of always using
> smp_processor_id()), and use that one in __runq_tickle(), with such
> pCPU equal to the processor of the vCPU being woken.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>

Reviewed-by: Juergen Gross <jgross@xxxxxxxx>

regardless of whether you address my suggestion below or not.

Wouldn't it make sense to get rid of that macro? After your patch it is
used only in csched_runq_steal(), which is always called with cpu equal
to smp_processor_id(). You'd eliminate a possible source of future
errors and avoid multiple evaluations of smp_processor_id().

Juergen
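
For readers without the diff at hand, the failure mode and the cure described in the patch can be sketched in standalone C. This is only a simplified model, not the Xen code: struct pcpu_data, pcpu_priv, current_cpu, setup_pools(), scratch_mask_of() and the two tickle_* helpers are all hypothetical names, the scratch mask is reduced to a single word, and a NULL pointer stands in for "this pCPU's pool does not run the credit scheduler".

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 8

/* Hypothetical stand-in for the scheduler's per-pCPU private data. */
struct pcpu_data {
    unsigned long balance_mask;   /* scratch mask, simplified to one word */
};

/* Assumption: pCPUs 0-3 belong to the pool running this scheduler. */
static struct pcpu_data pool0_data[4];

/* NULL models a pCPU whose pool runs a different scheduler. */
static struct pcpu_data *pcpu_priv[NR_CPUS];

/* Hypothetical stand-in for smp_processor_id(). */
static unsigned int current_cpu;

static void setup_pools(void)
{
    for (unsigned int c = 0; c < NR_CPUS; c++)
        pcpu_priv[c] = (c < 4) ? &pool0_data[c] : NULL;
    for (unsigned int c = 0; c < 4; c++)
        pool0_data[c].balance_mask = 0;
}

/* Fetch the scratch mask of an explicitly named pCPU, or NULL if none. */
static unsigned long *scratch_mask_of(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return pcpu_priv[cpu] ? &pcpu_priv[cpu]->balance_mask : NULL;
}

/*
 * Old behaviour: always use the mask of the pCPU we happen to run on.
 * Returns -1 where the real code would dereference missing data.
 */
static int tickle_with_local_mask(unsigned int vcpu_processor)
{
    unsigned long *mask = scratch_mask_of(current_cpu);

    if (mask == NULL)
        return -1;
    *mask = 1UL << vcpu_processor;
    return 0;
}

/* Fixed behaviour: use the mask of the pCPU the woken vCPU belongs to. */
static int tickle_with_target_mask(unsigned int vcpu_processor)
{
    unsigned long *mask = scratch_mask_of(vcpu_processor);

    if (mask == NULL)
        return -1;
    *mask = 1UL << vcpu_processor;
    return 0;
}
```

With current_cpu set to 7 (outside the modelled pool) and a vCPU whose processor is 2, tickle_with_local_mask(2) hits the missing per-pCPU data, while tickle_with_target_mask(2) succeeds. Since the accessor takes the CPU explicitly, a caller that is known to run on its own CPU can pass that number as well, which is in the spirit of the suggestion to drop the smp_processor_id()-based macro.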