
Re: [Xen-devel] Help with: Xen BUG at timer.c:189



On Wednesday, 7 June 2017, 04:28:33, Jan Beulich wrote:
> >>> On 07.06.17 at 11:28, <dietmar.hahn@xxxxxxxxxxxxxx> wrote:
> > On Friday, 19 May 2017, 06:41:36, Jan Beulich wrote:
> >> >>> On 19.05.17 at 11:52, <dietmar.hahn@xxxxxxxxxxxxxx> wrote:
> >> > I'm struggling with a hypervisor panic. The hypervisor version is 4.4.3,
> >> > yes I know - very old ;-), but the affected code hasn't changed much.
> >> 
> >> Well, at the very least I'd expect you to base your code on
> >> 4.4.4 plus the roughly 199 backports we've added there over time
> >> (assuming yours is based on ours), some of which are scheduler
> >> ones.
> > 
> > Don't you have older versions in the field?
> 
> We certainly do, but when people run into problems, the first thing
> support is supposed to do is tell them to update.
> 
> >> > The panic was never seen before and occurred while booting the system,
> >> > only dom0 was running.
> >> > 
> >> > (XEN) Xen BUG at timer.c:189
> >> > (XEN) ----[ Xen-4.4.3_02-26.2_x3  x86_64  debug=n  Tainted:    C ]----
> >> > (XEN) CPU:    14
> >> > (XEN) RIP:    e008:[<ffff82d08012c3e6>] remove_entry+0x26/0x80
> >> > (XEN) RFLAGS: 0000000000010086   CONTEXT: hypervisor
> >> > (XEN) rax: 00000000000000bc   rbx: ffff830839bcd5c8   rcx: ffff82d080343a80
> >> > (XEN) rdx: 000000000000d5f0   rsi: 0000002887fa0000   rdi: ffff830839bcd5c8
> >> > (XEN) rbp: 000000000000d5f0   rsp: ffff830839bbfe20   r8:  0000002887211398
> >> > (XEN) r9:  000000288721132f   r10: 00000000bf8b2994   r11: ffff82d08011cfa0
> >> > (XEN) r12: ffff830839bcd5c8   r13: ffff82d080343a80   r14: 000000000000d5f0
> >> > (XEN) r15: ffff82d080343a80   cr0: 000000008005003b   cr4: 00000000001526f0
> >> > (XEN) cr3: 0000000075679000   cr2: 000000000069b8ec
> >> > (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: e010   cs: e008
> >> > (XEN) Xen stack trace from rsp=ffff830839bbfe20:
> >> > (XEN)    ffff830839bcb020 ffff82d08012d3a8 0000002887fa0000 0000000000000286
> >> > (XEN)    ffff830839bcd5f0 ffff830839bcd9c0 0000002887c1915d 000000000000000e
> >> > (XEN)    ffff82d080343a70 0000000000000002 ffff830839bcda30 ffff82d0801affd0
> >> > (XEN)    000000010000000e 00000028872113f4 ffffffffffffffff ffff82d08030fc00
> >> > (XEN)    ffff82d0802ffc00 ffff82d080189a1c 0000000000000000 0000000000000000
> >> > (XEN)    ffffffffffffffff 0000107800076e92 ffff82d0801292aa ffff830839bb8000
> >> > (XEN)    ffff8300754fd000 ffff82d08031abe0 0000000000000000 ffff830839bcb048
> >> > (XEN)    0000000000000000 ffff82d080161ad6 0000000000000001 0000000000000000
> >> > (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000009
> >> > (XEN)    ffff8803dda20010 0000000000000246 0000000000000000 00000000fffffffe
> >> > (XEN)    0000000000000000 0000000000000000 ffffffff800033aa 0000000000000000
> >> > (XEN)    0000000000000000 0000000000000001 0000010000000000 ffffffff800033aa
> >> > (XEN)    000000000000e033 0000000000000246 ffff8803dda21f08 000000000000e02b
> >> > (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> >> > (XEN)    000000000000000e ffff8300754fd000 00000037b9887700 0000000000000000
> >> > (XEN) Xen call trace:
> >> > (XEN)    [<ffff82d08012c3e6>] remove_entry+0x26/0x80
> >> > (XEN)    [<ffff82d08012d3a8>] set_timer+0x178/0x230
> >> > (XEN)    [<ffff82d0801affd0>] mwait_idle+0x270/0x370
> >> > (XEN)    [<ffff82d080189a1c>] local_time_calibration+0x19c/0x1a0
> >> > (XEN)    [<ffff82d0801292aa>] __do_softirq+0x6a/0xa0
> >> > (XEN)    [<ffff82d080161ad6>] idle_loop+0x16/0x50
> >> > (XEN) 
> >> > (XEN) 
> >> > (XEN) ****************************************
> >> > (XEN) Panic on CPU 14:
> >> > (XEN) Xen BUG at timer.c:189
> >> > (XEN) ****************************************
> >> > (XEN) 
> >> > (XEN) Reboot in five seconds...
> >> > (XEN) Executing kexec image on cpu14
> >> > (XEN) Shot down all CPUs
> >> > 
> >> > This is the BUG() call in remove_entry() for the case that timer->status is
> >> > neither TIMER_STATUS_in_heap nor TIMER_STATUS_in_list.
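> >> > 
> >> > For reference, remove_entry() in xen/common/timer.c looks roughly like this
> >> > (trimmed; exact line numbers may differ a bit in our tree):
> >> > 
> >> >   static int remove_entry(struct timer *t)
> >> >   {
> >> >       struct timers *timers = &per_cpu(timers, t->cpu);
> >> >       int rc;
> >> > 
> >> >       switch ( t->status )
> >> >       {
> >> >       case TIMER_STATUS_in_heap:
> >> >           rc = remove_from_heap(timers->heap, t);
> >> >           break;
> >> >       case TIMER_STATUS_in_list:
> >> >           remove_from_list(&timers->list, t);
> >> >           rc = 0;
> >> >           break;
> >> >       default:
> >> >           rc = 0;
> >> >           BUG();    /* timer.c:189 - status is neither in_heap nor in_list */
> >> >       }
> >> > 
> >> >       t->status = TIMER_STATUS_invalid;
> >> >       return rc;
> >> >   }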
> >> > 
> >> > With the address of the timer from the vmcore:
> >> > crash> struct timer ffff830839bcd5c8
> >> > struct timer {
> >> >   expires = 0,
> >> >   {
> >> >     heap_offset = 968676784, 
> >> >     list_next = 0xffff830839bcd5b0, 
> >> >     inactive = {
> >> >       next = 0xffff830839bcd5b0, 
> >> >       prev = 0x200200200000050
> >> >     }
> >> >   }, 
> >> >   function = 0xffff830839bcd5e0, 
> >> >   data = 0xffff830839bcd5e0, 
> >> >   cpu = 54768, 
> >> >   status = 188 '\274'
> >> > }
> >> > 
> >> > This looks like the structure was overwritten.
> >> > 
> >> > The call path seems to be:
> >> >  mwait_idle()
> >> >    sched_tick_resume()
> >> >      csched_tick_resume()
> >> >        set_timer()
> >> >          remove_entry()
> >> > 
> >> > In sched_tick_resume():
> >> >  unsigned int cpu = smp_processor_id();
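> >> > 
> >> > Inside set_timer() the path down to remove_entry() is roughly (trimmed;
> >> > deactivate_timer() is evidently inlined, since the backtrace shows
> >> > remove_entry() called directly from set_timer()):
> >> > 
> >> >   void set_timer(struct timer *timer, s_time_t expires)
> >> >   {
> >> >       unsigned long flags;
> >> > 
> >> >       if ( !timer_lock_irqsave(timer, flags) )
> >> >           return;
> >> > 
> >> >       if ( active_timer(timer) )
> >> >           deactivate_timer(timer);  /* -> remove_entry(), where the BUG() hits */
> >> > 
> >> >       timer->expires = expires;
> >> >       activate_timer(timer);
> >> > 
> >> >       timer_unlock_irqrestore(timer, flags);
> >> >   }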
> >> > 
> >> > I extracted struct cpu_info from the stack and found:
> >> > crash> cpu_info ffff830839bbff18
> >> > struct cpu_info {
> >> >   ...
> >> >   processor_id = 14, 
> >> >   current_vcpu = 0xffff8300754fd000, 
> >> >   per_cpu_offset = 0x37b9887700,
> >> >   ...
> >> > 
> >> > And from the code in sched_credit.c:  spc = CSCHED_PCPU(cpu);
> >> > I calculated spc:
> >> > 
> >> > crash> csched_pcpu 0xffff830839be2b40
> >> > struct csched_pcpu {
> >> >   runq = {
> >> >     next = 0xffff830839be2b40, 
> >> >     prev = 0xffff830839be2b40
> >> >   }, 
> >> >   runq_sort_last = 0, 
> >> >   ticker = {
> >> >     expires = 174080000000, 
> >> >     {
> >> >       heap_offset = 1, 
> >> >       list_next = 0x100100100000001, 
> >> >       inactive = {
> >> >         next = 0x100100100000001, 
> >> >         prev = 0x200200200200200
> >> >       }
> >> >     }, 
> >> >     function = 0xffff82d08011f200 <csched_tick>, 
> >> >     data = 0xe, 
> >> >     cpu = 14, 
> >> >     status = 3 '\003'
> >> >   }, 
> >> >   tick = 0, 
> >> >   idle_bias = 47, 
> >> >   balance_mask = 0xffff830839be2bb0
> >> > }
> >> > 
> >> > And this ticker looks ok!
> >> > So the ticker address extracted from the vmcore in remove_entry()
> >> > is not the same as the expected one read from the code.
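> >> > 
> >> > For reference, CSCHED_PCPU() in sched_credit.c is essentially (per_cpu()
> >> > resolves through __per_cpu_offset[cpu]):
> >> > 
> >> >   #define CSCHED_PCPU(_c) \
> >> >       ((struct csched_pcpu *)per_cpu(schedule_data, _c).sched_priv)
> >> > 
> >> > so only a wrong cpu value or a corrupted sched_priv pointer could yield a
> >> > wrong &spc->ticker here.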
> >> 
> >> Right, and looking through raw stack contents I can't seem to
> >> find any slot holding 0xffff830839be????. If this was a hypervisor
> >> binary we delivered, I would even have gone and checked which
> >> slots correspond to saved copies of pointers derived from what
> >> was supposedly passed into set_timer() (rooted at
> >> CSCHED_PCPU(14)), but this looks to be a crash with a binary
> >> you've produced yourself. Quite likely corruption starts elsewhere.
> > 
> > I disassembled the affected code and found that the stack is not used that 
> > much.
> 
> Right, hence my reference to saved pointers (i.e. callee saved
> registers put on the stack in function prologues).
> 
> > The only usable value is ffff830839bcd5f0.
> 
> There are a few more ffff830839bc???? ones, and it is at least
> interesting that CPU14's stack is at ffff830839bb???? and the
> uncorrupted struct csched_pcpu is at ffff830839be????.

What happens on the stack:

mwait_idle()
  sub    $0x88,%rsp
  ...
  callq  0xffff82d0801253f0 <sched_tick_resume>
    (in sched_tick_resume)
                            -> ffff82d0801affd0 -> symbol mwait_idle+624
    ...
    jmp *%r11       -> csched_tick_resume
      (in csched_tick_resume)
      sub    $0x18,%rsp
      ...
      add    $0x18,%rsp
      jmpq   0xffff82d08012d230 <set_timer>
        (in set_timer)
        sp = sp - 0x48
        mov %r12,0x28(%rsp) -> ffff830839bcda30 -> from mwait_idle()
        mov %rbx,0x18(%rsp) -> 0000000000000002
        mov %rbp,0x20(%rsp) -> ffff82d080343a70 -> per_cpu__tasklet_work_to_do
        mov %r13,0x30(%rsp) -> 000000000000000e
        mov %r14,0x38(%rsp) -> 0000002887c1915d
        mov %r15,0x40(%rsp) -> ffff830839bcd9c0 -> from mwait_idle() -> struct acpi_processor_cx*

        mov %rax,0x10(%rsp) -> ffff830839bcd5f0 -> address &(timer->cpu)
        pushfq 
        popq   0x8(%rsp)    -> 0000000000000286
        mov %rsi,(%rsp)     -> 0000002887fa0000 -> argument "expires"

        call remove_entry()
          (in remove_entry)
                            -> ffff82d08012d3a8 -> symbol set_timer+376
          push   %rbx       -> ffff830839bcb020


The spc from CSCHED_PCPU(cpu) is fully handled in registers.
Below is the complete stack. cpu_info starts at ffff830839bbff18.

ffff830839bbfe20:  ffff830839bcb020 set_timer+376    
ffff830839bbfe30:  0000002887fa0000 0000000000000286 
ffff830839bbfe40:  ffff830839bcd5f0 ffff830839bcd9c0 
ffff830839bbfe50:  0000002887c1915d 000000000000000e 
ffff830839bbfe60:  per_cpu__tasklet_work_to_do 0000000000000002 
ffff830839bbfe70:  ffff830839bcda30 mwait_idle+624   
ffff830839bbfe80:  000000010000000e 00000028872113f4 
ffff830839bbfe90:  ffffffffffffffff softirq_handlers 
ffff830839bbfea0:  irq_stat         local_time_calibration+412 
ffff830839bbfeb0:  0000000000000000 0000000000000000 
ffff830839bbfec0:  ffffffffffffffff 0000107800076e92 
ffff830839bbfed0:  __do_softirq+106 ffff830839bb8000 
ffff830839bbfee0:  ffff8300754fd000 __per_cpu_offset 
ffff830839bbfef0:  0000000000000000 ffff830839bcb048 
ffff830839bbff00:  0000000000000000 idle_loop+22     
ffff830839bbff10:  0000000000000001 0000000000000000 
ffff830839bbff20:  0000000000000000 0000000000000000 
ffff830839bbff30:  0000000000000000 0000000000000009 
ffff830839bbff40:  ffff8803dda20010 0000000000000246 
ffff830839bbff50:  0000000000000000 00000000fffffffe 
ffff830839bbff60:  0000000000000000 0000000000000000 
ffff830839bbff70:  ffffffff800033aa 0000000000000000 
ffff830839bbff80:  0000000000000000 0000000000000001 
ffff830839bbff90:  0000010000000000 ffffffff800033aa 
ffff830839bbffa0:  000000000000e033 0000000000000246 
ffff830839bbffb0:  ffff8803dda21f08 000000000000e02b 
ffff830839bbffc0:  0000000000000000 0000000000000000 
ffff830839bbffd0:  0000000000000000 0000000000000000 
ffff830839bbffe0:  000000000000000e ffff8300754fd000 
ffff830839bbfff0:  00000037b9887700 0000000000000000 
ffff830839bc0000:  0000000000000000 0000000000000000

At ffff830839bbffe0 we can see cpu number 0xe. All seems to be fine.
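
The &(timer->cpu) slot at 0x10(%rsp) comes from timer_lock(), entered via
timer_lock_irqsave() right at the top of set_timer(). Roughly (trimmed; the
rcu_read_lock/unlock pair around the loop is left out):

  static inline bool_t timer_lock(struct timer *timer)
  {
      unsigned int cpu;

      for ( ; ; )
      {
          cpu = read_atomic(&timer->cpu);   /* &timer->cpu is the spilled value */
          if ( unlikely(cpu == TIMER_CPU_status_killed) )
              return 0;
          spin_lock(&per_cpu(timers, cpu).lock);
          if ( likely(timer->cpu == cpu) )
              break;
          spin_unlock(&per_cpu(timers, cpu).lock);
      }
      return 1;
  }

  #define timer_lock_irqsave(t, flags) ({       \
      bool_t __x;                               \
      local_irq_save(flags);                    \
      if ( !(__x = timer_lock(t)) )             \
          local_irq_restore(flags);             \
      __x;                                      \
  })

So whatever (possibly already wrong) timer pointer set_timer() receives is
dereferenced here first, which fits the wrong &timer->cpu on the stack.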

> 
> Also this value is surprisingly close to the garbage values in the
> apparently corrupted struct timer. Can you rule out that you've
> had a stack overflow somewhere (iirc these lead to immediate
> crashes only when using debug builds)?

As mentioned I analyzed the csched_pcpu areas of all processors.
The csched_pcpu of cpu 15 is ffff830839bcd550 - ffff830839bcd59f and
looks OK. The wrong timer address ffff830839bcd5c8 lies shortly after it.

IMHO the problem is not the stack but the address of the struct timer passed to
set_timer().
But this address is calculated from the cpu number returned by smp_processor_id(),
which looks OK, and CSCHED_PCPU(cpu) looks OK too.
So I have no idea.

> 
> > And this is &timer->cpu, which is used (and moved to the stack) in timer_lock()
> > via the macro timer_lock_irqsave().
> > Calculating back to the struct timer, this is our wrong address from the panic.
> > This would mean set_timer() was already called with the wrong address.
> > 
> > On the other hand I had a look at all 48 CSCHED_PCPU(cpu) areas and they all
> > seem to be fine. At least ticker->cpu always corresponds to the cpu used in
> > __per_cpu_offset[cpu].
> > 
> > So I think we have a really short path where the error can happen:
> > 
> > csched_tick_resume(const struct scheduler *ops, unsigned int cpu)
> >   ...
> >   spc = CSCHED_PCPU(cpu);
> >   ...
> >   set_timer(&spc->ticker, ...);
> > 
> > As already said above, the cpu used in csched_tick_resume() comes from
> > smp_processor_id() and looks fine. Very strange!
> 
> Indeed. Has this been seen more than once, i.e. is there hope
> that if you added some debugging code it could be caught earlier?

As mentioned earlier, this was never seen before, and unfortunately it happened
on a customer system.

Dietmar.

> 
> Jan
> 

-- 
Company details: http://ts.fujitsu.com/imprint.html

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

