Re: [Xen-devel] Question about Xen reboot on panic
I think machine_restart() may have a bug. :-(

2015-11-12 11:13 GMT-05:00 Meng Xu <xumengpanda@xxxxxxxxx>:
> Hi Andrew,
>
> I thought I might find where the system got stuck.
>
> As you suggested, I add several printks inside machine_restart();
> If the machine restart when Xen kernel crashes, I can see the following
> output:
> > umount: /run/lock: not mounted
> > umount: /run/shm: not mounted
> > * Will now restart
> > [ 122.261583] Restarting system.
> > (XEN) Domain 0 shutdown: rebooting machine.
> > (XEN) machine_restart start running
> (This is what I added at the first line of the machine_restart())
> > (XEN) machine_restart start running
> > (XEN) reboot_type=97
> > (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
>
> So when the machine reboots correctly at Xen kernel crash, the
> machine_restart will be called twice.
>
> After looking into the code, I found the following code in the
> machine_restart(), which is quite suspicious.
> >     if ( system_state >= SYS_STATE_smp_boot )
> >     {
> >         local_irq_enable();
> >
> >         /* Ensure we are the boot CPU. */
> >         if ( get_apic_id() != boot_cpu_physical_apicid )

If we are already on the boot CPU and this condition still evaluates
to true,

> >         {
> >             /* Send IPI to the boot CPU (logical cpu 0). */
> >             on_selected_cpus(cpumask_of(0), __machine_restart,
> >                              &delay_millisecs, 0);

we will send an IPI from CPU 0 to CPU 0 to run __machine_restart(),

> >             for ( ; ; )
> >                 halt();

and CPU 0 will halt immediately. If the IPI arrives on CPU 0 after
that, CPU 0 won't be able to handle it, since it has already halted.

*** One possible solution ***

Maybe we should check whether the current CPU is CPU 0 by using
smp_processor_id() instead.
The only concern I have is that I'm not sure whether machine_restart()
could be rescheduled by the Xen scheduler onto another CPU after we
call smp_processor_id().

*** The result below confirms my guess ***

I printed the CPU that sends out the IPI, and the following output
confirms my speculation:

(XEN) Reboot in five seconds...
(XEN) now we should see: before kexec_crash
(XEN) before kexec_crash
(XEN) after kexec_crash
(XEN) machine_restart start running, delay_millisecs=5000
(XEN) machine_restart: finished console_start_sync, system_state is 3
(XEN) On P0

As the last line suggests, P0 sends P0 an IPI and P0 goes to halt
immediately...

Thanks,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
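The suspected failure mode in the message above can be sketched as a
small user-space model in C. This is not Xen code. The struct, the enum
and every function name below are hypothetical, and the APIC ID values
in the self-check are invented for illustration. The model also takes
the message's hypothesis as given: that the APIC ID read on logical
CPU 0 can differ from boot_cpu_physical_apicid. It only compares what
the quoted check and the proposed smp_processor_id()-style check would
decide in that case, and it does not address the open question about
being moved to another CPU after the check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the values the restart path would look at. */
struct restart_ctx {
    unsigned int logical_cpu;   /* stands in for smp_processor_id() */
    unsigned int apic_id;       /* stands in for get_apic_id() */
    unsigned int boot_apic_id;  /* stands in for boot_cpu_physical_apicid */
};

enum restart_action {
    RESTART_LOCALLY,    /* carry on with the reboot on this CPU */
    FORWARD_AND_HALT    /* IPI logical CPU 0, then spin in halt() */
};

/* Decision modelled on the quoted check: compare APIC IDs. */
static enum restart_action decide_by_apic_id(const struct restart_ctx *c)
{
    return (c->apic_id != c->boot_apic_id) ? FORWARD_AND_HALT
                                           : RESTART_LOCALLY;
}

/* Decision modelled on the proposal: compare the logical CPU number
 * with the IPI target, which the quoted code fixes at logical CPU 0. */
static enum restart_action decide_by_logical_cpu(const struct restart_ctx *c)
{
    return (c->logical_cpu != 0) ? FORWARD_AND_HALT : RESTART_LOCALLY;
}

/* The suspected hang: logical CPU 0 sends the IPI to itself, then halts. */
static bool self_ipi_then_halt(const struct restart_ctx *c,
                               enum restart_action action)
{
    return action == FORWARD_AND_HALT && c->logical_cpu == 0;
}

/* Small self-check that walks through the scenario from the message. */
void restart_model_selfcheck(void)
{
    /* Invented values: on logical CPU 0, but the APIC IDs disagree. */
    struct restart_ctx on_p0 = { 0, 2, 0 };

    /* The APIC ID check forwards the restart, so P0 would IPI P0. */
    assert(self_ipi_then_halt(&on_p0, decide_by_apic_id(&on_p0)));

    /* The logical CPU check restarts locally, with no self-IPI. */
    assert(!self_ipi_then_halt(&on_p0, decide_by_logical_cpu(&on_p0)));
}
```

Calling restart_model_selfcheck() exercises both checks on the "On P0"
scenario. When the two APIC IDs agree on logical CPU 0, both checks
choose RESTART_LOCALLY, so the model only separates them in the
mismatch case that the message speculates about.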