Re: [Xen-devel] Question about Xen reboot on panic
I think machine_restart() may have a bug. :-(
2015-11-12 11:13 GMT-05:00 Meng Xu <xumengpanda@xxxxxxxxx>:
> Hi Andrew,
>
> I think I may have found where the system gets stuck.
>
> As you suggested, I added several printks inside machine_restart().
> When the machine does restart after the Xen kernel crashes, I see the
> following output:
>
> umount: /run/lock: not mounted
> umount: /run/shm: not mounted
> * Will now restart
> [ 122.261583] Restarting system.
> (XEN) Domain 0 shutdown: rebooting machine.
> (XEN) machine_restart start running
> (This is what I added at the first line of the machine_restart())
> (XEN) machine_restart start running
> (XEN) reboot_type=97
> (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
>
> So when the machine reboots correctly after a Xen kernel crash,
> machine_restart() is called twice.
>
> After looking into the code, I found the following code in
> machine_restart(), which looks quite suspicious.
>
> if ( system_state >= SYS_STATE_smp_boot )
> {
>     local_irq_enable();
>
>     /* Ensure we are the boot CPU. */
>     if ( get_apic_id() != boot_cpu_physical_apicid )
If we are on the boot CPU and this if condition still evaluates to true,
>     {
>         /* Send IPI to the boot CPU (logical cpu 0). */
>         on_selected_cpus(cpumask_of(0), __machine_restart,
>                          &delay_millisecs, 0);
we will send an IPI from CPU 0 to CPU 0 to run __machine_restart(),
>         for ( ; ; )
>             halt();
and CPU 0 will halt immediately.
If the IPI arrives at CPU 0 after that, CPU 0 won't be able to handle it,
since it has already halted.
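
To make the failure mode concrete, here is a minimal stand-alone C model of the branch above. It is not Xen code: the enum, model_restart() and its parameters are hypothetical names made up for illustration, and it assumes the IPI is always aimed at logical CPU 0, as in the quoted on_selected_cpus() call.

```c
/* Hypothetical outcomes of the restart path; these names are
 * illustrative only and do not exist in Xen. */
enum restart_action {
    RESTART_LOCALLY,    /* caller performs the reset itself */
    FORWARD_AND_HALT,   /* IPI goes to a different CPU, caller halts */
    SELF_IPI_AND_HALT   /* IPI targets the caller, which then halts */
};

/*
 * Model of the quoted check: the decision is made by comparing APIC IDs,
 * but the IPI target is always logical CPU 0.
 */
static enum restart_action model_restart(unsigned int cur_cpu,
                                         unsigned int cur_apic_id,
                                         unsigned int boot_apic_id)
{
    if (cur_apic_id == boot_apic_id)
        return RESTART_LOCALLY;

    /* APIC IDs differ: the sender may still be logical CPU 0. */
    return cur_cpu == 0 ? SELF_IPI_AND_HALT : FORWARD_AND_HALT;
}
```

Under these assumptions, model_restart(0, 1, 0) ends in SELF_IPI_AND_HALT: logical CPU 0 whose APIC ID does not match the recorded boot APIC ID, which is the case I believe I am hitting.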
*** I have one solution in mind ***
Maybe we should check whether the current CPU is CPU 0 by using
smp_processor_id(). My only concern is that I'm not sure whether
machine_restart() could be rescheduled by the Xen scheduler onto another
CPU after we call smp_processor_id().
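
A rough sketch of that idea as a stand-alone C model, not a patch against Xen. struct cpu_view and the three helper functions are hypothetical; the model only compares which condition decides to forward the restart to CPU 0, and it assumes the caller stays on the same CPU between reading its ID and acting on it (which is exactly the part I am unsure about).

```c
#include <stdbool.h>

/* Hypothetical snapshot of the values the check could look at. */
struct cpu_view {
    unsigned int logical_id;  /* what smp_processor_id() would report */
    unsigned int apic_id;     /* what get_apic_id() would report */
};

/* Check as quoted: forward whenever the APIC IDs differ. */
static bool forwards_by_apic_id(struct cpu_view cur, unsigned int boot_apic_id)
{
    return cur.apic_id != boot_apic_id;
}

/* Proposed check: forward only when we are not logical CPU 0. */
static bool forwards_by_logical_id(struct cpu_view cur)
{
    return cur.logical_id != 0;
}

/* A forward is a self-IPI when the sender is already the target (CPU 0). */
static bool is_self_ipi(struct cpu_view cur, bool forwards)
{
    return forwards && cur.logical_id == 0;
}
```

For a CPU 0 whose APIC ID differs from the recorded boot APIC ID, the APIC-ID check forwards (a self-IPI), while the logical-ID check does not forward at all, so CPU 0 would go on to do the restart itself.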
*** The result below confirms my guess ***
I printed out which CPU sends the IPI, and the following result
confirms my speculation:
(XEN) Reboot in five seconds...
(XEN) now we should see: before kexec_crash
(XEN) before kexec_crash
(XEN) after kexec_crash
(XEN) machine_restart start running, delay_millisecs=5000
(XEN) machine_restart: finished console_start_sync, system_state is 3
(XEN) On P0
As the last line shows, P0 sends itself an IPI and then halts immediately...
Thanks,
Meng
-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel