[Xen-ia64-devel] [RFC] MCA handler support for Xen/ia64
Hi all,

This is a design memo for the MCA handler for Xen/ia64.
We would welcome reviews and comments.

1. Basic design

 - The MCA/CMC/CPE handler of Xen/ia64 reuses Linux code as much as
   possible.
 - The CMC/CPE interruption is injected into dom0 for logging. This
   interruption is not injected into domU or domVTI.
 - If the MCA interruption is a TLB check, the MCA handler converts the
   MCA into a CMC interruption and injects it into dom0. This
   interruption is not injected into domU or domVTI.
 - If the MCA interruption is not a TLB check, the MCA handler does not
   try to recover, and Xen/ia64 reboots.

2. Detailed design

2.1 Initialization of the MCA handler

 The processing sequence is basically as follows.

  1) Clear the rendezvous check-in flag for all CPUs.
  2) Register the rendezvous interrupt vector with SAL.
  3) Register the wakeup interrupt vector with SAL.
  4) Register the Xen/ia64 MCA handler with SAL.
  5) Configure the CMCI/P vector and handler. CMC interrupts are
     per-processor, so the AP CMC interrupts are set up in
     smp_callin() (smpboot.c).
  6) Set up the MCA rendezvous interrupt vector.
  7) Set up the MCA wakeup interrupt vector.
  8) Set up the CPEI/P handler.
  9) Initialize the areas set aside by Xen/ia64 to buffer the
     platform/processor error states for MCA/CMC/CPE handling.
 10) Read the MCA error record for logging (by dom0) if Xen has been
     rebooted due to an unrecoverable MCA.

2.2 MCA handler (TLB error only)

 The processing sequence is basically as follows.

  1) Get processor state parameter on existing PALE_CHECK.
     Then purge TR and TC, and reload TR.
  2) Call ia64_mca_handler().
  3) Wait for the slave processors to check in.
  4) Wake up all the processors that are spinning in the rendezvous
     loop.
  5) Get the MCA error record, and hold it in Xen/ia64 for logging by
     dom0.
  6) Clear the MCA error record.
  7) Inject the external CMC interruption into dom0.
  8) Set IA64_MCA_CORRECTED in the ia64_sal_os_state struct.
  9) Return to SAL and resume the interrupted processing.

2.3 MCA handler (TLB error together with other errors)

 The processing sequence is basically as follows.

  1) Get processor state parameter on existing PALE_CHECK.
     Then purge TR and TC, and reload TR.
  2) Call ia64_mca_handler().
  3) Wait for the slave processors to check in.
  4) Wake up all the processors that are spinning in the rendezvous
     loop.
  5) Get the MCA error record, and save it in Xen/ia64 for logging by
     dom0 after reboot. [*1]
  6) Return to SAL and reboot Xen/ia64.

2.4 MCA handler (not a TLB error)

 The processing sequence is basically as follows.

  1) Get processor state parameter on existing PALE_CHECK.
  2) Call ia64_mca_handler().
  3) Wait for the slave processors to check in.
  4) Wake up all the processors that are spinning in the rendezvous
     loop.
  5) Get the MCA error record, and save it in Xen/ia64 for logging by
     dom0 after reboot. [*1]
  6) Return to SAL and reboot Xen/ia64.

2.5 CMC handler

 The processing sequence is basically as follows.

  1) Call ia64_mca_cmc_int_handler() from __do_IRQ() in
     ia64_handle_irq().
  2) Get the MCA error record, and save it in Xen/ia64 for logging by
     dom0 after reboot. [*1]
  3) Inject the external CMC interruption into dom0.

2.6 CPE handler

 Same as CMC.

2.7 SAL emulation for Dom0/DomU/DomVTI

 The following SAL emulation procedures are added.

  - SAL_SET_VECTORS
  - SAL_GET_STATE_INFO
  - SAL_GET_STATE_INFO_SIZE
  - SAL_CLEAR_STATE_INFO
  - SAL_MC_SET_PARAMS

Note:
 [*1]: In practice, the MCA error record is read again after Xen/ia64
       has rebooted, and dom0 then logs it.

Best regards,
 Yutaka Ezaki
 Masaki Kanno

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel
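
As a reading aid for sections 2.2 to 2.4, the three MCA cases can be summarized as one decision on two inputs: whether the MCA reports a TLB check, and whether it reports any other kind of check. The following is only a rough sketch of that policy, not proposed handler code. All names in it (struct mca_plan, plan_mca() and the field names) are hypothetical and do not exist in Xen or Linux; the real handler would derive the two inputs from the processor state parameter and the error record.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical; none are taken from Xen or Linux. */
struct mca_plan {
    bool purge_and_reload_tr;  /* purge TR and TC, then reload TR    */
    bool keep_record_for_dom0; /* hold/save the error record in Xen  */
    bool clear_record;         /* clear the MCA error record         */
    bool inject_cmc_to_dom0;   /* report the event to dom0 as a CMC  */
    bool mark_corrected;       /* set IA64_MCA_CORRECTED for SAL     */
    bool reboot;               /* return to SAL and reboot Xen/ia64  */
};

/* Map "which checks were reported" to the steps listed in 2.2 - 2.4. */
static struct mca_plan plan_mca(bool tlb_check, bool other_check)
{
    struct mca_plan p = { false, false, false, false, false, false };

    /* Assumption of this sketch: an MCA reports at least one check. */
    assert(tlb_check || other_check);

    p.purge_and_reload_tr = tlb_check;  /* done in 2.2 and 2.3, not 2.4 */
    p.keep_record_for_dom0 = true;      /* every case keeps the record  */

    if (tlb_check && !other_check) {
        /* 2.2: TLB error only. Recover and report to dom0 as a CMC. */
        p.clear_record = true;
        p.inject_cmc_to_dom0 = true;
        p.mark_corrected = true;
    } else {
        /* 2.3 and 2.4: no recovery attempt, Xen/ia64 reboots. */
        p.reboot = true;
    }
    return p;
}
```

For example, plan_mca(true, false) selects the recovery path of 2.2, while plan_mca(true, true) and plan_mca(false, true) both end in a reboot and differ only in whether TR and TC are purged first.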