Re: [Xen-ia64-devel] EFI Mapping Windows Install Crash Bug
On Tue, Jul 01, 2008 at 05:34:42PM +1000, Simon Horman wrote:
> On Tue, Jul 01, 2008 at 04:07:53PM +0900, Isaku Yamahata wrote:
> > On Tue, Jul 01, 2008 at 11:03:28AM +1000, Simon Horman wrote:
> > > Hi,
> > >
> > > I'm a bit hesitant to jump the gun, but I think that I might have
> > > isolated the cause of win2k3-sp2 crashing during install when my EFI
> > > Mapping patches are applied. Well, perhaps not the cause, but I think I
> > > know where it is dying.
> > >
> > > Quickly as background, the EFI Mapping patches move the mapping
> > > that EFI is taught on boot time to map memory where Linux places
> > > it ( basically pa + (0xe<<60) ) instead of where Xen usually
> > > places it ( basically pa + (0xf<<60) ). In order to protect this
> > > mapping from HVM domains a special region id is used. The
> > > hypervisor switches to that region id just before making any
> > > PAL, SAL or EFI calls, and switches back to the previous region
> > > id once the call completes. As region 7 has to be changed,
> > > entries that are pinned into the TLB have to be repinned. And
> > > that is roughly where the fun begins.
> > >
> > > As for the problem? It seems to be caused by ia64_mca_cpe_int_caller()
> > > calling ia64_log_queue() which calls ia64_sal_get_state_info(). I
> > > believe that the hypervisor dies in ia64_log_queue() somewhere after
> > > ia64_sal_get_state_info() completes. That is, I am suspecting that the
> > > call to ia64_sal_get_state_info() is returning bogus data.
> >
> > Is ia64_mca_cpe_int_caller() called in interrupt context?
> > If so, ia64_log_queue() calls xmalloc() which can't be called
> > from interrupt context. Then xen VMM crashes at ASSERT(!in_irq())
> > in _xmalloc().
>
> That is a good point. Although xmalloc() is only called if
> ia64_sal_get_state_info() returns a non-zero value. Which
> according to tracing that I have done this afternoon, does
> not seem to be the case (when ia64_log_queue() is called
> from other places in mca.c).
>
> How can I check if the call is being made in interrupt context?

in_irq()?
Anyway, I noticed that ia64_mca_cpe_int_caller() is an irq handler, so
it is always called from interrupt context.
So ia64_log_queue() has to be fixed for the case where
ia64_sal_get_state_info() returns > 0.

> Also, after some more investigation, I now believe that the hypervisor
> is locking up inside ia64_sal_get_state_info() not later on in
> ia64_log_queue() as I thought this morning.
>
> > > Furthermore, my traces seem to indicate that the problem arises when the
> > > call to ia64_log_queue() and in turn to ia64_sal_get_state_info() is
> > > made when the region id is already switched to make some other PAL, SAL
> > > or EFI call (though I doubt it is particularly important which one).
> > >
> > > This seems to make sense to me as ia64_mca_cpe_int_caller() is
> > > "Triggered by sw interrupt from CPE polling routine.".
> > >
> > > I am unsure about what to do about this problem, but for testing
> > > purposes I simply removed the call to ia64_log_queue() from
> > > ia64_mca_cpe_int_caller() and things seem to work.
> > >
> > > When I say seem to work, this bug does not manifest every time I install
> > > win2k3-sp2. So it can be hard to tell if a change has improved things or
> > > not. But for now, I have not seen a crash occur with this hack in place
> > > (+ various other changes which may or may not be relevant, but this one
> > > seems to be particularly important).
> > >
> > > I will investigate my theory that things die in ia64_log_queue()
> > > further. But I wonder if there might be a way to permanently remove/move
> > > the call to ia64_log_queue() out of ia64_mca_cpe_int_caller() and
> > > possibly other PAL, SAL or EFI calls inside other MCA code.
> > >

-- 
yamahata

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel
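
One possible shape for the fix discussed in this message, given only as a
sketch: the interrupt handler merely records that a log is pending, and the
firmware query plus the allocation happen later, outside interrupt context.
Everything below is a hypothetical stand-in written for illustration (the
sketch_* names, the flag standing in for in_irq(), the fake record); it is
not the Xen mca.c code and does not claim to match its interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are real Xen symbols. */
static bool sketch_in_irq;           /* models what in_irq() would report */
static bool sketch_log_pending;      /* set by the handler, consumed later */
static size_t sketch_records_queued; /* counts records handled so far */

/* Models an allocator that must not be used in interrupt context. */
static void *sketch_alloc(size_t size)
{
    assert(!sketch_in_irq);
    return malloc(size);
}

/* Models the firmware query: returns the record size, or 0 if none. */
static size_t sketch_get_state_info(char *buf, size_t buflen)
{
    static const char record[] = "cpe-record";

    if (buflen < sizeof(record))
        return 0;
    memcpy(buf, record, sizeof(record));
    return sizeof(record);
}

/* Interrupt handler: no firmware call and no allocation, just a flag. */
static void sketch_cpe_int_handler(void)
{
    sketch_in_irq = true;
    sketch_log_pending = true;
    sketch_in_irq = false;
}

/* Deferred path: runs outside interrupt context, so allocating is fine. */
static bool sketch_process_pending_logs(void)
{
    char buf[64];
    size_t len;
    void *copy;

    if (!sketch_log_pending)
        return false;
    sketch_log_pending = false;

    len = sketch_get_state_info(buf, sizeof(buf));
    if (len == 0)
        return false;

    copy = sketch_alloc(len);
    if (copy == NULL)
        return false;
    memcpy(copy, buf, len);
    sketch_records_queued++;
    free(copy); /* a real queue would keep the record; the sketch drops it */
    return true;
}
```

Calling sketch_cpe_int_handler() and then sketch_process_pending_logs() from
ordinary context handles one record. Deferring in this way would also keep
the firmware call out of the handler, which is the move asked about in the
quoted message; whether that suits the real MCA code is an open question
here, not something this sketch settles.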