
Re: [Xen-devel] Kernel panic with tboot E820_UNUSABLE region



On 14/05/13 12:06, Aurelien Chartier wrote:
> Hi everybody,
> 
> We noticed a crash in the Linux dom0 early boot sequence when running over
> tboot and Xen. The issue seems related to an E820 region that tboot is
> setting as E820_UNUSABLE. We posted to tboot-devel to better understand
> what could be causing the kernel panic. The thread can be read
> here:
> http://sourceforge.net/mailarchive/forum.php?thread_name=51852B26.7070406%40citrix.com&forum_name=tboot-devel
> 
> Following Konrad's advice, we took a closer look at arch/x86/xen/setup.c
> and found what could be the cause of the kernel panic. I am not familiar
> with that part of Xen, so feel free to correct me.
> 
> The Xen memory setup code called during early boot is trying to release
> chunks of memory in xen_set_identity_and_release for non-RAM regions
> (including E820_UNUSABLE). The xen_set_identity_and_release_chunk
> function is calling HYPERVISOR_update_va_mapping, which will fail in our
> case. As tboot marked that region as being unusable, Xen did not map
> those pages and the later call on get_page_from_l1e (arch/x86/mm.c in
> Xen code) is returning an error.  As the return value of the hypercall
> is not checked in Linux code, xen_set_identity_and_release_chunk
> function is carrying on and tries to release the E820_UNUSABLE chunk.
> This is apparently messing up some Xen internal memory structures,
> resulting in a kernel crash when Linux is initializing its memory mapping.

That does not sound quite right to me.  xen_set_identity_and_release()
is releasing RAM pfns that overlap with holes in the machine memory map
and get_page_from_l1e() should always succeed.  The fact that they're
overlapping with something marked as UNUSABLE shouldn't matter, since it's
no different from any of the other holes.

Is tboot causing Xen to do something weird like leaving holes in dom0's
initial memory allocation?

I would also check what max_pfn_mapped is.  Perhaps it's miscalculated
and we're trying to update mappings that don't exist?

David

> 
> A possible fix I have tried is to check the return value of
> HYPERVISOR_update_va_mapping and if encountering an error, exit from
> xen_set_identity_and_release_chunk. This is fixing the kernel panic, but
> I am not sure about other implications by that change.
> 
> Any ideas about this issue?
> 
> Thanks in advance,
> Aurelien
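
For reference, the return-value check described in the quoted message could
be sketched roughly as below. This is a standalone userspace mock, not the
actual patch: mock_update_va_mapping(), the "unmapped" pfn range, and
release_chunk() are hypothetical stand-ins for HYPERVISOR_update_va_mapping
and xen_set_identity_and_release_chunk, and the real code in
arch/x86/xen/setup.c is structured differently.

```c
#include <stdio.h>

/* Hypothetical range of pfns that the mock hypervisor refuses to map. */
static const unsigned long bad_start = 0x100, bad_end = 0x110;

/*
 * Hypothetical stand-in for HYPERVISOR_update_va_mapping():
 * returns 0 on success and a negative value on failure.
 */
static int mock_update_va_mapping(unsigned long pfn)
{
    return (pfn >= bad_start && pfn < bad_end) ? -1 : 0;
}

/*
 * Walk [start_pfn, end_pfn) and stop at the first failed mapping update
 * instead of ignoring the return value and carrying on.
 * Returns the number of pfns processed before stopping.
 */
static unsigned long release_chunk(unsigned long start_pfn,
                                   unsigned long end_pfn)
{
    unsigned long pfn, released = 0;

    for (pfn = start_pfn; pfn < end_pfn; pfn++) {
        if (mock_update_va_mapping(pfn) != 0) {
            fprintf(stderr,
                    "update_va_mapping failed for pfn %#lx, stopping\n",
                    pfn);
            break;
        }
        released++;
    }
    return released;
}
```

Bailing out like this only avoids acting on a failed update; it does not
explain why the hypercall fails in the first place, which is the open
question in this thread.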

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

