
Re: [Xen-devel] Xen BUG at page_alloc.c:1738 (Xen 4.5)



On Fri, 29 May 2015, Andrew Cooper wrote:

> On 29/05/15 12:17, M A Young wrote:
> >
> >>> I did a bit of testing - xen-4.5.1-rc1 built on Fedora 22 (gcc5) doesn't 
> >>> boot for me, but if I replace xen.gz with one from the same code built on 
> >>> Fedora 21 (gcc4) then it does boot. There are rpms and build logs 
> >>> available via 
> >>> http://copr.fedoraproject.org/coprs/myoung/xentest/build/93366/
> >>> if anyone else wants to do some testing.
> >>>
> >>>   Michael Young
> >> Do you have easy access to xen-syms from each build?
> > Yes.
> >
> 
> Thank you very much.
> 
> GCC 5 is indeed miscompiling the code. Comparing the fc21 vs fc22 builds:
> 
> The C snippet from mmio_ro_do_page_fault():
> 
> struct page_info *page = mfn_to_page(mfn);
> struct domain *owner = page_get_owner_and_reference(page);
> if ( owner )
>     put_page(page);
> 
> In fc21 is:
> 
> movabs $0xffff82e000000000,%rbp
> shr    %cl,%rax
> or     %rdx,%rax
> shl    $0x5,%rax
> add    %rax,%rbp
> mov    %rbp,%rdi
> callq  ffff82d080186900 <page_get_owner_and_reference>
> test   %rax,%rax
> mov    %rax,%r12
> je     ffff82d080189c4e <mmio_ro_do_page_fault+0x11e>
> mov    %rbp,%rdi
> callq  ffff82d080188ec0 <put_page>
> 
> and in fc22 is:
> 
> movabs $0xffff82e000000000,%r8
> shr    %cl,%rax
> or     %rdx,%rax
> shl    $0x5,%rax
> lea    (%r8,%rax,1),%rdi
> callq  ffff82d0801874f0 <page_get_owner_and_reference>
> test   %rax,%rax
> mov    %rax,%rbp
> je     ffff82d08018ca14 <mmio_ro_do_page_fault+0x114>
> mov    %r8,%rdi
> callq  ffff82d080189a90 <put_page>
> 
> "lea (%r8,%rax,1),%rdi" in FC22 is slightly shorter than "add %rax,%rbp;
> mov %rbp,%rdi" in FC21.  In both cases %rdi is now 'page' from the C
> snippet.
> 
> In FC21, the result is stored in %rbp, a callee-saved register, then
> reloaded from %rbp into %rdi for the call to put_page().
> 
> However, in FC22, the result of the calculation is only held in %rdi,
> and clobbered by the call to page_get_owner_and_reference().  When it
> comes to call put_page(), %rdi is reloaded from %r8, which is still a
> pointer to the base of the frametable, not the page we actually took a
> reference on.
> 
> FC22 is miscompiling the C to:
> 
> struct page_info *page = mfn_to_page(mfn);
> struct domain *owner = page_get_owner_and_reference(page);
> if ( owner )
>     put_page(mfn_to_page(0));
> 
> which is wrong, and is why free_domheap_pages() legitimately complains
> about the wonky refcount.

With a bit of experimentation I have found that compiling with the 
-fno-caller-saves flag gets this code segment back to the Fedora 21 
version, thus avoiding the bug.
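
For anyone who wants to see the effect of the bad register reuse outside
Xen, the following is a minimal user-space sketch of the two code paths
Andrew describes. It is not Xen code: the toy frametable, the single
count field, the stub function bodies and reset_frametable() are all
assumptions made for illustration, and only the function names mirror
the snippet above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for Xen's struct page_info. The real structure is 32 bytes
 * (hence the "shl $0x5" in the disassembly); this one only carries a
 * reference count. */
struct page_info {
    long count;
};

struct domain {
    int id;
};

#define NR_PAGES 8

/* Hypothetical miniature frametable; mfn 0 is its base entry. */
static struct page_info frametable[NR_PAGES];
static struct domain dummy_owner = { 0 };

/* Give every page one pre-existing reference. */
static void reset_frametable(void)
{
    for (size_t i = 0; i < NR_PAGES; i++)
        frametable[i].count = 1;
}

static struct page_info *mfn_to_page(unsigned long mfn)
{
    assert(mfn < NR_PAGES);
    return &frametable[mfn];   /* base + mfn * sizeof(struct page_info) */
}

/* Stub: always succeeds and takes one reference. */
static struct domain *page_get_owner_and_reference(struct page_info *page)
{
    page->count++;
    return &dummy_owner;
}

/* Stub: drops one reference. */
static void put_page(struct page_info *page)
{
    page->count--;
}

/* What the C source asks for, and what the fc21 build does. */
static void correct_path(unsigned long mfn)
{
    struct page_info *page = mfn_to_page(mfn);
    struct domain *owner = page_get_owner_and_reference(page);
    if (owner)
        put_page(page);
}

/* What the fc22 build effectively does: the reference is dropped on the
 * base of the frametable instead of on the page that was referenced. */
static void miscompiled_path(unsigned long mfn)
{
    struct page_info *page = mfn_to_page(mfn);
    struct domain *owner = page_get_owner_and_reference(page);
    if (owner)
        put_page(mfn_to_page(0));
}
```

Calling reset_frametable() and then correct_path(3) leaves every count
unchanged. Calling miscompiled_path(3) instead leaks a reference on page 3
and drops one from page 0 that was never taken, which is the kind of
imbalance that a later refcount check would trip over.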

        Michael Young
