 
	
[Xen-devel] Xen Crashes when releasing gnttab mappings - of a crashed domain.
 
Observation:
------------
When two miniOs instances are connected through a shared ring, Xen itself
(not just a domain) crashes when the miniOs instances exit.
Xen panics and produces the following:
(XEN) Xen call trace:
(XEN)    [<ff11d20d>] __bug+0x29/0x45
(XEN)    [<ff107cb3>] gnttab_release_mappings+0xcb/0x2e5
(XEN)    [<ff1046dd>] domain_kill+0x29/0x62
(XEN)    [<ff10349a>] do_domctl+0x6d6/0xfbc
(XEN)    [<ff165755>] hypercall+0x95/0xb5
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) BUG at grant_table.c:1122
(XEN) ****************************************
The cause:
----------
Xen tries to release the grant table mappings by accessing the remote
domain's grant table.
But the remote domain no longer seems to exist, and consequently Xen
fails:
find_domain_by_id (in gnttab_release_mappings) returns NULL.
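To make the failure mode concrete, here is a toy model of a domain lookup that returns NULL once the remote domain is gone. It is only a sketch: every toy_* name is hypothetical and none of this is Xen's code. It shows a caller detecting the missing domain, which is the condition that ends in BUG() in the trace above.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DOMAINS 4

/* Hypothetical stand-in for a domain table entry (not Xen's struct domain). */
struct toy_dom {
    int id;
    int in_use;   /* 0 once the domain has been torn down */
};

static struct toy_dom toy_table[TOY_MAX_DOMAINS];

/* Return the live domain with this id, or NULL if none exists. */
static struct toy_dom *toy_find_domain_by_id(int id)
{
    for (size_t i = 0; i < TOY_MAX_DOMAINS; i++)
        if (toy_table[i].in_use && toy_table[i].id == id)
            return &toy_table[i];
    return NULL;
}

/* Walk a list of remote domain ids and count the lookups that fail.
 * In the reported crash, the equivalent failed lookup ends in BUG(). */
static int toy_count_missing(const int *remote_ids, size_t n)
{
    int missing = 0;

    assert(remote_ids != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (toy_find_domain_by_id(remote_ids[i]) == NULL)
            missing++;
    return missing;
}
```

With two live domains the count is 0. Once one of them is marked as no longer in use, its lookup returns NULL and the count becomes 1.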
Analysis:
---------
The situation described above should never happen: if I understand
correctly, a domain should not be completely destroyed while there are
still references to it.
See: put_domain(_d) // sched.h
Which is defined as follows:
if ( atomic_dec_and_test(&(_d)->refcnt) ) domain_destroy(_d)
It does, however, happen when a domain crashes.
Note that there are two ways to "finish" with a domain (domain.c):
1.      domain_kill (which calls domain_destroy) - releases all
        resources in a graceful manner.
2.      __domain_crash (which calls domain_shutdown) - seems to
        kill the domain without properly releasing the resources
        that reference it.
        (This function is called in extreme cases.)
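The reference-counting rule discussed in this analysis can be illustrated with a small model. This is a minimal sketch under stated assumptions: the toy_* names are hypothetical, a plain int replaces Xen's atomic counter, and a flag stands in for domain_destroy(). It only shows the expected invariant, that destruction happens on the last put and not before.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for Xen's struct domain; only what the model needs. */
struct toy_domain {
    int  refcnt;     /* plain int here; Xen uses an atomic counter */
    bool destroyed;  /* set when the last reference is dropped */
};

/* Take a reference; refuses once the domain has already been destroyed. */
static bool toy_get_domain(struct toy_domain *d)
{
    if (d->destroyed)
        return false;
    d->refcnt++;
    return true;
}

/* Drop a reference; only the last put triggers destruction. */
static void toy_put_domain(struct toy_domain *d)
{
    assert(d->refcnt > 0);
    if (--d->refcnt == 0)
        d->destroyed = true;   /* stands in for domain_destroy(d) */
}
```

Under this model, a domain that still holds a reference on behalf of a grant mapping cannot disappear, so a lookup during gnttab_release_mappings would be expected to succeed. The crash path described above appears to break that expectation.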
Our scenario:
-------------
We are running two miniOs instances with the same profile:
Open a ring (share a page through a grant ref and map a page from the
remote domain)
Write
Read
Close the ring (dealloc, unmap*)
do_exit()
Timeline ->
MiniOs 1:  ..........  calls do_exit() ->
                         .. domain_kill() ->
                              .. gnttab_release_mappings() ->
                                   .. BUG()
MiniOs 2:  crashes**
                 
*When we unmap, we use Xen's hypercall to unmap a grant reference
and the gnttab_unmap_grant_ref structure.
Note that we have a bug: we do NOT set unmap_op.dev_bus_addr to 0 as we
should.
Xen's API (in public/grant_table.h) explicitly states that it should
be 0; otherwise the grant reference is treated as a valid device mapping.
** Because of the bug described in *, we cause the domain to crash.
We observe:
(XEN) grant_table.c:394: Bad frame number doesn't match gntref
(XEN) mm.c:760: Attempt to implicitly unmap a granted PTE 
(XEN) domain_crash called from mm.c:761 
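For illustration, here is a sketch of preparing the unmap operation so that dev_bus_addr is always 0. The struct below is a local stand-in that mirrors the fields of gnttab_unmap_grant_ref as I understand them. Its exact layout and the handle type are assumptions, and real code should use the definition from public/grant_table.h. The helper name toy_prepare_unmap is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed handle type; check public/grant_table.h for the real one. */
typedef uint32_t toy_grant_handle_t;

/* Local stand-in for gnttab_unmap_grant_ref (layout is an assumption). */
struct toy_unmap_grant_ref {
    uint64_t host_addr;          /* IN: address the grant was mapped at */
    uint64_t dev_bus_addr;       /* IN: must be 0 when no device mapping exists */
    toy_grant_handle_t handle;   /* IN: handle returned by the map operation */
    int16_t  status;             /* OUT: result code filled in by Xen */
};

/* Fill in an unmap request, zeroing every field that is not set explicitly,
 * so stale stack contents can never end up in dev_bus_addr. */
static void toy_prepare_unmap(struct toy_unmap_grant_ref *op,
                              uint64_t host_addr,
                              toy_grant_handle_t handle)
{
    assert(op != NULL);
    memset(op, 0, sizeof(*op));
    op->host_addr    = host_addr;
    op->handle       = handle;
    op->dev_bus_addr = 0;        /* explicit: this is not a device mapping */
}
```

The prepared structure would then be passed to the grant-table hypercall (GNTTABOP_unmap_grant_ref). Zeroing the whole structure first is a defensive choice: it covers the case where the operation lives on the stack and a field is forgotten.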
Summary:
-----------
1. Setting unmap_op.dev_bus_addr to 0 removes the BUG and all is well.
2. But crashing Xen, even as a result of our error, doesn't seem to be a
healthy choice. :)
Micha.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
 
 