Re: [Xen-devel] Xen 4.7 crash
On Tue, Jun 14, 2016 at 09:38:22AM -0400, Aaron Cornelius wrote:
> On 6/14/2016 9:26 AM, Aaron Cornelius wrote:
> >On 6/14/2016 9:15 AM, Wei Liu wrote:
> >>On Tue, Jun 14, 2016 at 09:11:47AM -0400, Aaron Cornelius wrote:
> >>>On 6/9/2016 7:14 AM, Ian Jackson wrote:
> >>>>Aaron Cornelius writes ("Re: [Xen-devel] Xen 4.7 crash"):
> >>>>>I am not that familiar with the xenstored code, but as far as I can tell
> >>>>>the grant mapping will be held by the xenstore until the xs_release()
> >>>>>function is called (which is not called by libxl, and I do not
> >>>>>explicitly call it in my software, although I might now just to be
> >>>>>safe), or until the last reference to a domain is released and the
> >>>>>registered destructor (destroy_domain), set by talloc_set_destructor(),
> >>>>>is called.
> >>>>
> >>>>I'm not sure I follow. Or maybe I disagree. ISTM that:
> >>>>
> >>>>The grant mapping is released by destroy_domain, which is called via
> >>>>the talloc destructor as a result of talloc_free(domain->conn) in
> >>>>domain_cleanup. I don't see other references to domain->conn.
> >>>>
> >>>>domain_cleanup calls talloc_free on domain->conn when it sees the
> >>>>domain marked as dying in domain_cleanup.
> >>>>
> >>>>So I still think that your acl reference ought not to keep the grant
> >>>>mapping alive.
> >>>
> >>>It took a while to complete the testing, but we've finished trying to
> >>>reproduce the error using oxenstored instead of the C xenstored. When the
> >>>condition occurs that caused the error with the C xenstored (on
> >>>4.7.0-rc4/8478c9409a2c6726208e8dbc9f3e455b76725a33), oxenstored does not
> >>>cause the crash.
> >>>
> >>>So for whatever reason, it would appear that the C xenstored does keep the
> >>>grant allocations open, but oxenstored does not.
> >>>
> >>
> >>Can you provide some easy to follow steps to reproduce this issue?
> >>
> >>AFAICT your environment is very specialised, but we should be able to
> >>trigger the issue with plain xenstore-* utilities?
> >
> >I am not sure if the plain xenstore-* utilities will work, but here are
> >the steps to follow:
> >
> >1. Create a non-standard xenstore path: /tool/test
> >2. Create a domU (mini-os/mirage/something small)
> >3. Add the new domU to the /tool/test permissions list (I'm not 100%
> >sure how to do this with the xenstore-* utilities)
> >   a. call xs_get_permissions()
> >   b. realloc() the permissions block to add the new domain
> >   c. call xs_set_permissions()
> >4. Delete the domU from step 2
> >5. Repeat steps 2-4
> >
> >Eventually the xs_set_permissions() function will return an E2BIG error
> >because the list of domains has grown too large. Sometime after that is
> >when the crash occurs with the C xenstored and the 4.7.0-rc4 version of
> >Xen. It usually takes around 1200 or so iterations for the crash to occur.
>
> After writing up those steps I suddenly realized that I think I have a bug
> in my test that might have been causing the bug in the first place. Once I
> get errors returned from xs_set_permissions() I was not properly cleaning up
> the created domains. So I think this was just a simple case of VMID
> exhaustion by creating more than 255 domUs at the same time.
>
> In which case this is completely unrelated to xenstore holding on to grant
> allocations, and the C xenstore most likely behaves correctly.
>

OK, so I will treat this issue as resolved for now. Let us know if you
discover something new.

Wei.

> - Aaron Cornelius
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
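
The suspected test bug described in the thread (domains left behind once
xs_set_permissions() starts failing, until no VMID is free) can be sketched
as a small self-contained C model. This is not the original test program and
it does not call libxenstore or libxl. Everything prefixed model_ or MODEL_
is hypothetical: model_set_permissions() and model_create_domain() are
stand-ins for xs_set_permissions() and domU creation, MODEL_MAX_PERMS is an
invented limit for when E2BIG starts, and MODEL_MAX_VMIDS simply reuses the
255 figure mentioned in the mail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical limits, chosen only for illustration. */
#define MODEL_MAX_PERMS 1000 /* list entries before the stand-in fails */
#define MODEL_MAX_VMIDS 255  /* concurrent domUs, figure taken from the mail */

struct model {
    int perms;        /* entries in the /tool/test permission list */
    int live_domains; /* domUs that currently exist */
};

/* Stand-in for xs_set_permissions(): fails with E2BIG once the list is full. */
static bool model_set_permissions(struct model *m)
{
    if (m->perms >= MODEL_MAX_PERMS) {
        errno = E2BIG;
        return false;
    }
    m->perms++;
    return true;
}

/* Stand-in for domU creation: fails when no VMID is free. */
static bool model_create_domain(struct model *m)
{
    if (m->live_domains >= MODEL_MAX_VMIDS)
        return false;
    m->live_domains++;
    return true;
}

/* Stand-in for domU destruction: releases one VMID. */
static void model_destroy_domain(struct model *m)
{
    assert(m->live_domains > 0);
    m->live_domains--;
}

/*
 * Run the create / add-permission / destroy loop (steps 2-5 in the mail).
 * Returns the iteration at which domain creation fails, or -1 if it never
 * fails within the given number of iterations.
 */
static int run_loop(int iterations, bool cleanup_on_error)
{
    struct model m = { 0, 0 };

    for (int i = 1; i <= iterations; i++) {
        if (!model_create_domain(&m))
            return i;                     /* out of VMIDs */
        if (!model_set_permissions(&m)) {
            if (cleanup_on_error)
                model_destroy_domain(&m); /* correct error path */
            continue;                     /* without cleanup the domU leaks */
        }
        model_destroy_domain(&m);         /* normal path: step 4 */
    }
    return -1;
}
```

In this model run_loop(2000, false) fails at iteration
MODEL_MAX_PERMS + MODEL_MAX_VMIDS + 1, while run_loop(2000, true) never runs
out of VMIDs. The absolute numbers depend entirely on the invented
MODEL_MAX_PERMS, so they say nothing about where the real E2BIG threshold
lies.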