
[Xen-devel] [BUG] xen-4.5: xenstored crash when stopping a self-restarting, crashing, domU


  • To: Xen Developers List <xen-devel@xxxxxxxxxxxxx>
  • From: "Greg A. Woods" <woods@xxxxxxxxx>
  • Date: Sat, 23 May 2015 13:03:44 -0700
  • Delivery-date: Tue, 26 May 2015 02:07:45 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>

I made the mistake of configuring a domU with "on_crash=restart" before
testing its kernel to see if it would even boot in the first place.

During the storm of domain create/crash/restart cycles I finally managed
to type the active domain-ID in an "xl destroy" command, only to have
xenstored crash as follows.

FYI, I don't expect to be changing the xenstored binary in the near
future, and I'll keep the core file around in case anyone would like me
to try to fetch any other information from it.

BTW, it would be really cool if the Xen kernel could do something other
than restart a domain that gets into such a create/crash/restart cycle,
say after 10 attempts or some such reasonable limit.
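
To make the idea concrete, here is a minimal sketch of what such a limit could look like. It is purely illustrative: MAX_RESTARTS, struct restart_state, may_restart() and mark_healthy() are hypothetical names, not existing Xen or xl interfaces, and the limit of 10 is simply the figure suggested above.

```c
#include <stdbool.h>

/* Hypothetical limit, taken from the "10 attempts" suggested above. */
#define MAX_RESTARTS 10

/* Hypothetical per-domain bookkeeping; not an existing Xen or xl structure. */
struct restart_state {
	unsigned int consecutive_crashes;
};

/* Called when the domain crashes; returns true if it may be restarted. */
static bool may_restart(struct restart_state *st)
{
	if (st->consecutive_crashes >= MAX_RESTARTS)
		return false;	/* give up: leave the domain destroyed */
	st->consecutive_crashes++;
	return true;
}

/* Called once a domain has run long enough to be considered healthy. */
static void mark_healthy(struct restart_state *st)
{
	st->consecutive_crashes = 0;
}
```

Resetting the counter once a domain has stayed up for a while means that only back-to-back crashes count against the limit.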

FYI, I'm running Xen-4.5 with a recent-ish NetBSD-current (7.99.5,
2015/02/20) dom0.  "xl info" below.

It would also be really cool if xenstored were someday soon able to
checkpoint all of its contents periodically, so that it could restart,
reload, and continue on as before after a crash such as this!


# gdb /usr/pkg/sbin/xenstored /xenstored.core-20150321
GNU gdb (GDB) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64--netbsd".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/pkg/sbin/xenstored...done.
[New process 1]

warning: Can't read pathname for load map: Unknown error: 4294967295.
Core was generated by `xenstored'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000403d94 in is_child (child=child@entry=0x40e02f "@releaseDomain",
    parent=0x7f7ff7b162c0 <error: Cannot access memory at address 0x7f7ff7b162c0>) at xenstored_core.c:393
393             unsigned int len = strlen(parent);
(gdb) bt
#0  0x0000000000403d94 in is_child (child=child@entry=0x40e02f "@releaseDomain",
    parent=0x7f7ff7b162c0 <error: Cannot access memory at address 0x7f7ff7b162c0>) at xenstored_core.c:393
#1  0x0000000000405880 in fire_watches (conn=conn@entry=0x0, name=name@entry=0x40e02f "@releaseDomain",
    recurse=recurse@entry=false) at xenstored_watch.c:101
#2  0x0000000000405e6b in destroy_domain (_domain=_domain@entry=0x7f7ff7b1f710) at xenstored_domain.c:207
#3  0x0000000000407bc4 in talloc_free (ptr=ptr@entry=0x7f7ff7b1f710) at talloc.c:574
#4  0x0000000000407c0b in talloc_free_children (ptr=0x7f7ff7b12dd0) at talloc.c:525
#5  talloc_free (ptr=0x7f7ff7b12dd0) at talloc.c:583
#6  0x0000000000404c3b in consider_message (conn=0x7f7ff7b39cb0) at xenstored_core.c:1312
#7  handle_input (conn=conn@entry=0x7f7ff7b39cb0) at xenstored_core.c:1356
#8  0x0000000000405635 in main (argc=<optimized out>, argv=<optimized out>) at xenstored_core.c:2127
(gdb) info locals
len = <optimized out>
(gdb) print parent
$1 = 0x7f7ff7b162c0 <error: Cannot access memory at address 0x7f7ff7b162c0>
(gdb) list
388     }
389
390     /* Is child a subnode of parent, or equal? */
391     bool is_child(const char *child, const char *parent)
392     {
393             unsigned int len = strlen(parent);
394
395             /* / should really be "" for this algorithm to work, but that's a
396              * usability nightmare. */
397             if (streq(parent, "/"))
(gdb) up
#1  0x0000000000405880 in fire_watches (conn=conn@entry=0x0, name=name@entry=0x40e02f "@releaseDomain",
    recurse=recurse@entry=false) at xenstored_watch.c:101
101                             if (is_child(name, watch->node))
(gdb) print *watch
$2 = {list = {next = 0x7f7ff7b3f6c0, prev = 0x7f7ff7b3f630}, events = {next = 0x7f7ff7b3f0a0, prev = 0x7f7ff7b3f0a0},
  relative_path = 0x7f7ff7b06020 "/local/domain/12",
  token = 0x7f7ff7b16330 <error: Cannot access memory at address 0x7f7ff7b16330>,
  node = 0x7f7ff7b162c0 <error: Cannot access memory at address 0x7f7ff7b162c0>}
(gdb) info locals
i = 0x7f7ff7b39930
watch = 0x7f7ff7b3f090
(gdb) list
96                      return;
97
98              /* Create an event for each watch. */
99              list_for_each_entry(i, &connections, list) {
100                     list_for_each_entry(watch, &i->watches, list) {
101                             if (is_child(name, watch->node))
102                                     add_event(i, watch, name);
103                             else if (recurse && is_child(watch->node, name))
104                                     add_event(i, watch, watch->node);
105                     }
(gdb) 
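
For anyone reading the trace without the source at hand: the listing above shows only the first few lines of is_child(). The following is a minimal sketch of a path-prefix test of that kind, not the actual xenstored code. Everything after the strlen() call and the "/" check is an assumption, and the name is_child_sketch is made up for illustration. It shows why the fault lands on line 393: strlen() is the first dereference of parent, so a watch->node pointer that no longer refers to valid memory faults right there.

```c
#include <stdbool.h>
#include <string.h>

/* Is child a subnode of parent, or equal?  Illustrative sketch only;
 * the body beyond the first few lines is assumed, not copied. */
static bool is_child_sketch(const char *child, const char *parent)
{
	/* First read through parent: an invalid pointer faults here. */
	size_t len = strlen(parent);

	/* Treat "/" as the parent of everything. */
	if (strcmp(parent, "/") == 0)
		return true;

	/* child must begin with parent ... */
	if (strncmp(child, parent, len) != 0)
		return false;

	/* ... and either end there or continue with a path separator. */
	return child[len] == '/' || child[len] == '\0';
}
```

With valid strings, a special name such as "@releaseDomain" matches only a watch registered on exactly that name (or on "/").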


# xl info
host                   : xenful
release                : 7.99.5
version                : NetBSD 7.99.5 (XEN3_DOM0) #0: Fri Feb 20 18:12:09 PST 2015 woods@more:/build/woods/more/current-amd64-amd64-obj/once/rest/work/woods/m-NetBSD-current/sys/arch/amd64/compile/XEN3_DOM0
machine                : amd64
nr_cpus                : 8
max_cpu_id             : 7
nr_nodes               : 1
cores_per_socket       : 4
threads_per_core       : 1
cpu_mhz                : 2826
hw_caps                : bfebfbff:20100800:00000000:00000900:000ce3bd:00000000:00000001:00000000
virt_caps              : hvm
total_memory           : 32762
free_memory            : 20358
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 5
xen_extra              : .0
xen_version            : 4.5.0
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : 
xen_commandline        : dom0_mem=2G console=com1 dom0_max_vcpus=1 dom0_vcpus_pin
cc_compiler            : gcc (nb2 20150115) 4.8.4
cc_compile_by          : root
cc_compile_domain      : .local
cc_compile_date        : Fri Feb 27 22:33:15 PST 2015
xend_config_format     : 4

-- 
                                                Greg A. Woods

+1 250 762-7675                                RoboHack <woods@xxxxxxxxxxx>
Planix, Inc. <woods@xxxxxxxxxx>      Secrets of the Weird <woods@xxxxxxxxx>
