
Re: [Xen-users] Dom0 crashed when rebooting whilst DomU are running



On Sep 12, 2012, at 10:13 AM, Ian Campbell wrote:

> On Tue, 2012-09-11 at 23:46 +0100, Maik Brauer wrote:
>>>>>> I found out that it hangs during re-boot of dom0 when having more
>>>>>> Network interfaces involved, like:
>>>>>>    vif = [ 'mac=06:46:AB:CC:11:01, ip=<myIPadress>', '', '',
>>>>>> 'mac=06:04:AB:BB:11:03, bridge=VLAN20, script=vif-bridge', '',
>>>>>> 'mac=06:04:AB:BB:11:05, bridge=VLAN40, script=vif-bridge' ]
>>>>> 
>>>>> 6 interfaces total, 3 of which have a random mac on each reboot and all
>>>>> get put on the default bridge?
>>>> 
>>>> No, not really. The bridge is different for each interface.
>>> 
>>> You have three lots of '' which will all go onto the same bridge AFAICT
>>> (whichever one is determined to be the default)
>> 
>> That is right. As long as I put nothing inside that it should be a
>> different script to execute, it will use default for ''
> 
> The default is "vif-bridge". Have you changed the default?

No, I did not change the default script. Everything is as it was delivered in
the Xen source package.
> 
> If not then your configuration as shown will put three interfaces on the
> *same* bridge. Is this really what you want?

No, it will not put everything on the same bridge, because the default setting
here is routed mode: my provider's network configuration changed the routing.
Therefore in xend-config.sxp we have the following disabled:
#(network-script network-bridge)
#(vif-script     vif-bridge)

and the following enabled:
(network-script network-route)
(vif-script     vif-route)

So basically the '' entries are placeholders for the eth1, eth2 and eth4
devices in the guest domU.
For the other two interfaces it is different. They should be bridged (unlike
the default), so I have to put "script=vif-bridge" in the config as shown
above. Here is the output of brctl show:
bridge name     bridge id               STP enabled     interfaces
VLAN11          8000.000000000000       no              
VLAN12          8000.000000000000       no              
VLAN20          8000.feffffffffff       no              vif2.3
VLAN30          8000.000000000000       no              
VLAN40          8000.feffffffffff       no              vif2.5
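
To illustrate the reasoning above, here is a minimal Python sketch of how each
vif entry ends up with a hotplug script when the global default is vif-route.
This is only a model, not xend's actual code: DEFAULT_VIF_SCRIPT, parse_vif
and resolve_scripts are hypothetical names, and the interface naming (entry N
becomes ethN in the guest) is an assumption taken from the description above.

```python
# Sketch only: models how each vif entry picks a hotplug script.
# DEFAULT_VIF_SCRIPT mirrors the enabled (vif-script ...) line in
# xend-config.sxp; the helper names are hypothetical, not xend APIs.
DEFAULT_VIF_SCRIPT = "vif-route"

VIFS = [
    "mac=06:46:AB:CC:11:01, ip=<myIPadress>",
    "",
    "",
    "mac=06:04:AB:BB:11:03, bridge=VLAN20, script=vif-bridge",
    "",
    "mac=06:04:AB:BB:11:05, bridge=VLAN40, script=vif-bridge",
]


def parse_vif(spec):
    """Split 'key=value, key=value' into a dict; '' gives an empty dict."""
    pairs = (item.split("=", 1) for item in spec.split(",") if item.strip())
    return {key.strip(): value.strip() for key, value in pairs}


def resolve_scripts(vifs, default_script=DEFAULT_VIF_SCRIPT):
    """Return (index, script, bridge) for every vif entry."""
    resolved = []
    for index, spec in enumerate(vifs):
        opts = parse_vif(spec)
        script = opts.get("script", default_script)
        resolved.append((index, script, opts.get("bridge")))
    return resolved


if __name__ == "__main__":
    for index, script, bridge in resolve_scripts(VIFS):
        print(f"eth{index}: script={script} bridge={bridge}")
```

Under these assumptions only entries 3 and 5 are handled by vif-bridge, which
is consistent with vif2.3 and vif2.5 being the only interfaces listed by
brctl show.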

> 
> You claim above that the bridge is different for each interface, but
> unless you have changed something somewhere then this is not the case.
> Since you are having problems it is important to identify everything
> which you have changed from the defaults.

No, I am saying that the bridge name is different, not that the script is
different. I am just creating isolated bridges (VLAN20, VLAN30, VLAN40, and so
on) in order to connect specific network interfaces from different domUs.
> 
>>>> List is empty. SysRQ -w and SysRQ-t shows nothing at all.
>>> 
>>> You might need to increase the log verbosity with SysRQ-9 first?
>> 
>> I did and now I get more information. But due to the amount of data that
>> scrolls past on the console screen I am not able to record it properly.
>> Can you advise what to do here?
> 
> Like I said "that list can be quite long so it is useless
> without a serial console": http://wiki.xen.org/wiki/Xen_Serial_Console

This will be a challenge.
> 
> Depending on your distro you might also find this info in the logs
> under /var/log somewhere.
> 
There is no really useful information there. The logs do not contain the info
we need; I have already checked.
>>> 
>>>> There is nothing running anymore.
>>>> It shows periodically:  INFO: task xenwatch:12 blocked for more than 120 
>>>> seconds
>>> 
>>> What is the very last thing printed before this?
>> 
>> There is nothing before.
> 
> So the output is silent from boot until this message comes up? That
> seems unlikely, since there should be plenty of messages from the
> shutdown process itself if nothing else.

Yes, there are plenty of messages. Here are the last lines:
Stopping NFS Daemon
Stopping portmap daemon.
Deconfiguring network interfaces
Listening on LPF/eth0/00:1c:42:77:7a:29
Sending on LPF/eth0/00:1c:42:77:7a:29
DHCPRELEASE on eth0 
Cleaning up ifdown.
Saving system clock.
Deactivating  swap.
Will now restart.
[  720.213710] INFO: task xenwatch:12 blocked for more than 120 seconds
[  720.213753] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
[  840.212745] INFO: task reboot:3347 blocked for more than 120 seconds
[  840.212785] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message

(the last INFO messages repeat indefinitely)
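
If the console output can be captured to a file (for example over a serial
console), a small Python sketch like the following could summarize which tasks
the hung-task detector reports. It assumes the message format shown above;
blocked_tasks is a hypothetical helper name, and the capture file itself is
not something I have yet.

```python
import re

# Matches lines such as:
# [  720.213710] INFO: task xenwatch:12 blocked for more than 120 seconds
HUNG_TASK = re.compile(
    r"\[\s*(?P<time>[\d.]+)\]\s+INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def blocked_tasks(log_text):
    """Map (task name, pid) to the timestamp of its first hung-task report."""
    tasks = {}
    for match in HUNG_TASK.finditer(log_text):
        key = (match["name"], int(match["pid"]))
        # Keep only the first report; later repeats carry no new task.
        tasks.setdefault(key, match["time"])
    return tasks
```

Feeding it the lines above would report xenwatch (pid 12) first and reboot
(pid 3347) second, which is the order in which they got stuck.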

> 
> What is the last message on the screen before this one? In fact what is
> the entire last screenful of output?
> 
See above.

>>> Really the initscript ought to wait, the default at least with the
>>> script shipped with xen is to do so, by using shutdown --wait. can you
>>> confirm whether or not this is happening for you?
>> 
>> At least I can see that the shutdown --wait is in the scripts. So it seems
>> that the init script is waiting.
>> But independent of that, something must still be in use, which blocks the
>> reboot process.
>>> 
>>> Possibly someone is trying to talk to xenstore after xenstored has
>>> exited -- I expect that would cause the sorts of blocked for 120
>>> messages you are seeing.
>>> 
>> Could be, but we need to find out what is blocking the shutdown. I do not 
>> know what else I can do in order to measure and collect
>> data for investigation.
> 
> Did you add debugging to the hotplug scripts like I suggested a couple
> of mails back?

No, I have not done that yet.
> 
> If you run the xendomains script by hand and then *immediately* after it
> exits run "xl list" have the domains actually gone? You could even stick
> some calls to xl list into the script itself and verify that the domains
> are indeed shutting down as expected.

root@xenserver:/etc/xen/scripts# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   880     1     r-----     82.4
dnssrv01-v6                                  3   128     1     -b----      4.5
root@xenserver:/etc/xen/scripts# /etc/init.d/xendomains stop
Shutting down Xen domains: dnssrv01-v6(save)...
[done].
root@xenserver:/etc/xen/scripts# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   880     1     r-----     89.4
root@xenserver:/etc/xen/scripts# 
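
To make this check repeatable, the "have the domains actually gone?" test
could be scripted roughly as below. This is a sketch under the assumption that
"xm list" keeps the column layout shown above; remaining_guests is a
hypothetical helper name, and the __main__ part simply shells out to the same
"xm list" command used here.

```python
import subprocess


def remaining_guests(xm_list_output):
    """Return the names of all domains except Domain-0 from 'xm list' text."""
    names = []
    # The first line is the column header (Name, ID, Mem, ...), so skip it.
    for line in xm_list_output.strip().splitlines()[1:]:
        fields = line.split()
        if fields and fields[0] != "Domain-0":
            names.append(fields[0])
    return names


if __name__ == "__main__":
    # Run right after "/etc/init.d/xendomains stop" has returned.
    output = subprocess.run(
        ["xm", "list"], capture_output=True, text=True, check=True
    ).stdout
    guests = remaining_guests(output)
    print("still running:", ", ".join(guests) if guests else "none")
```

For the two listings above it would return dnssrv01-v6 before the stop and an
empty list afterwards.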
> 
> BTW Are you using xl or xend?

I am using xend.
> 
> Ian.
> 
> 
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxx
> http://lists.xen.org/xen-users
