Re: Xen 4.18/ARM64 on Raspberry Pi 4B: VLAN traffic crashing Dom0
Thanks for your time, zithro.

On 08.09.2023 at 17:08, zithro wrote:
> First, I need to mention I've never used bridges+VLANs this way, so I may miss the obvious !
> I -think- it's a network problem, not a Xen one, but what do I know 😄

In the meantime I also suspect that this is a general (Debian, perhaps even Arm64-specific?) network problem. But I am not sure it can be ruled out yet that Xen plays a role.

> I've often read that bridges on dom0 should have some additional params.
> They would be in the iface config, around "bridge_ports", like :
>     bridge_stp off       # dont use STP (spanning tree proto)
>     bridge_waitport 0    # dont wait for port to be available
>     bridge_fd 0          # no forward delay

Tried that, no change.

> You may also try to enable STP (iirc it's disabled by default on Linux bridges).
> But TBH, I'm not sure those params will help in this case.

Tried that, no change.

> I've also read the VLAN 1 is a bit "special", better avoid it.
> IIUC, untagged traffic would be auto tagged 1. Use ids 2/3, or 10/11, 10/20, etc.

I changed the VLAN numbers, first to 101, 102, 103 etc. This was when I noticed a new strange thing: VLANs with numbers >99 simply don't work on my Raspberry Pi under Debian. VLAN 99 works; VLAN 100 (or anything else >99 that I tried) doesn't. If I choose a number >99, the VLAN is not configured and "ip a" doesn't list it. Other Debian systems on x64 architecture don't show this behavior; there, it was no problem to set up VLANs >99. So this is another data point suggesting that something is fishy about the network on my Raspberry Pi system.

Therefore, I changed the VLANs to 10, 20, 30 etc., which worked. But it didn't solve the initial problem of the crashing Dom0 and DomUs.

> Other stuff to test :
> - check MAC addresses

What should I check specifically? (However, if there are duplicate MAC addresses, which I assume is what you are getting at, why would it work when using the same VLAN bridge?)

> - use tcpdump/wireshark remote logging on the real NIC (enabcm6e4ei0) *and* the bridges, to see what really happens, maybe a network/broadcast storm, filling dom0 cpu/memory ?

Now, here it becomes really strange. I started tcpdumps on Dom0, and depending on which interface/bridge traffic was logged, the problem went away, meaning the DomU ran smoothly for hours, even when accessing the zabbix web interface! Stopping the log makes the system crash reproducibly as soon as I access the zabbix web interface.

Logging enabcm6e4ei0 (NIC): no crashes
Logging enabcm6e4ei0.10 (VLAN 10): instant crash
Logging enabcm6e4ei0.20 (VLAN 20): no crashes
Logging xenbr0 (on VLAN 10): instant crash
Logging xenbr1 (on VLAN 20): no crashes

I can't think of a rational explanation why logging the traffic on certain interfaces/bridges should avoid the crash of the complete system, while logging other interfaces/bridges doesn't. Any ideas?

I checked the dumps of enabcm6e4ei0.10 and xenbr0 (where the system crashes) with wireshark; nothing sticks out to me (but I am really no expert in analyzing network traffic). I could send the dumps directly to you, if you want to spend the time.

> - set "loglvl=all" to Xen cmdline to maybe get more info

Done, need to check results. (Serial interface is not connected right now.)

> - how are the interfaces configured in the domUs and in the cfg files ?

/etc/network/interfaces on the DomU on which zabbix is running:

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto enX0
iface enX0 inet static
    address xx.xx.xx.xx/24
    gateway xx.xx.xx.xx

iface enX0 inet6 static
    address xxxx::xxxx:xxxx:xxxx:xxxx/64
    gateway xxxx::xxxx:xxxx:xxxx:xxxx
# use SLAAC to get global IPv6 address from the router
# we may not enable ipv6 forwarding, otherwise SLAAC gets disabled
    autoconf 1
    accept_ra 2

vif line in the xl.cfg of the same DomU:

vif = [ 'mac=02:93:0B:61:A5:82,bridge=xenbr1,ip=xx.xx.xx.xx' ]

> - test w/o IPv6

Tried that, no difference.

> You could also show us the outputs of "ip a", "ip link show type bridge" (brctl show), etc.

root@xxx:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: enabcm6e4ei0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether d8:3a:dd:28:39:4f brd ff:ff:ff:ff:ff:ff
    inet6 fe80::da3a:ddff:fe28:394f/64 scope link
       valid_lft forever preferred_lft forever
3: enabcm6e4ei0.10@enabcm6e4ei0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master xenbr0 state UP group default qlen 1000
    link/ether d8:3a:dd:28:39:4f brd ff:ff:ff:ff:ff:ff
4: enabcm6e4ei0.20@enabcm6e4ei0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master xenbr1 state UP group default qlen 1000
    link/ether d8:3a:dd:28:39:4f brd ff:ff:ff:ff:ff:ff
    inet6 xxxx::xxxx:xxxx:xxxx:xxxx/64 scope global dynamic mngtmpaddr
       valid_lft 86134sec preferred_lft 14134sec
    inet6 xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx/64 scope global dynamic mngtmpaddr
       valid_lft 86134sec preferred_lft 14134sec
    inet6 fe80::da3a:ddff:fe28:394f/64 scope link
       valid_lft forever preferred_lft forever
5: xenbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 02:28:b7:1f:ee:6d brd ff:ff:ff:ff:ff:ff
    inet xx.xx.xx.xx/24 brd xx.xx.xx.255 scope global xenbr0
       valid_lft forever preferred_lft forever
    inet6 xxxx::xxxx:xxxx:xxxx:xxxx/64 scope global dynamic mngtmpaddr
       valid_lft 86135sec preferred_lft 14135sec
    inet6 xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx/64 scope global dynamic mngtmpaddr
       valid_lft 86135sec preferred_lft 14135sec
    inet6 xxxx::xxxx:xxxx:xxxx:xxxx/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::28:b7ff:fe1f:ee6d/64 scope link
       valid_lft forever preferred_lft forever
6: xenbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether c6:11:98:cb:32:bd brd ff:ff:ff:ff:ff:ff
    inet6 xxxx::xxxx:xxxx:xxxx:xxxx/64 scope global dynamic mngtmpaddr
       valid_lft 86280sec preferred_lft 14280sec
    inet6 xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx/64 scope global dynamic mngtmpaddr
       valid_lft 86280sec preferred_lft 14280sec
    inet6 fe80::c411:98ff:fecb:32bd/64 scope link
       valid_lft forever preferred_lft forever
7: vif1.0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master xenbr0 state UP group default qlen 32
    link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fcff:ffff:feff:ffff/64 scope link
       valid_lft forever preferred_lft forever
8: vif2.0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master xenbr1 state UP group default qlen 32
    link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fcff:ffff:feff:ffff/64 scope link
       valid_lft forever preferred_lft forever

root@xxx:~# ip link show type bridge
5: xenbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 02:28:b7:1f:ee:6d brd ff:ff:ff:ff:ff:ff
6: xenbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether c6:11:98:cb:32:bd brd ff:ff:ff:ff:ff:ff

root@xxx:~# brctl show
bridge name     bridge id               STP enabled     interfaces
xenbr0          8000.0228b71fee6d       no              enabcm6e4ei0.10
                                                        vif1.0
xenbr1          8000.c61198cb32bd       no              enabcm6e4ei0.20
                                                        vif2.0

> PS: I guess it's only in the mail, and should be harmless, but you have two /eni stanzas "VLAN LAN" and "VLAN DMZ_LAN" that should be comments.
>

Sorry, copy/paste error, fixed. No difference.
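
A minimal sketch of how the per-interface capture test described earlier could be repeated one interface at a time. Only the interface names come from the "ip a" output; the helper name capture_cmd and the /tmp output path are illustrative assumptions, not the commands that were actually run. The script only prints the tcpdump invocations, so nothing is captured until a line is run by hand as root.

```shell
#!/bin/sh
# Interface names taken from the "ip a" output; everything else is assumed.
IFACES="enabcm6e4ei0 enabcm6e4ei0.10 enabcm6e4ei0.20 xenbr0 xenbr1"

# capture_cmd is a hypothetical helper: it only prints the tcpdump
# invocation for one interface (-i selects the interface, -w writes
# the raw packets to a pcap file; the /tmp path is an assumption).
capture_cmd() {
    printf 'tcpdump -i %s -w /tmp/%s.pcap\n' "$1" "$1"
}

# Print one command per interface. Run them one at a time and note
# whether the crash still reproduces while each capture is active.
for iface in $IFACES; do
    capture_cmd "$iface"
done
```

Running only one capture at a time keeps the result table unambiguous, since each interface then maps to exactly one "crash" or "no crash" observation.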