Unexpected high dom0 load for bridges, especially with VLAN tag
I've been working on cutting down the number of "little" boxes here and rebuilt a perimeter firewall and interior router/firewall on Xen 4.1.1 running Debian Buster dom0 on an Intel i3-7100T (c. 2017, 2 cores, 4 threads, 3.4 GHz) with a dual-port Intel PCI NIC (believed genuine) in addition to the onboard NICs.

TL;DR I'm seeing ~140-150% dom0 load in xentop when passing ~250 Mbit/s of packets between the two domUs on a dedicated, two-port Open vSwitch bridge. This seems excessive for what should be "just a wire" between the two (other traffic for them is on PCI pass-through of the Intel NICs).

Taking the function of these domUs out of the picture, bringing up two "fresh" Debian Buster domUs and running iperf3 still shows seemingly high load, especially if VLAN tags are involved. This is seen with both Open vSwitch and Linux bridges:

  Without VLAN tag
    at 300 Mbit/s     ~18% dom0 load
    at 1000 Mbit/s    ~40% dom0 load

  With VLAN tagging/detagging from the domU interfaces
    at 300 Mbit/s     ~40% dom0 load
    at 1000 Mbit/s    ~115% dom0 load

As there are only two ports on the bridge and two MAC addresses involved, this seems high. No bridge filtering is configured. It is especially surprising that using a single, consistent VLAN tag "on the wire" doubles or triples the load.

This is reasonably consistent across Open vSwitch and Linux bridges, whether the Linux bridge is set up through Debian's /etc/network/interfaces config, created with `brctl`, or created with `ip link add ... type bridge`.

Is this kind of load expected?

Is there any configuration of either style of bridge that might significantly improve this?

(At least for now, I need to stick with tagging the VLAN as I'm trying to unravel why running without the tag causes some throughput problems with the domUs involved.)
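As a rough check on the "doubles or triples" remark, the ratios implied by the iperf3 figures above can be worked out in a few lines of Python. This is only arithmetic on the approximate percentages quoted in this message; the dictionary layout and the `tag_overhead` name are illustrative and do not come from xentop or iperf3.

```python
# Approximate dom0 load (percent, as read from xentop) quoted above,
# keyed by iperf3 throughput in Mbit/s. These are the rough figures
# from this message, not fresh measurements.
untagged = {300: 18, 1000: 40}
tagged = {300: 40, 1000: 115}


def tag_overhead(rate_mbit):
    """Return tagged dom0 load divided by untagged dom0 load at one rate."""
    return tagged[rate_mbit] / untagged[rate_mbit]


for rate in sorted(untagged):
    print(f"{rate} Mbit/s: tagged/untagged = {tag_overhead(rate):.2f}x")
```

That works out to roughly 2.2x at 300 Mbit/s and roughly 2.9x at 1000 Mbit/s, so the relative cost of tagging appears to grow with throughput rather than stay fixed.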
More detail:

  xen 4.1.1
  ovs 2.10.1
  Linux xen-i3 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2 (2020-04-29) x86_64 GNU/Linux

The interior router uses one of the Intel card's NICs via PCI pass-through to connect to the Cisco SG300 switch for "inside" access to VLAN trunks. It is running FreeBSD 12.1 in HVM mode.

The perimeter firewall uses the other of the Intel card's NICs to connect through the Cisco to the DOCSIS modem. The Comcast line is good for ~250 Mbps down and ~10 Mbps up. It is running Debian Buster in PV mode, booted through grub-x86_64-xen.custom.bin (to recognize the ZFS file system on which it runs).

The two are connected through a dedicated, two-port Open vSwitch bridge, with the same VLAN tag they were running with when the two functions each had their own physical hardware.

When running a bandwidth test from a local host to a remote server, the outbound packet path, as I understand it, is:

  - Interior host sources packet
  - Interior host sends via Cisco SG300
  - Cisco SG300 forwards to Intel NIC "0" on PCI pass-through to wildside (interior)
  - Wildside processes, routes over VIF pair, tagged
  - Received on other end of pair by dom0
  - Packet bridged by dom0
  - Packet goes out another VIF pair to front (perimeter)
  - Front receives packet at other end of VIF pair
  - Front routes packet out Intel NIC "1" on PCI pass-through
  - Cisco SG300 forwards packet to the modem

Examining htop on dom0 under load shows truncated names that appear to be queues, three or four associated with each of the two VIFs involved.

No special configuration of kernel governor, CPU affinity, or the like has been done on dom0 or any of the domUs.

I've run them both tagged, and was working to cut over to untagged on both, but have run into a dribble of throughput when I do. As that involves a non-Linux domU, I'll work through that in another thread. The current xl config has front untagged and wildside still tagged.
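For readers unfamiliar with the vif-openvswitch naming: as far as I understand the script, the bridge= value has the form BRIDGE[.VLAN][:TRUNK[:TRUNK...]], where a ".VLAN" suffix makes the port an access port (untagged toward the domU) and ":TRUNK" entries make it a trunk port (tagged toward the domU). The Python sketch below only illustrates that reading. `parse_bridge_spec` is a hypothetical helper, not part of Xen or Open vSwitch, and the numeric VLAN IDs are made-up stand-ins for the placeholders in the xl config.

```python
import re

# Assumed form of the bridge= value: BRIDGE[.VLAN][:TRUNK[:TRUNK...]]
# This mirrors my reading of the vif-openvswitch script; treat it as a sketch.
_SPEC = re.compile(r"^([^.:]+)(?:\.(\d+))?(?::(\d+(?::\d+)*))?$")


def parse_bridge_spec(spec):
    """Split a bridge= value into bridge name, access tag, and trunk list."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"not a valid bridge spec: {spec!r}")
    bridge, tag, trunks = match.groups()
    return {
        "bridge": bridge,
        # Access VLAN: frames are untagged on the domU side.
        "tag": int(tag) if tag else None,
        # Trunked VLANs: frames stay tagged on the domU side.
        "trunks": [int(t) for t in trunks.split(":")] if trunks else [],
    }


# Hypothetical VLAN IDs standing in for <link VLAN>, <mgmt VLAN>, <other VLAN>.
print(parse_bridge_spec("ovsbr1.100"))    # access port, like front's xn1
print(parse_bridge_spec("ovsbr1:100"))    # trunk port, like wildside's xn0
print(parse_bridge_spec("ovsbr0:10:20"))  # two-VLAN trunk, like front's xn0
```

Read this way, the "." form on front and the ":" form on wildside correspond to the untagged/tagged split described above.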
Front (perimeter router)

  vif = [
    'script=vif-openvswitch,type=vif,vifname=front-zfs_xn0,bridge=ovsbr0:<mgmt VLAN>:<other VLAN>',
    'script=vif-openvswitch,type=vif,vifname=front-zfs_xn1,bridge=ovsbr1.<link VLAN>',
  ]
  pci = [ '01:00.0', ]

Wildside (interior router)

  vif = [
    'script=vif-openvswitch,type=vif,vifname=wildside_xn0,bridge=ovsbr1:<link VLAN>',
  ]
  pci = [ '01:00.1', ]

The VIF names seem to be within the typical 15-character limit.

This was previously running on an AMD GX-412TC (4 core, 1 GHz) and a Celeron J1900 (4 core, 2 GHz).

The i3-7100T and the NICs on its Intel card have been used to benchmark networking at up to GigE symmetric rates.

I've tried to "direct wire" the two domUs by specifying a backend for the VIF in the xl config. Though I am surprised that the VIF pair and a two-port bridge are apparently so CPU hungry, even at low speeds, such a connection would seem to simplify things by removing one VIF pair and the bridge entirely. Even if that were possible, it still leaves me with concerns around using VLAN trunking and its apparent impact on CPU load. This all came about as suricata was the next service I was going to try to move to the Xen box.

Thanks!

Jeff Kletsky