
RE: Performance Problems, probably network related



Hello!

 

Answering myself to share my findings.

 

1.       The Xen wiki performance tips are very old; some are still true, some are simply wrong.

2.       The bottleneck was the CPU, caused either by too few vCPUs in dom0, or by not distributing the interrupt workload across multiple CPUs.

 

Some wiki pages mention that in the domU the network handling always happens on vCPU 0. This is not true: the current vif interface is multi-queue, so the interrupt of each queue should be pinned to a dedicated vCPU. This can be done manually, or by using irqbalance.
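As a hypothetical sketch of the manual variant (run as root inside the domU): the netfront queue IRQs typically appear in /proc/interrupts with names like "eth0-q0-rx"/"eth0-q0-tx"; the interface name and the round-robin assignment below are examples, not taken from the original post.

```shell
# Sketch: pin each "eth0-q*" queue IRQ to its own vCPU, round-robin.
# Adjust the "eth0-q" pattern to the interface names on your system.
cpu=0
ncpus=$(nproc)
for irq in $(awk -F: '/eth0-q/ {gsub(/ /, "", $1); print $1}' /proc/interrupts); do
    echo "$cpu" > "/proc/irq/$irq/smp_affinity_list"   # bind this queue to one vCPU
    cpu=$(( (cpu + 1) % ncpus ))                       # wrap around if queues > vCPUs
done
```

Installing irqbalance instead achieves a similar spread automatically, without hand-maintained pinning.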

 

The vifs in dom0 and the physical NICs are also multi-queue, and their interrupts need to be distributed over vCPUs too -> manual pinning or irqbalance.
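For the dom0 side, a rough sketch of how to inspect this (the interface names are examples; ethtool's -l option lists the NIC's channel/queue configuration, and each vif queue shows up as its own IRQ line):

```shell
# Sketch: list the queue IRQs of the vif backends / physical NIC in dom0,
# and show the NIC's channel (queue) counts if ethtool is available.
grep -E 'vif|eth0' /proc/interrupts || echo "no matching queue IRQs found"
command -v ethtool >/dev/null && ethtool -l eth0 || true
```

With `ethtool -L eth0 combined N` the queue count can also be raised, so the interrupt load can spread over more vCPUs.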

 

Also give dom0 enough vCPUs to handle the interrupts.
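On an Ubuntu dom0 this is a boot-time setting; a minimal sketch (the value and file path are examples, followed by update-grub and a reboot):

```
# /etc/default/grub.d/xen.cfg (excerpt) -- dom0 vCPU count is set on the
# Xen hypervisor command line, not in a guest config.
GRUB_CMDLINE_XEN_DEFAULT="dom0_max_vcpus=16"
```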

 

I often found claims that PV is old and slow and that the newer PVH is much faster. I cannot confirm that: in my tests, PV was as fast as PVH.

 

So what I did:

-          stayed with PV

-          increased dom0 vCPUs from 4 to 16

-          installed irqbalance in domU

-          (irqbalance was already installed in dom0)
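Whether these steps actually took effect can be checked from the per-CPU columns of /proc/interrupts; a small sketch (the "vif|eth" pattern is an example):

```shell
# Sketch: the per-CPU counter columns show whether the queue IRQs are
# really spread over the vCPUs instead of all hitting CPU0.
grep -E 'vif|eth' /proc/interrupts || echo "no matching queue IRQs on this host"
```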

 

With these changes, the performance of the name server in the domU increased from 170,000 pps to 850,000 pps.

 

regards

Klaus

 

 

 

From: Klaus Darilion
Sent: Monday, 19 September 2022 23:07
To: 'xen-users@xxxxxxxxxxxxxxxxxxxx' <xen-users@xxxxxxxxxxxxxxxxxxxx>
Subject: Performance Problems, probably network related

 

Hello!

 

Hardware: two servers, more or less identical

 

Server 1: Ubuntu 20.04 (Xen 4.11, kernel 5.4, Linux bridge)

AMD EPYC 7702P 64-Core Processor

BCM57416 10G NIC

dom0 has 4 vCPUs

 

Server 2: VMware ESXi 7.0

AMD EPYC 7543P 32-Core Processor

BCM57414 NetXtreme-E 10Gb/25Gb

 

 

VM: Ubuntu 20.04, 8 vCPUs, running the Knot DNS name server. I am running benchmark tests against this VM on either Xen or VMware.
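The post does not name the load generator; as one possible sketch, a tool such as dnsperf can produce this kind of query load (the server address, query file, and client/thread counts below are placeholders, not values from the original test):

```shell
# Illustrative dnsperf invocation: 64 simulated clients, 8 threads,
# 30-second run against a placeholder server address.
dnsperf -s 192.0.2.10 -d queries.txt -c 64 -T 8 -l 30
```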

 

In both cases no tuning (no CPU pinning …).

 

The Xen VM: 170,000 qps

The ESX VM: 575,000 qps

 

So the Xen VM is much slower than the VMware VM. I thought this was because the Xen VM uses "good old" PV, so I repeated the test with type=pvh, but the results were the same. I did some more tests:
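For reference, the PVH test only needs a one-line change in the guest's xl configuration; a sketch (the file path is an example):

```
# /etc/xen/domu.cfg (excerpt) -- the virtualization mode is selected
# with the "type" option in the xl domain configuration.
type = "pvh"    # was: type = "pv"
```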

When I test with a name server workload that is CPU-intensive, VMware is only a bit faster. But when the workload is more network-heavy (pps), VMware is much faster.

 

I have read https://wiki.xenproject.org/wiki/Network_Throughput_and_Performance_Guide but there are so many things and I do not know which of them are still relevant, or where to start.

 

Is there some general advice on where to start debugging and tuning (i.e. are there known network bottlenecks)? Or is Xen known to be slower than VMware in network throughput (in which case I could just stop tuning)?

 

Thanks

Klaus

 

 

 

 

 

--

Klaus Darilion, Head of Operations

nic.at GmbH, Jakob-Haringer-Straße 8/V

5020 Salzburg, Austria

 


 

