
Slow Memory Performance



Hi,

I've run into an interesting wee issue and wondered whether anyone else has seen it before.

With a Debian Bullseye dom0 and a Debian Bullseye guest, I am seeing a very large difference in memory performance between the dom0 and the domU.

I have read the online documentation about NUMA and tried pinning the cores, but none of the changes I have made has had a significant effect. I have also searched high and low for documentation on any default memory throughput thresholds, but found nothing obvious. The server has an AMD EPYC 7313 processor with DDR4-3200 memory, but I suspect the hardware specification doesn't matter much: I see the same domU performance when the guest runs on another server with an AMD EPYC 7302 processor, and there too the dom0 performance is a lot better.

Running a `sysbench memory run` on the dom0 gets me the following output:

$ sysbench memory run
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time


Running memory speed test with the following options:
  block size: 1KiB
  total size: 102400MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 66554847 (6654930.25 per second)

64994.97 MiB transferred (6498.96 MiB/sec)


General statistics:
    total time:                          10.0001s
    total number of events:              66554847

Latency (ms):
         min:                                    0.00
         avg:                                    0.00
         max:                                    0.43
         95th percentile:                        0.00
         sum:                                 4074.10

Threads fairness:
    events (avg/stddev):           66554847.0000/0.00
    execution time (avg/stddev):   4.0741/0.00


Running this same command on a domU gets me the following:

$ sysbench memory run
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time


Running memory speed test with the following options:
  block size: 1KiB
  total size: 102400MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 6135308 (613485.30 per second)

5991.51 MiB transferred (599.11 MiB/sec)


General statistics:
    total time:                          10.0001s
    total number of events:              6135308

Latency (ms):
         min:                                    0.00
         avg:                                    0.00
         max:                                    0.27
         95th percentile:                        0.00
         sum:                                 3469.04

Threads fairness:
    events (avg/stddev):           6135308.0000/0.00
    execution time (avg/stddev):   3.4690/0.00

That is roughly a tenfold difference in write throughput (6498.96 MiB/sec on the dom0 versus 599.11 MiB/sec on the domU), so I wonder whether I am missing anything obvious here.
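
To put a number on the gap, here is a minimal Python sketch that pulls the MiB/sec figure out of each summary line above and computes the ratio. The helper name `throughput_mib_per_sec` and the regular expression are illustrative assumptions, not part of sysbench; the two input strings are copied from the runs above.

```python
import re


def throughput_mib_per_sec(sysbench_output: str) -> float:
    """Return the '(N MiB/sec)' figure from sysbench memory output.

    Hypothetical helper: it assumes the summary line has the form
    '<total> MiB transferred (<rate> MiB/sec)', as in the runs above.
    """
    match = re.search(r"\(([\d.]+) MiB/sec\)", sysbench_output)
    if match is None:
        raise ValueError("no MiB/sec figure found in output")
    return float(match.group(1))


# Summary lines copied from the dom0 and domU runs in this message.
dom0_line = "64994.97 MiB transferred (6498.96 MiB/sec)"
domu_line = "5991.51 MiB transferred (599.11 MiB/sec)"

dom0 = throughput_mib_per_sec(dom0_line)
domu = throughput_mib_per_sec(domu_line)
ratio = dom0 / domu

print(f"dom0: {dom0:.2f} MiB/sec, domU: {domu:.2f} MiB/sec")
print(f"dom0 is about {ratio:.1f}x faster than domU")
```

The same function can be fed the full captured output of a run rather than a single line, which makes it easy to compare results after each pinning or NUMA change.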

The output of `xl info` on the dom0, for those interested:

host                   : dom0
release                : 5.10.0-9-amd64
version                : #1 SMP Debian 5.10.70-1 (2021-09-30)
machine                : x86_64
nr_cpus                : 64
max_cpu_id             : 255
nr_nodes               : 2
cores_per_socket       : 16
threads_per_core       : 2
cpu_mhz                : 3000.046
hw_caps                : 178bf3ff:76da320b:2e500800:244037ff:0000000f:219c97a9:0040068c:00000500
virt_caps              : pv hvm hvm_directio pv_directio hap shadow
total_memory           : 1048433
free_memory            : 1026384
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 14
xen_extra              : .3
xen_version            : 4.14.3
xen_caps               : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit2
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          :
xen_commandline        : placeholder dom0_mem=6144M,max:6144M dom0_max_vcpus=6 dom0_vcpus_pin ucode=scan
cc_compiler            : x86_64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110
cc_compile_by          : pkg-xen-devel
cc_compile_domain      : lists.alioth.debian.org
cc_compile_date        : Mon Sep 13 14:28:21 UTC 2021
build_id               : 1a67c53a8813b422d2033de494f8b444915791f2
xend_config_format     : 4

Any assistance would be greatly appreciated. I have hardware available to run any tests and am eager to resolve this problem, so please let me know if you have any pointers or have seen something similar before.

Thanks,

Connor

 

