 
	
Slow Memory Performance
Hi,

I've run into an interesting wee issue and wondered if anyone else had seen this before?

Running on a Debian Bullseye dom0, with a Debian Bullseye guest I am seeing a really large difference between the memory performance on the dom0 vs the domU. I have looked at the documentation online about NUMA and tried pinning the cores but no change I have made seems to have made a huge difference. I have searched high and low but couldn't see any obvious documentation about any default memory throughput thresholds?

The server I am running on is running an AMD EPYC 7313 Processor with DDR4-3200 Memory, but I suspect the hardware specifications don't matter a whole lot as I see the same domU performance when the guest is running on another server with an AMD EPYC 7302 Processor, again the dom0 performance is a lot better.

Running a `sysbench memory run` on the dom0 gets me the following output:

$ sysbench memory run
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Running memory speed test with the following options:
  block size: 1KiB
  total size: 102400MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 66554847 (6654930.25 per second)

64994.97 MiB transferred (6498.96 MiB/sec)

General statistics:
    total time:                 10.0001s
    total number of events:     66554847

Latency (ms):
    min:                        0.00
    avg:                        0.00
    max:                        0.43
    95th percentile:            0.00
    sum:                        4074.10

Threads fairness:
    events (avg/stddev):           66554847.0000/0.00
    execution time (avg/stddev):   4.0741/0.00

Running this same command on a domU gets me the following:

$ sysbench memory run
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Running memory speed test with the following options:
  block size: 1KiB
  total size: 102400MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 6135308 (613485.30 per second)

5991.51 MiB transferred (599.11 MiB/sec)

General statistics:
    total time:                 10.0001s
    total number of events:     6135308

Latency (ms):
    min:                        0.00
    avg:                        0.00
    max:                        0.27
    95th percentile:            0.00
    sum:                        3469.04

Threads fairness:
    events (avg/stddev):           6135308.0000/0.00
    execution time (avg/stddev):   3.4690/0.00

That is quite a difference, so I wondered if I am missing anything obvious here?
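To put a number on the gap, the two throughput lines can be compared directly. The snippet below is only a small sketch: the helper names (`mib_per_sec`, `slowdown`) are made up for illustration and are not part of sysbench, and the two input strings are copied from the sysbench output above.

```python
import re

# Throughput lines copied verbatim from the two sysbench runs above.
DOM0_LINE = "64994.97 MiB transferred (6498.96 MiB/sec)"
DOMU_LINE = "5991.51 MiB transferred (599.11 MiB/sec)"

# Matches the "(<number> MiB/sec)" part of a sysbench throughput line.
_RATE_RE = re.compile(r"\(([\d.]+)\s*MiB/sec\)")


def mib_per_sec(line: str) -> float:
    """Extract the MiB/sec figure from a sysbench throughput line.

    Hypothetical helper for illustration only.
    """
    match = _RATE_RE.search(line)
    if match is None:
        raise ValueError(f"no MiB/sec figure found in: {line!r}")
    return float(match.group(1))


def slowdown(dom0_line: str, domu_line: str) -> float:
    """Return how many times faster the dom0 run was than the domU run."""
    return mib_per_sec(dom0_line) / mib_per_sec(domu_line)


if __name__ == "__main__":
    ratio = slowdown(DOM0_LINE, DOMU_LINE)
    print(f"dom0 throughput is {ratio:.2f}x the domU throughput")
```

On these figures the dom0 comes out at a little under 11x the domU's write throughput, and the operations-per-second figures give the same ratio since every operation writes one 1KiB block.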
An xl info for those interested:

host                   : dom0
release                : 5.10.0-9-amd64
version                : #1 SMP Debian 5.10.70-1 (2021-09-30)
machine                : x86_64
nr_cpus                : 64
max_cpu_id             : 255
nr_nodes               : 2
cores_per_socket       : 16
threads_per_core       : 2
cpu_mhz                : 3000.046
hw_caps                : 178bf3ff:76da320b:2e500800:244037ff:0000000f:219c97a9:0040068c:00000500
virt_caps              : pv hvm hvm_directio pv_directio hap shadow
total_memory           : 1048433
free_memory            : 1026384
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 14
xen_extra              : .3
xen_version            : 4.14.3
xen_caps               : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit2
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          :
xen_commandline        : placeholder dom0_mem=6144M,max:6144M dom0_max_vcpus=6 dom0_vcpus_pin ucode=scan
cc_compiler            : x86_64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110
cc_compile_by          : pkg-xen-devel
cc_compile_domain      : lists.alioth.debian.org
cc_compile_date        : Mon Sep 13 14:28:21 UTC 2021
build_id               : 1a67c53a8813b422d2033de494f8b444915791f2
xend_config_format     : 4

Any assistance would be greatly appreciated. I have hardware available to run any tests and am eager to resolve this problem if anyone has any pointers or has seen something similar before?

Thanks,
Connor