[Xen-devel] CPU Lockup bug with the credit2 scheduler


  • To: "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>
  • From: Alastair Browne <alastair.browne@xxxxxxxxxx>
  • Date: Tue, 7 Jan 2020 14:25:57 +0000
  • Accept-language: en-GB, en-US
  • Delivery-date: Tue, 07 Jan 2020 14:27:56 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHVxWZgtKMiGLzDAkyMT4OS9ELgJQ==
  • Thread-topic: CPU Lockup bug with the credit2 scheduler

SYMPTOMS

A Xen host is found to lock up with messages on console along the
following lines:-

NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s!

Later in the system log, reference is often made to a specific
program. The program named is not constant: it is simply whatever
happened to be running at the time of the lockup.

Once the host has locked up, the only solution is a reboot. It hasn't
been possible to further analyse the state of a locked up machine due
to unavailability of the command line.
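
Since a locked-up host cannot be inspected afterwards, it may help to
save the console output (the host above already sends its console to
com1) and scan the saved log for the message shown earlier. What follows
is a minimal sketch, assuming the message format matches the example
above; the function name find_soft_lockups and the file name
console.log are illustrative only.

```shell
# Read a log on stdin and print one line per soft-lockup report,
# giving the CPU number and the reported stall time.
# Assumes the message format "BUG: soft lockup - CPU#<n> stuck for <s>s!".
find_soft_lockups() {
    sed -n 's/.*BUG: soft lockup - CPU#\([0-9][0-9]*\) stuck for \([0-9][0-9]*\)s!.*/CPU \1 stuck \2s/p'
}
```

For example: find_soft_lockups < console.log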

This problem has been seen to occur on a Debian platform with the
following configuration; however, it could equally occur on other
platforms.

The configuration of the host machine is as follows:-

# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 8 (jessie)"
NAME="Debian GNU/Linux"
VERSION_ID="8"
VERSION="8 (jessie)"
ID=debian
HOME_URL="http://www.debian.org/"
SUPPORT_URL="http://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

# uname -srvpio
Linux 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2a~test (2019-12-18)
unknown unknown GNU/Linux

# xl info
host                    : my-host.example.com
release                 : 4.9.0-11-amd64
version                 : #1 SMP Debian 4.9.189-3+deb9u2a~test (2019-12-18)
machine                 : x86_64
nr_cpus                 : 24
max_cpu_id              : 191
nr_nodes                : 2
cores_per_socket        : 12
threads_per_core        : 1
cpu_mhz                 : 1797.920
hw_caps                 : bfebfbff:77fef3ff:2c100800:00000021:00000001:000037ab:00000000:00000100
virt_caps               : pv hvm hvm_directio pv_directio hap shadow iommu_hap_pt_share
total_memory            : 392994
free_memory             : 265294
sharing_freed_memory    : 0
sharing_used_memory     : 0
outstanding_claims      : 0
free_cpus               : 0
xen_major               : 4
xen_minor               : 13
xen_extra               : .0-mem1-ox
xen_version             : 4.13.0-mem1-ox
xen_caps                : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler           : credit2
xen_pagesize            : 4096
platform_params         : virt_start=0xffff800000000000
xen_changeset           : Tue Dec 17 14:19:49 2019 +0000 git:a2e84d8e42
xen_commandline         : placeholder dom0_mem=4096M,max:16384M com1=115200,8n1 console=com1 ucode=scan smt=0 sched=credit2 crashkernel=512M@32M
cc_compiler             : gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
cc_compile_by           : support
cc_compile_domain       : example.com
cc_compile_date         : Wed Dec 18 11:13:45 GMT 2019
build_id                : 672783467e7a60c4f8a1aa715d549cb59f00c7cf
xend_config_format      : 4

RE-CREATION


To recreate the symptoms, build a Xen host according to the above
parameters, then create at least ten Linux virtual machines
within it. The Xen host should use LVM to provision the VMs with their
storage. Each VM should have one single disk device, partitioned in
the conventional manner.
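
As an illustration of the storage layout, the sketch below prints (but
does not run) one lvcreate command per VM. The volume group name
virtservervg and the <machine>_root_fs naming are taken from the
snapshot script later in this message; the 20G default size and the
function name print_lvcreate_cmds are assumptions made only for this
example.

```shell
# Print one lvcreate command per VM name listed in the given file
# (one name per line). Nothing is executed; review the output first.
# The default size is an assumption, not a value from the test setup.
print_lvcreate_cmds() {
    local list=$1 size=${2:-20G} machine
    while read -r machine; do
        [ -n "$machine" ] || continue   # skip blank lines
        echo "lvcreate --size ${size} --name ${machine}_root_fs virtservervg"
    done < "$list"
}
```

For example: print_lvcreate_cmds machines.txt 20G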

The virtual machines and the Xen host must then be loaded up as
follows:-

VIRTUAL MACHINES

Construct a program to allocate, fill and free
memory. An example of such a program is given below:-

mem-grab.C
/*
  This program will allocate and fill memory. Its purpose is to
  simulate memory use on a machine. Once it has grabbed the memory, it
  sleeps for 10 seconds, then frees it.
  If run with no arguments, the program will find out the maximum
  memory available on the machine and then will attempt to grab 75% of
  it. If run with an integer argument, this program will attempt to
  allocate that amount of memory.
  If an error occurs with the allocation, then an exception will be
  thrown and caught. An error message will then be printed on stderr.
*/
  
#include <stdio.h>
#include <stdlib.h>
#include <exception>
#include <iostream>
#include <thread>
#include <chrono>
#include "mem-size.h"

#define MEM_PERCENT 0.75
using namespace std;
int main(int argc, char** argv)
{
  int *ptr;
  unsigned long long i,n;
  unsigned long long MemAvailable = 0;
  unsigned long long MemAlloc = 0;
  if (argc == 1)
    {
      // Find out the maximum memory available
      MemAvailable = get_system_memory ();
      cout << "Memory available = " << MemAvailable << endl;
      MemAlloc = MemAvailable * MEM_PERCENT;
      cout << "Memory to be allocated: " << MemAlloc << endl;
      // Divide the value by the size of an int because that's what we
      // will be filling the memory with.
      n = MemAlloc / sizeof (int);
    }
  else
    {
      n = strtoull (argv[1], NULL, 0);
      n = n / sizeof (int);
    }
  cout << "Allocating " << n * sizeof (int) << " bytes..." << endl;
  try
    {
      ptr = new int [n];
    }
  catch (const exception& e)
    {
      cerr << "Failed to allocate memory: " << e.what() << endl;
      return 1;
    }
  printf("Filling int into memory.....\n");
  for (i = 0; i < n; i++)
    {
      ptr[i] = 1;
    }
  printf("Sleep 10 seconds......\n");
  this_thread::sleep_for (chrono::seconds (10));
  printf("Free memory.\n");
  // Memory obtained with new[] must be released with delete[], not free().
  delete[] ptr;
  return 0;
}


mem-size.C

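#include "mem-size.h"
unsigned long long get_system_memory ()
{
  // sysconf() returns long; widen before multiplying so the product
  // cannot overflow on builds where long is 32 bits.
  unsigned long long pages = sysconf(_SC_PHYS_PAGES);
  unsigned long long page_size = sysconf(_SC_PAGE_SIZE);
  return pages * page_size;
}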

mem-size.h

#ifndef MEMSIZE_H
#define MEMSIZE_H
#include <unistd.h>
extern unsigned long long get_system_memory ();
#endif
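
The three files above can be built with any C++11 compiler. The
following is a minimal sketch, assuming g++ and that all three files
sit in one directory; the helper name build_mem_grab is illustrative,
and -I simply points the compiler at the directory holding mem-size.h.

```shell
# Build mem-grab from mem-grab.C and mem-size.C in the given directory
# (default: current directory). Set CXX to use a different compiler.
build_mem_grab() {
    local src_dir=${1:-.}
    "${CXX:-g++}" -std=c++11 -Wall -I"$src_dir" \
        -o "$src_dir/mem-grab" "$src_dir/mem-grab.C" "$src_dir/mem-size.C"
}
```

For example, "build_mem_grab ." followed by "./mem-grab" runs the
program with its default of 75% of physical memory.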


This program should be compiled as 'mem-grab' and will be controlled
by the following shell script...

#!/bin/bash

# This script should be run with one argument... The filename of a
# file containing a list of the virtual machine names, one per
# line. Each machine name needs to correspond with the machine
# name as in the 'lvcreate' line below

MachineList=$1
while true
do
    for machine in $(cat ${MachineList})
    do
        date
        lvcreate --size 10G --snapshot \
                 --name test_${machine}_snap \
                 /dev/virtservervg/${machine}_root_fs
    done
    date
    echo "Snapshots Created"
    sleep 2
    # Transfer the snapshot over the network using 'dd', 'gzip' and
    # 'ssh'. This requires a passwordless ssh login on the machine
    # specified by xxx.xxx.xxx.xxx
    for machine in $(cat ${MachineList})
    do
        date
        (dd if=/dev/virtservervg/test_${machine}_snap \
            bs=2048 | gzip -1c | \
             ssh root@xxxxxxxxxxxxxxx \
                 "cat > /dev/null"; \
         echo "${machine} dd finished") &
        echo "Kicked off ${machine}"
        sleep 2
    done
    date
    echo "Machine snapshots kicked off"
    wait
    date
    echo "Snapshot transfers finished"
    for machine in $(cat ${MachineList})
    do
        date
        /sbin/lvremove -f /dev/virtservervg/test_${machine}_snap
    done
    date
    sleep 2
done



Set this script running and then wait for the host to lock up. The
time this takes is not predictable, but at some point the host will
experience the CPU soft lockup.

ANALYSIS

Tests were run as described above with several different kernel
versions and with different Xen schedulers. The lockups were found to
occur only when the 'credit2' scheduler was in use.

The problem has been found not to occur with the 'credit' scheduler.
It is therefore concluded that there must be a bug within credit2.

FURTHER NOTES

During the testing, we used 4 hosts running various kernel and Xen
versions.

Kernel Packages

Production: 4.9.0-9-amd64 (4.9.168-1+deb9u3a~test)

Stretch Patched: 4.9.0-11-amd64 (4.9.189-3+deb9u2a)

Buster Unpatched: 4.19.0-0.bpo.6-amd64 (4.19.67-2+deb10u2~bpo9+1)

Buster Patched: 4.19.0-0.bpo.5-amd64

Pre-MDS Patched: 4.9.0-8-amd64 (4.9.110-3+deb9u4a~test)

All of the Xen builds were compiled on a suitable machine, using a proven
shell script to do the build.

Xen Packages

Production: 4.12 (Up to xsa-297) (4.12.1-pre-mem3-ox)

Latest: 4.13 (including 3 fixes for credit2) (4.13.0-mem1-ox)

RESULTS

Please see attached spreadsheet


CONCLUSION

The tests indicate that credit2 might be unstable.

For the time being, we are using credit as the chosen scheduler. We
are booting Xen with the hypervisor command-line parameter
"sched=credit" to ensure that the correct scheduler is used.
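
To confirm which scheduler the hypervisor is actually running after a
reboot, the xen_scheduler field of 'xl info' can be checked. The
following is a minimal sketch; the function name active_scheduler is
illustrative only.

```shell
# Read 'xl info' output on stdin and print the value of the
# xen_scheduler field (for example "credit" or "credit2").
active_scheduler() {
    awk -F' *: *' '$1 == "xen_scheduler" { print $2 }'
}
```

For example: xl info | active_scheduler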

After the tests, we decided to stick with the 4.9.0-9 kernel and Xen
4.12 for production use, running credit as the default scheduler.

Attachment: Results.xlsx
Description: Results.xlsx


 

