
Re: Trying add smt=off disabled cores to cpupool crash xen



Hi Juergen,

Sorry for the late reply.

Here are the commands I execute; it is 'xl cpupool-cpu-add pcores 4-15' that
crashes the system.

xl cpupool-cpu-remove Pool-0 4-31
xl cpupool-create name=\"ecores\" sched=\"credit\"
xl cpupool-cpu-add ecores 16-31

xl cpupool-create name=\"pcores\" sched=\"credit\"
xl cpupool-cpu-add pcores 4-15


Here is the other information you asked for.

xl cpupool-list:
Name               CPUs   Sched     Active   Domain count
Pool-0              24    credit       y          5

xl cpupool-list -c:
Name               CPU list
Pool-0             0,2,4,6,8,10,12,14,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
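
To see which CPUs in the requested range are not in this list, the comparison can be sketched as below. This is only an illustration: the helper name expand_cpu_list is made up for this example, and it assumes that any CPU absent from the Pool-0 list is one of the HT siblings parked by smt=off.

```python
def expand_cpu_list(spec):
    """Expand a CPU list such as "0,2,4-6" into a set of integers.

    Hypothetical helper for this example; not part of xl or libxl.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


# CPU list reported for Pool-0 by "xl cpupool-list -c" above.
online = expand_cpu_list(
    "0,2,4,6,8,10,12,14,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31"
)

# Range passed to "xl cpupool-cpu-add pcores 4-15".
requested = expand_cpu_list("4-15")

# CPUs in the requested range that are not in the Pool-0 list
# (assumed here to be the siblings parked by smt=off).
missing = sorted(requested - online)
print(missing)
```

Under that assumption, the range 4-15 includes the odd-numbered CPUs 5 through 15, which are not in the list, while every CPU in 16-31 is.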

xl info:
host                   : dom0
release                : 6.1.62-1.qubes.fc37.x86_64
version                : #1 SMP PREEMPT_DYNAMIC Tue Nov 14 06:16:38 GMT 2023
machine                : x86_64
nr_cpus                : 24
max_cpu_id             : 31
nr_nodes               : 1
cores_per_socket       : 24
threads_per_core       : 1
cpu_mhz                : 2995.196
hw_caps                : bfebfbff:77faf3ff:2c100800:00000121:0000000f:239c27eb:1840078c:00000100
virt_caps              : pv hvm hvm_directio pv_directio hap iommu_hap_pt_share vmtrace gnttab-v1
total_memory           : 65373
free_memory            : 56505
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 17
xen_extra              : .2
xen_version            : 4.17.2
xen_caps               : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : 
xen_commandline        : placeholder dom0_mem=min:2048M dom0_mem=max:4096M ucode=scan gnttab_max_frames=2048 gnttab_max_maptrack_frames=4096 smt=off dom0_max_vcpus=4 dom0_vcpus_pin sched-gran=core sched=credit no-real-mode edd=off
cc_compiler            : gcc (GCC) 12.3.1 20230508 (Red Hat 12.3.1-1)
cc_compile_by          : mockbuild
cc_compile_domain      : [unknown]
cc_compile_date        : Tue Nov 14 00:00:00 UTC 2023
build_id               : a426597e4f24a9487bed72dafc63d4eb523be22b
xend_config_format     : 4


I'm not sure which file is the cpupool config file. This is the content of
/etc/xen/cpupool:
# the name of the new cpupool
name = "Example-Cpupool"

# the scheduler to use: valid are e.g. credit, credit2 and rtds
sched = "credit"

# list of cpus to use
cpus = ["2", "3"]

/rene


On Monday, December 4th, 2023 at 13:07, Juergen Gross <jgross@xxxxxxxx> wrote:


> On 04.12.23 11:13, Jan Beulich wrote:
> 

> > On 04.12.2023 11:02, Juergen Gross wrote:
> > 

> > > On 04.12.23 10:15, Jan Beulich wrote:
> > > 

> > > > On 01.12.2023 21:12, Andrew Cooper wrote:
> > > > 

> > > > > On 01/12/2023 7:59 pm, René Winther Højgaard wrote:
> > > > > 

> > > > > > If I set smt=off and try to configure cpupools with credit(1) as if
> > > > > > all cores are available, I get the following crash.
> > > > > > 

> > > > > > The crash happens when I try to use xl cpupool-cpu-add on the disabled
> > > > > > HT sibling cores.
> > > > > > 

> > > > > > Hyper-threading is enabled in the firmware, and only disabled with
> > > > > > smt=off.
> > > > > 

> > > > > CC'ing some maintainers.
> > > > > 

> > > > > I expect this will also explode when a CPU is runtime offlined with
> > > > > `xen-hptool cpu-offline` and then added to a cpupool.
> > > > > 

> > > > > Interestingly, the crash is mov (%rdx,%rax,1),%r13, and I think that's
> > > > > the percpu poison value in %rdx.
> > > > > 

> > > > > I expect cpupools want to reject parked/offline CPUs.
> > > > 

> > > > While the only explicit check there is
> > > > 

> > > >     if ( cpu >= nr_cpu_ids )
> > > >         goto addcpu_out;
> > > > 

> > > > I would have expected this
> > > > 

> > > >     if ( !cpumask_subset(cpus, &cpupool_free_cpus) ||
> > > >          cpumask_intersects(cpus, &cpupool_locked_cpus) )
> > > >         goto addcpu_out;
> > > > 

> > > > to deal with the situation, as parked/offline CPUs shouldn't be "free".
> > > > Jürgen?
> > > 

> > > The problem is that the call of sched_get_opt_cpumask() needs the percpu
> > > area of the cpu in question.
> > 

> > That was my first thought, too, but then I saw cpupool_assign_cpu_locked() on
> > the call trace, which is called only afterwards. Plus sched_get_opt_cpumask()
> > needs the per-CPU area only when granularity was switched from its default of
> > SCHED_GRAN_cpu afaics.
> 

> 

> Oh, right you are.
> 

> My patch is needed for larger granularities, though.
> 

> I've tried to hit the same problem as René, but everything works as intended (no
> crash, but adding an offline cpu is being rejected).
> 

> René, could you please tell us what exactly you've been doing? This would be:
> 

> - Xen command line parameters
> - Output of "xl info"
> - Output of "xl cpupool-list" before starting to manipulate cpupools
> - Output of "xl cpupool-list -c" before starting to manipulate cpupools
> - Cpupool config file used to create new cpupool
> - xl commands you've used to set up the cpupool and add the cpu(s) to it
> 

> Thanks,
> 

> 

> Juergen
