[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Trying add smt=off disabled cores to cpupool crash xen


  • To: Jan Beulich <jbeulich@xxxxxxxx>, René Winther Højgaard <renewin@xxxxxxxxx>
  • From: Juergen Gross <jgross@xxxxxxxx>
  • Date: Mon, 4 Dec 2023 13:07:45 +0100
  • Authentication-results: smtp-out1.suse.de; none
  • Autocrypt: addr=jgross@xxxxxxxx; keydata= xsBNBFOMcBYBCACgGjqjoGvbEouQZw/ToiBg9W98AlM2QHV+iNHsEs7kxWhKMjrioyspZKOB ycWxw3ie3j9uvg9EOB3aN4xiTv4qbnGiTr3oJhkB1gsb6ToJQZ8uxGq2kaV2KL9650I1SJve dYm8Of8Zd621lSmoKOwlNClALZNew72NjJLEzTalU1OdT7/i1TXkH09XSSI8mEQ/ouNcMvIJ NwQpd369y9bfIhWUiVXEK7MlRgUG6MvIj6Y3Am/BBLUVbDa4+gmzDC9ezlZkTZG2t14zWPvx XP3FAp2pkW0xqG7/377qptDmrk42GlSKN4z76ELnLxussxc7I2hx18NUcbP8+uty4bMxABEB AAHNH0p1ZXJnZW4gR3Jvc3MgPGpncm9zc0BzdXNlLmNvbT7CwHkEEwECACMFAlOMcK8CGwMH CwkIBwMCAQYVCAIJCgsEFgIDAQIeAQIXgAAKCRCw3p3WKL8TL8eZB/9G0juS/kDY9LhEXseh mE9U+iA1VsLhgDqVbsOtZ/S14LRFHczNd/Lqkn7souCSoyWsBs3/wO+OjPvxf7m+Ef+sMtr0 G5lCWEWa9wa0IXx5HRPW/ScL+e4AVUbL7rurYMfwCzco+7TfjhMEOkC+va5gzi1KrErgNRHH kg3PhlnRY0Udyqx++UYkAsN4TQuEhNN32MvN0Np3WlBJOgKcuXpIElmMM5f1BBzJSKBkW0Jc Wy3h2Wy912vHKpPV/Xv7ZwVJ27v7KcuZcErtptDevAljxJtE7aJG6WiBzm+v9EswyWxwMCIO RoVBYuiocc51872tRGywc03xaQydB+9R7BHPzsBNBFOMcBYBCADLMfoA44MwGOB9YT1V4KCy vAfd7E0BTfaAurbG+Olacciz3yd09QOmejFZC6AnoykydyvTFLAWYcSCdISMr88COmmCbJzn sHAogjexXiif6ANUUlHpjxlHCCcELmZUzomNDnEOTxZFeWMTFF9Rf2k2F0Tl4E5kmsNGgtSa aMO0rNZoOEiD/7UfPP3dfh8JCQ1VtUUsQtT1sxos8Eb/HmriJhnaTZ7Hp3jtgTVkV0ybpgFg w6WMaRkrBh17mV0z2ajjmabB7SJxcouSkR0hcpNl4oM74d2/VqoW4BxxxOD1FcNCObCELfIS auZx+XT6s+CE7Qi/c44ibBMR7hyjdzWbABEBAAHCwF8EGAECAAkFAlOMcBYCGwwACgkQsN6d 1ii/Ey9D+Af/WFr3q+bg/8v5tCknCtn92d5lyYTBNt7xgWzDZX8G6/pngzKyWfedArllp0Pn fgIXtMNV+3t8Li1Tg843EXkP7+2+CQ98MB8XvvPLYAfW8nNDV85TyVgWlldNcgdv7nn1Sq8g HwB2BHdIAkYce3hEoDQXt/mKlgEGsLpzJcnLKimtPXQQy9TxUaLBe9PInPd+Ohix0XOlY+Uk QFEx50Ki3rSDl2Zt2tnkNYKUCvTJq7jvOlaPd6d/W0tZqpyy7KVay+K4aMobDsodB3dvEAs6 ScCnh03dDAFgIq5nsB11j3KPKdVoPlfucX2c7kGNH+LUMbzqV6beIENfNexkOfxHfw==
  • Cc: George Dunlap <george.dunlap@xxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Mon, 04 Dec 2023 12:07:58 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.12.23 11:13, Jan Beulich wrote:
On 04.12.2023 11:02, Juergen Gross wrote:
On 04.12.23 10:15, Jan Beulich wrote:
On 01.12.2023 21:12, Andrew Cooper wrote:
On 01/12/2023 7:59 pm, René Winther Højgaard wrote:
If I set smt=off and try to configure cpupools with credit(1) as if
all cores are available, I get the following crash.

The crash happens when I try to use xl cpupool-add-cpu on the disabled
HT sibling cores.

Hyper-threading is enabled in the firmware, and only disabled with
smt=off.

CC'ing some maintainers.

I expect this will also explode when a CPU is runtime offlined with
`xen-hptool cpu-offline` and then added to a cpupool.

Interestingly, the crash is mov (%rdx,%rax,1),%r13, and I think that's
the percpu posion value in %rdx.

I expect cpupools want to reject parked/offline CPUs.

While the only explicit check there is

          if ( cpu >= nr_cpu_ids )
              goto addcpu_out;

I would have expected this

          if ( !cpumask_subset(cpus, &cpupool_free_cpus) ||
               cpumask_intersects(cpus, &cpupool_locked_cpus) )
              goto addcpu_out;

to deal with the situation, as parked/offline CPUs shouldn't be "free".
Jürgen?

The problem is the call of sched_get_opt_cpumask() to need the percpu area
of the cpu in question.

That was my first thought, too, but then I saw cpupool_assign_cpu_locked() on
the call trace, which is called only afterwards. Plus sched_get_opt_cpumask()
needs the per-CPU area only when granularity was switched from its default of
SCHED_GRAN_cpu afaics.

Oh right you are.

My patch is needed for larger granularities, though.

I've tried to hit the same problem as René, but everything works as intended (no
crash, but adding an offline cpu is being rejected).

René, could you please tell us what exactly you've been doing? This would be:

- Xen command line parameters
- Output of "xl info"
- Output of "xl cpupool-list" before starting to manipulate cpupools
- Output of "xl cpupool-list -c" before starting to manipulate cpupools
- Cpupool config file used to create new cpupool
- xl commands you've used to setup the cpupool and adding the cpu(s) to it

Thanks,


Juergen

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature.asc
Description: OpenPGP digital signature


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.