Xen project Mailing List

Re: [PATCH v6 8/8] xen: allow up to 16383 cpus

From: Stefano Stabellini <sstabellini@xxxxxxxxxx>

Date: Tue, 7 May 2024 17:31:00 -0700 (PDT)

Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Jürgen Groß <jgross@xxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Wed, 08 May 2024 00:31:36 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, 7 May 2024, Julien Grall wrote:
> Hi Stefano,
>
> On 03/05/2024 20:07, Stefano Stabellini wrote:
> > On Fri, 3 May 2024, Julien Grall wrote:
> > [...]
> >
> > > So are you saying that from Xen point of view, you are expecting no
> > > difference between 256 and 512. And therefore you would be happy if to
> > > backport patches if someone find differences (or even security issues)
> > > when using > 256 pCPUs?
> >
> > It is difficult to be sure about anything that it is not regularly
> > tested. I am pretty sure someone in the community got Xen running on an
> > Ampere, so like you said 192 is a good number. However, that is not
> > regularly tested, so we don't have any regression checks in gitlab-ci or
> > OSSTest for it.
> >
> > One approach would be to only support things regularly tested either by
> > OSSTest, Gitlab-ci, or also Xen community members. I am not sure what
> > would be the highest number with this way of thinking but likely no
> > more than 192, probably less. I don't know the CPU core count of the
> > biggest ARM machine in OSSTest.
>
> This would be rochester* (Cavium Thunder-X). They have 96 pCPUs which, IIRC,
> are split across two numa nodes.
>
> > Another approach is to support a "sensible" number: not something tested
> > but something we believe it should work. No regular testing. (In safety,
> > they only believe in things that are actually tested, so this would not
> > be OK. But this is security, not safety, just FYI.) With this approach,
> > we could round up the number to a limit we think it won't break. If 192
> > works, 256/512 should work? I don't know but couldn't think of something
> > that would break going from 192 to 256.
>
> It depends what you mean by work/break. Strictly speaking, Xen should run
> (i.e. not crash). However, it is unclear how well as if you increase the
> number of physical CPUs, you will increase contention and may find some
> bottleneck.
>
> I haven't done any performance testing with that many CPUs and I haven't seen
> any so far with Xen. But I have some areas of concerns.
>
> * Xenstored: At least the C version is single-threaded. Technically the limit
> here is not based on the number of pCPUs, but as you increase it, you
> indirectly increase the number of domains that can run. I doubt it will behave
> well if you have 4096 domains running (I am thinking about the x86 limit...).
>
> * Locking
>     * How Xen use the locks: I don't think we have many places where we have
> global locks (one is the memory subsystem). If a lock is already taken, the
> others will spin. It is unclear if we could high contending.
>     * How Xen implements the locks: At the moment, we are using LL/SC. My take
> of XSA-295 is there is a lack of fairness with them. I am not sure what would
> happen if they get contended (as we support more pCPUs). It is also probably
> time to finally implement LSE atomics.
>
> * TLB flush: The TLB flush are broadcasted. There are some suggestions on the
> Linux ML [1] that they don't perform well on some processors. The discussion
> seems to have gone nowhere in Linux. But I think it is probably worth to take
> into account when we decide to update the limit we (security) support.
>
> > It depends on how strict we want to be on testing requirements.
>
> From above, I am rather worry about claiming that Xen can supports up to 256
> (and TBH even 192) without any proper testing. This could end up to backfire
> as we may need to do (in a rush) and backport some rather large work (unless
> we decide to remove support after the fact).

I agree with everything you said, and I would also add that it is not just
about backports: if we "support" something, it is supposed to mean that we
strongly believe it is working. I think we should only make that claim if
we regularly test that configuration/feature.

> I think I would prefer if we have a low number until someone can do some
> testing (including potentially malicious guest). If we want for a
> power-of-two, I would go with 128 because this is closer to the HW we have in
> testing. If in the future someone can show some data on other platforms (e.g.
> Ampere), then we can up the limit.

I am OK with that. I wonder if we could use QEMU to add a test for this.

> > I am not
> > sure what approach was taken by x86 so far.
>
> It is unclear to me. I don't see how we can claim to support up to 4096 CPUs.
> But that's for the x86 folks to decide.

Until not long ago, many things were "supported" in many Open Source
projects (including Linux, QEMU, etc.) without any automated tests at
all. Maybe it is time to revisit this practice.
