
Re: [Xen-devel] [PATCH v6 11/13] xen: support the Null scheduler



Hi Stefano,

On 02/07/18 23:08, Stefano Stabellini wrote:
On Mon, 2 Jul 2018, Julien Grall wrote:
Hi,

On 02/07/2018 19:24, Stefano Stabellini wrote:
On Mon, 2 Jul 2018, Julien Grall wrote:
Hi Stefano,

On 06/29/2018 07:38 PM, Stefano Stabellini wrote:
On Thu, 28 Jun 2018, Roger Pau Monné wrote:
On Thu, Jun 28, 2018 at 09:27:08AM +0200, Dario Faggioli wrote:
On Thu, 2018-06-14 at 13:20 -0700, Stefano Stabellini wrote:
On Thu, 14 Jun 2018, Andrew Cooper wrote:
On 14/06/18 14:40, Jan Beulich wrote:
I don't think it's reasonable to alter the support status with this
issue outstanding.

I completely missed this report, probably because I haven't paid
attention to PV-shim. Do you have any more information about this? The
report is a bit vague. If I can't repro it, I can't fix it.

Couldn't it be that this is normal, because after a while you ran out
of pcpus?

Dario, do you have any opinion on this?

The issue that I know of is that the null scheduler does not properly
support CPU hotplug/hotunplug.

This is an issue on, let's say, baremetal, if you use null and try to
do CPU hotplug/hotunplug. When trying to use null as the scheduler of
the shim, we run into that same issue, even if not specifically doing
CPU hotplug/hotunplug (because the shim uses the same path for CPU
bringup, IIRC).

The shim uses CPU hotplug/unplug when the guest brings up/down a
vCPU using the VCPUOP_{up/down} hypercall.

The best description of the issue I could find is:

https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg01085.html

OK, thanks for the explanation. We don't support CPU hotplug on ARM, so
we could mark the NULL scheduler as supported on the ARM architecture
today? Once you implement CPU hotplug support in NULL, we could mark it
as supported on x86 too.

Well, Mirela paved the way to support CPU hotplug (should be merged soon).
She is looking at suspend/resume, which is IMHO an extension of the hotplug
case. So are you sure this could never happen on Arm?

I thought that suspend/resume didn't actually require the same kind of
scheduler support that CPU hotplug needs. If suspend/resume ends up
not working with scheduler NULL, then that is a problem.

The suspend/resume code will offline the CPUs one by one using cpu_down. This
is the same path as hotplug. So you will end up with more vCPUs than online
pCPUs, although the domain will be frozen. How is this going to fit in the
NULL scheduler?
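
To make the concern concrete, here is a toy model of a scheduler that pins
each vCPU to its own pCPU. It is an illustration only, not Xen's sched_null
code: ToyNullScheduler, simulate_suspend and the "parked" list are
hypothetical names, and the parking behaviour is an assumption of the sketch.
The only thing taken from the discussion is that suspend offlines the CPUs
one by one through the same path as hotplug.

```python
# Toy model only. This is NOT Xen's sched_null implementation; every name
# below is hypothetical and exists purely for illustration.


class ToyNullScheduler:
    """Pins at most one vCPU to each online pCPU."""

    def __init__(self, pcpus):
        # online pCPU -> pinned vCPU (None means the pCPU is idle)
        self.assigned = {p: None for p in pcpus}
        # vCPUs that currently have no pCPU to run on (assumed behaviour)
        self.parked = []

    def add_vcpu(self, vcpu):
        """Pin vcpu to the first idle pCPU, or park it if none is idle."""
        for pcpu, owner in self.assigned.items():
            if owner is None:
                self.assigned[pcpu] = vcpu
                return pcpu
        self.parked.append(vcpu)
        return None

    def cpu_down(self, pcpu):
        """Offline a pCPU; its vCPU, if any, loses its only home."""
        vcpu = self.assigned.pop(pcpu)
        if vcpu is not None:
            self.parked.append(vcpu)

    def cpu_up(self, pcpu):
        """Online a pCPU and hand it the longest-parked vCPU, if any."""
        self.assigned[pcpu] = self.parked.pop(0) if self.parked else None


def simulate_suspend(sched, boot_cpu=0):
    """Offline every pCPU except boot_cpu, one by one, as suspend would."""
    for pcpu in sorted(sched.assigned, reverse=True):
        if pcpu != boot_cpu:
            sched.cpu_down(pcpu)
```

With four pCPUs and four vCPUs, simulate_suspend leaves one vCPU pinned and
three parked, which is the "more vCPUs than online pCPUs" state described
above. A scheduler with no equivalent of the parked list has nowhere to put
those vCPUs.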

[...]

Virtually every platform supports CPU hotplug. It is not just about "physically
pluggable CPUs" but about any CPU that can be taken offline at any time.

CPU hotplug in Xen clearly doesn't work as I expected: I assumed that
CPU hotplug would make a CPU "present" or "absent", while cpu_up/down
would make the CPU "online" and "offline". This is how things used to
work in the Linux kernel at least: a CPU can be turned down but still be
present on the socket. To do that, CPU hotplug is not involved. CPU
hotplug would get involved when the user yanks the physical CPU out of
the socket.

Are you sure? Linux uses the CPU hotplug subsystem to online/offline CPUs. It is even used to bring up secondary CPUs during boot. This is not very different from how Xen behaves.
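
For reference, the "present" versus "online" distinction can be observed from
Linux userspace through sysfs. The sketch below assumes a Linux host that
exposes /sys/devices/system/cpu/present and /sys/devices/system/cpu/online;
parse_cpu_list and read_cpu_set are hypothetical helper names, not part of
any existing tool.

```python
# Minimal sketch, assuming the Linux sysfs CPU list format ("0-3,5").
# The helper names are made up for this example.


def parse_cpu_list(text):
    """Turn a sysfs CPU list such as "0-3,5" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means no CPUs in this set
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_cpu_set(name, base="/sys/devices/system/cpu"):
    """Read one of the CPU set files, e.g. "present" or "online"."""
    with open(base + "/" + name) as f:
        return parse_cpu_list(f.read())
```

On such a host, read_cpu_set("present") - read_cpu_set("online") gives the
CPUs that are present but currently offline.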


From what you describe, it is not the case in Xen, and it really looks
like we need support for CPU hotplug in NULL even to support the
most basic CPU offlining/onlining functionality.

I think we at least want to have the bug reported by Andrew & Roger fixed. I am not entirely sure whether there are other bugs in the scheduler.

Cheers,

--
Julien Grall
