
Re: [Xen-devel] [RFC PATCH 00/31] CPUFreq on ARM



Hi,

thanks very much for your work on this!

On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
> 
> Hi, all.
> 
> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
> Motivation of hypervisor based CPUFreq is to enable one of the main PM 
> use-cases in virtualized system powered by Xen hypervisor. Rationale behind 
> this activity is that CPU virtualization is done by hypervisor and the guest 
> OS doesn't actually know anything about physical CPUs because it is running 
> on virtual CPUs. It is quite clear that a decision about frequency change 
> should be taken by hypervisor as only it has information about actual CPU 
> load.

Can you please sketch your usage scenario or workloads here? I can think
of quite different scenarios (oversubscribed server vs. partitioning
RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
in the design are quite different between those.

In general I doubt that a hypervisor scheduling vCPUs is in a good
position to make a decision on the proper frequency physical CPUs should
run with. From all I know it's already hard for an OS kernel to make
that call. So I would actually expect that guests provide some input,
for instance by signalling OPP change requests up to the hypervisor,
which could then decide to act on them - or not.
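
To make the "act on it - or not" part concrete, here is a minimal
sketch of how a hypervisor could arbitrate between frequency hints
coming from the vCPUs that share one physical CPU. Everything in it is
an assumption for illustration: the struct, the function name and the
policy (highest request wins, clamped to the platform limits) are
hypothetical and not part of Xen or of the posted series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vCPU hint; 0 kHz means "this guest made no request". */
struct vcpu_opp_hint {
    uint32_t requested_khz;
};

/*
 * Choose a frequency for one pCPU from the hints of the vCPUs using it.
 * Illustrative policy only: honour the highest request, clamped to
 * [min_khz, max_khz]. Without any request the current frequency is kept,
 * i.e. the hypervisor decides not to act.
 */
uint32_t arbitrate_opp(const struct vcpu_opp_hint *hints, size_t n,
                       uint32_t cur_khz, uint32_t min_khz, uint32_t max_khz)
{
    uint32_t best = 0;
    size_t i;

    assert(min_khz <= max_khz);
    assert(hints != NULL || n == 0);

    for (i = 0; i < n; i++)
        if (hints[i].requested_khz > best)
            best = hints[i].requested_khz;

    if (best == 0)
        return cur_khz;      /* nobody asked: leave the pCPU alone */
    if (best < min_khz)
        return min_khz;
    if (best > max_khz)
        return max_khz;
    return best;
}
```

A real policy would also have to weigh pinning, scheduler load and
thermal limits; the point is only that the guest provides input and the
final decision stays with the hypervisor.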

> Although these required components (CPUFreq core, governors, etc.) already 
> exist in Xen, it is worth mentioning that they are ACPI specific. So, a part 
> of the current patch series makes them more generic in order to make CPUFreq 
> usable on architectures without ACPI support.

Have you looked at how this is used on x86 these days? Can you briefly
describe how this works and how it's used there?

> But, the main question we have to answer is about frequency changing 
> interface in virtualized system. The frequency changing interface and all 
> dependent components which needed CPUFreq to be functional on ARM are not 
> present in Xen these days. The list of required components is quite big and 
> may change across different ARM SoC vendors. As an example, the following 
> components are involved in DVFS on Renesas Salvator-X board which has R-Car 
> Gen3 SoC installed: generic clock, regulator and thermal frameworks, Vendor’s 
> CPG, PMIC, AVS, THS drivers, i2c support, etc.
> 
> We were considering a few possible approaches of hypervisor based CPUFreqs on 
> ARM and came to conclusion to base this solution on popular at the moment, 
> already upstreamed to Linux, ARM System Control and Power Interface(SCPI) 
> protocol [1]. We chose the SCPI protocol instead of the newer ARM System 
> Control and Management Interface (SCMI) protocol [2] since it is widely used 
> in Linux, there are good examples of how to use it, its range of capabilities 
> is enough for implementing hypervisor based CPUFreq and, what is more, 
> upstream Linux support for SCMI is missing so far. SCMI could be used as well.
> 
> Briefly speaking, the SCPI protocol is used between the System Control 
> Processor(SCP) and the Application Processors(AP). The mailbox feature 
> provides a mechanism for inter-processor communication between SCP and AP. 
> The main purpose of SCP is to offload different PM related tasks from AP and 
> one of the services that SCP provides is Dynamic voltage and frequency 
> scaling (DVFS), it is what we actually need for CPUFreq. I will describe this 
> approach in details down the text.
> 
> Let me explain a bit more what these possible approaches are:
> 
> 1. “Xen+hwdom” solution.
> GlobalLogic team proposed split model [3], where “hwdom-cpufreq” frontend 
> driver in Xen interacts with the “xen-cpufreq” backend driver in Linux hwdom 
> (possibly dom0) in order to scale physical CPUs. This solution hasn’t been 
> accepted by Xen community yet and seems it is not going to be accepted 
> without taking into the account still unanswered major questions and proving 
> that “all-in-Xen” solution, which Xen community considered as more 
> architecturally cleaner option, would be unworkable in practice.
> The other reasons why we decided not to stick to this approach are complex 
> communication interface between Xen and hwdom: event channel, hypercalls, 
> syscalls, passing CPU info via DT, etc and possible synchronization issues 
> with a proposed solution.
> Although it is worth to mention that the beauty of this approach was that 
> there wouldn’t be a need to port a lot of things to Xen. All frequency 
> changing interface and all dependent components which needed CPUFreq to be 
> functional were already in place.

Stefano, Julien and I were thinking about this: Wouldn't it be possible
to come up with some hardware domain, solely dealing with CPUFreq
changes? This could run a Linux kernel, but no or very little userland.
All its vCPUs would be pinned to pCPUs and would normally not be
scheduled by Xen. If Xen wants to change the frequency, it schedules the
respective vCPU to the right pCPU and passes down the frequency change
request. Sounds a bit involved, though, and probably doesn't solve the
problem where this domain needs to share access to hardware with Dom0
(clocks come to mind).

> Although this approach is not used, still I picked a few already acked 
> patches which made ACPI specific CPUFreq stuff more generic.
> 
> 2. “all-in-Xen” solution.
> This implies that all CPUFreq related stuff should be located in Xen.
> Community considered this solution as more architecturally cleaner option 
> than “Xen+hwdom” one. No layering violation comparing with the previous 
> approach (letting guest OS manage one or more physical CPUs is more of a 
> layering violation).
> This solution looks better, but to be honest, we are not in favor of this 
> solution as well. We expect enormous developing effort to get this support in 
> (the scope of required components looks unreal) and maintain it. So, we 
> decided not to stick to this approach as well.

Yes, I even think it's not feasible to implement this. With a modern
clock implementation there is one driver to control *all* clocks of an
SoC, so you can't single out the CPU clock easily, for instance. One
would probably run into synchronisation issues, at best.

> 3. “Xen+SCP(ARM TF)” solution.
> It is yet another solution based on ARM SCPI protocol. The generic idea here 
> is that there is a firmware, which being a server runs on some dedicated IP 
> core (server), provides different PM services (DVFS, sensors, etc). On the 
> other side there is a CPUFreq driver in Xen, which is running on the AP 
> (client), consumes these services. CPUFreq driver neither changes the CPU 
> frequency/voltage by itself nor cooperates with Linux in order to do such 
> job. It just communicates with SCP directly using SCPI protocol. As I said 
> before, some integrated into a SoC mailbox IP need to be used for IPC 
> (doorbell for triggering action and shared memory region for commands). 
> CPUFreq driver doesn’t even need to know what should be physically changed 
> for the new frequency to take effect. It is a certainly SCP’s responsibility. 
> This all avoid CPUFreq infrastructure in Xen on ARM from diving into each 
> supported SoC internals and as the result having a lot of code.
> 
> The possible issue here could be in SCP, the problem is that some dedicated 
> IP core may be absent at all or performs other than PM tasks. Fortunately, 
> there is a brilliant solution to teach firmware running in the EL3 exception 
> level (ARM TF) to perform SCP functions and use SMC calls for communications 
> [4]. Exactly this transport implementation I want to bring to Xen the first. 
> Such solution is going to be generic across all ARM platforms that do have 
> firmware running in the EL3 exception level and don’t have candidate for 
> being SCP.

While I feel flattered that you like that idea as well ;-), you should
mention that this requires actual firmware providing those services. I
am not sure there is actually *any* implementation of this at the
moment, apart from my PoC code for Allwinner.
And from a Xen point of view I am not sure we are in the position to
force users to use this firmware. This may be feasible in a classic
embedded scenario, where both firmware and software are provided by the
same entity, but that should be clearly noted as a restriction.

> Here we have completely synchronous case because of SMC calls nature. SMC 
> triggered mailbox driver emulates a mailbox which signals transmitted data 
> via Secure Monitor Call (SMC) instruction [5]. The mailbox receiver is 
> implemented in firmware and synchronously returns data when it returns 
> execution to the non-secure world again. This would allow us both to trigger 
> a request and transfer execution to the firmware code in a safe and 
> architected way. Like PSCI requests.
> As you can see this method is free from synchronization issues. What is more, 
> this solution is more architecturally cleaner solution than split model 
> “Xen+hwdom” one. From the security point of view, I hope, everything will be 
> much more correct since the ARM TF, which we want to see in charge of 
> controlling CPU frequency/voltage, is a trusted SW layer. Moreover, ARM TF is 
> responsible for enabling/disabling CPU (PSCI) and nobody complains about it, 
> so let it do DVFS too.

It should be noted that this synchronous nature of the communication can
actually be a problem: a DVFS request usually involves regulator and PLL
changes, which could take some time to settle in. Blocking all of this
time (milliseconds?) in EL3 (probably busy-waiting) might not be desirable.

> I have to admit that I have checked this solution only due to a lack of 
> candidate for being SCP. But, I hope, that other ARM SoCs where dedicated SCP 
> is present (asynchronous case) will work too, but with some limitations. The 
> mailbox IPs for these ARM SoCs must have TX/RX-done irqs. I have described in 
> the corresponding patches why this limitation is present.
> 
> To be honest I have Renesas R-Car Gen3 SoCs in mind as our nearest target, 
> but I would like to make this solution as generic as possible. I don’t treat 
> proposed solution as world-wide generic, but I hope, this solution may be 
> suitable for other ARM SoCs which meet such requirements. Anyway, having 
> something which works, but doesn’t cover all cases is better than having 
> nothing.
> 
> I would like to note that the patches are in PoC state and I post them just 
> to illustrate in more detail what I am talking about. The patch series 
> consists of the following parts:
> 1. GL’s patches which make ACPI specific CPUFreq stuff more generic. Although 
> these patches have already been acked by the Xen community and the CPUFreq 
> code base hasn’t changed in the last few years, I drop all A-b tags.
> 2. A bunch of device-tree helpers and macros.
> 3. Direct ported SCPI protocol, mailbox infrastructure and the ARM SMC 
> triggered mailbox driver. All components except mailbox driver are in 
> mainline Linux.

Why do you actually need this mailbox framework? Actually I just
proposed the SMC driver to make it fit into the Linux framework. All we
actually need for SCPI is to write a simple command into some memory and
"press a button". I don't see a need to import the whole Linux
framework, especially as our mailbox usage is actually just a corner
case of the mailbox's capabilities (namely a "single-bit" doorbell).
The SMC use case is trivial to implement, and I believe using the Juno
mailbox is similarly simple, for instance.
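
As a rough illustration of how little is needed, the "write a command
into some memory and press a button" step could look like the sketch
below. The shared-memory layout, the payload size and all names are
made up for this example and are not taken from the SCPI specification
or from the Linux mailbox framework; the doorbell is a callback that a
platform would back with an SMC or a doorbell register write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHMEM_PAYLOAD_MAX 64    /* illustrative size, not from any spec */

/* Hypothetical shared-memory layout: one header word plus payload. */
struct shmem_area {
    uint32_t header;
    uint8_t  payload[SHMEM_PAYLOAD_MAX];
};

/* The "button": platform hook, e.g. an SMC or a doorbell register write. */
typedef int (*doorbell_fn)(void *ctx);

struct simple_mbox {
    struct shmem_area *shmem;
    doorbell_fn        ring;
    void              *ring_ctx;
};

/* Place the command in shared memory, then ring the doorbell once. */
int mbox_send(struct simple_mbox *mb, uint32_t header,
              const void *data, size_t len)
{
    assert(mb != NULL && mb->shmem != NULL && mb->ring != NULL);

    if (len > SHMEM_PAYLOAD_MAX)
        return -1;              /* does not fit, doorbell is not rung */

    mb->shmem->header = header;
    if (len != 0)
        memcpy(mb->shmem->payload, data, len);

    /* Real code needs a write barrier here before notifying the peer. */
    return mb->ring(mb->ring_ctx);
}

/* Demo stand-in for a real doorbell: only counts how often it was rung. */
int counting_doorbell(void *ctx)
{
    (*(unsigned int *)ctx)++;
    return 0;
}
```

With an SMC transport the doorbell callback would issue the SMC and
return the firmware's result directly, so no TX/RX-done interrupt
handling is involved.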


So to summarize I think we need to agree on those general questions:
1) Shall the Xen hypervisor actually be involved in CPUFreq at all? Can
this be left to corner cases like pinned CPUs/guests, where guest
requests are passed on to the hardware?
2) Is EL3/ATF providing SCPI services something we can build on?
Normally I would expect we write drivers to match existing firmware.
3) If we go this way, do we really need to port all of the Linux
drivers and their framework to Xen? Can't we get away with much simpler
solutions? In the end all the SMC mailbox driver does is to trigger a
single SMC call, embedded in a lot of glorious Linux boilerplate code.

What I was *actually* thinking of when using the SMC mailbox approach is
the ability to provide *virtual* SCPI services to guests, in a generic,
not-SoC-specific way. The proposed SMC mailbox binding allows using
*hvc* calls to trigger services, so Xen could pick up DVFS requests from
guests in a generic way and act upon them.
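
A minimal sketch of what such a virtual SCPI entry point might look
like follows. It assumes the guest has placed a message in its mailbox
shared memory and then issued an hvc; the function ID, the command ID,
the message layout and all names are placeholders invented for this
example and are not taken from the SMCCC, the SCPI specification or
Xen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All values below are placeholders chosen for this example only. */
#define DEMO_HVC_MBOX_FID 0x82000001u   /* made-up doorbell function ID */
#define DEMO_CMD_SET_DVFS 0x0au         /* made-up "set DVFS" command   */
#define DEMO_NR_OPPS      4u            /* OPPs exposed to the guest    */

enum vscpi_status {
    VSCPI_OK = 0,
    VSCPI_UNKNOWN_CALL,
    VSCPI_BAD_CMD,
    VSCPI_BAD_OPP
};

/* Hypothetical message as found in the guest's mailbox shared memory. */
struct vscpi_msg {
    uint32_t cmd;
    uint32_t opp_index;
};

/* Hypothetical per-vCPU state kept by the hypervisor. */
struct vscpi_vcpu {
    int      has_request;
    uint32_t requested_opp;
};

/*
 * Called when a guest traps with an hvc. The request is validated and
 * only recorded; whether the physical frequency changes is a separate
 * decision of the hypervisor. A real handler must copy the message out
 * of guest memory once before validating it.
 */
enum vscpi_status vscpi_handle_hvc(struct vscpi_vcpu *v, uint32_t fid,
                                   const struct vscpi_msg *msg)
{
    assert(v != NULL && msg != NULL);

    if (fid != DEMO_HVC_MBOX_FID)
        return VSCPI_UNKNOWN_CALL;
    if (msg->cmd != DEMO_CMD_SET_DVFS)
        return VSCPI_BAD_CMD;
    if (msg->opp_index >= DEMO_NR_OPPS)
        return VSCPI_BAD_OPP;

    v->requested_opp = msg->opp_index;
    v->has_request = 1;
    return VSCPI_OK;
}
```

Because the guest only ever sees this generic interface, nothing
SoC-specific leaks into it; how the recorded request maps to a real
OPP stays inside the hypervisor.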

Cheers,
Andre.

> 4. Xen changes to direct ported code for making it compilable. These changes 
> don’t change functionality.
> 5. Some modification to direct ported code which slightly change 
> functionality, I would say to restrict it.
> 6. SCPI based CPUFreq driver and CPUFreq interface component.
> 7. Misc patches mostly to ARM subsystem.
> 8. Patch from Volodymyr Babchuk which adds SMC wrapper.
> 
> Most important TODOs regarding the whole patch series:
> 1. Handle devm in the direct ported code. Currently, in case of any errors 
> previously allocated resources are left unfreed.
> 2. Thermal management integration.
> 3. Don't pass CPUFreq related nodes to dom0. Xen owns SCPI completely.
> 4. Handle CPU_TURBO frequencies if they are supported by HW.
> 
> You can find the whole patch series here:
> repo: https://github.com/otyshchenko1/xen.git branch: cpufreq-devel1
> 
> P.S. There is no need to modify xenpm tool. It works out of the box on ARM.
> 
> [1]
> Linux code:
> http://elixir.free-electrons.com/linux/v4.14-rc6/source/drivers/firmware/arm_scpi.c
> http://elixir.free-electrons.com/linux/v4.14-rc6/source/include/linux/scpi_protocol.h
> http://elixir.free-electrons.com/linux/v4.14-rc6/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
> 
> Recent protocol version:
> http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
> 
> [2]
> http://infocenter.arm.com/help/topic/com.arm.doc.den0056a/DEN0056A_System_Control_and_Management_Interface.pdf
> 
> [3]
> Xen part:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00940.html
> Linux part:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00944.html
> 
> [4]
> http://linux-sunxi.narkive.com/qYWJqjXU/patch-v2-0-3-mailbox-arm-introduce-smc-triggered-mailbox
> 
> [5]
> http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
> 
> Oleksandr Dmytryshyn (6):
>   cpufreq: move cpufreq.h file to the xen/include/xen location
>   pm: move processor_perf.h file to the xen/include/xen location
>   pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
>   cpufreq: make turbo settings to be configurable
>   pmstat: make pmstat functions more generalizable
>   cpufreq: make cpufreq driver more generalizable
> 
> Oleksandr Tyshchenko (24):
>   xenpm: Clarify xenpm usage
>   xen/device-tree: Add dt_count_phandle_with_args helper
>   xen/device-tree: Add dt_property_for_each_string macros
>   xen/device-tree: Add dt_property_read_u32_index helper
>   xen/device-tree: Add dt_property_count_elems_of_size helper
>   xen/device-tree: Add dt_property_read_string_helper and friends
>   xen/arm: Add driver_data field to struct device
>   xen/arm: Add DEVICE_MAILBOX device class
>   xen/arm: Store device-tree node per cpu
>   xen/arm: Add ARM System Control and Power Interface (SCPI) protocol
>   xen/arm: Add mailbox infrastructure
>   xen/arm: Introduce ARM SMC based mailbox
>   xen/arm: Add common header file wrappers.h
>   xen/arm: Add rxdone_auto flag to mbox_controller structure
>   xen/arm: Add Xen changes to SCPI protocol
>   xen/arm: Add Xen changes to mailbox infrastructure
>   xen/arm: Add Xen changes to ARM SMC based mailbox
>   xen/arm: Use non-blocking mode for SCPI protocol
>   xen/arm: Don't set txdone_poll flag for ARM SMC mailbox
>   cpufreq: hack: perf->states isn't a real guest handle on ARM
>   xen/arm: Introduce SCPI based CPUFreq driver
>   xen/arm: Introduce CPUFreq Interface component
>   xen/arm: Build CPUFreq components
>   xen/arm: Enable CPUFreq on ARM
> 
> Volodymyr Babchuk (1):
>   arm: add SMC wrapper that is compatible with SMCCC
> 
>  MAINTAINERS                                  |    4 +-
>  tools/misc/xenpm.c                           |    6 +-
>  xen/arch/arm/Kconfig                         |    2 +
>  xen/arch/arm/Makefile                        |    1 +
>  xen/arch/arm/arm32/Makefile                  |    1 +
>  xen/arch/arm/arm32/smc.S                     |   32 +
>  xen/arch/arm/arm64/Makefile                  |    1 +
>  xen/arch/arm/arm64/smc.S                     |   29 +
>  xen/arch/arm/cpufreq/Makefile                |    5 +
>  xen/arch/arm/cpufreq/arm-smc-mailbox.c       |  248 ++++++
>  xen/arch/arm/cpufreq/arm_scpi.c              | 1191 ++++++++++++++++++++++++++
>  xen/arch/arm/cpufreq/cpufreq_if.c            |  522 +++++++++++
>  xen/arch/arm/cpufreq/mailbox.c               |  562 ++++++++++++
>  xen/arch/arm/cpufreq/mailbox.h               |   28 +
>  xen/arch/arm/cpufreq/mailbox_client.h        |   69 ++
>  xen/arch/arm/cpufreq/mailbox_controller.h    |  161 ++++
>  xen/arch/arm/cpufreq/scpi_cpufreq.c          |  328 +++++++
>  xen/arch/arm/cpufreq/scpi_protocol.h         |  116 +++
>  xen/arch/arm/cpufreq/wrappers.h              |  239 ++++++
>  xen/arch/arm/smpboot.c                       |    5 +
>  xen/arch/x86/Kconfig                         |    2 +
>  xen/arch/x86/acpi/cpu_idle.c                 |    2 +-
>  xen/arch/x86/acpi/cpufreq/cpufreq.c          |    2 +-
>  xen/arch/x86/acpi/cpufreq/powernow.c         |    2 +-
>  xen/arch/x86/acpi/power.c                    |    2 +-
>  xen/arch/x86/cpu/mwait-idle.c                |    2 +-
>  xen/arch/x86/platform_hypercall.c            |    2 +-
>  xen/common/device_tree.c                     |  124 +++
>  xen/common/sysctl.c                          |    2 +-
>  xen/drivers/Kconfig                          |    2 +
>  xen/drivers/Makefile                         |    1 +
>  xen/drivers/acpi/Makefile                    |    1 -
>  xen/drivers/acpi/pmstat.c                    |  526 ------------
>  xen/drivers/cpufreq/Kconfig                  |    3 +
>  xen/drivers/cpufreq/cpufreq.c                |  102 ++-
>  xen/drivers/cpufreq/cpufreq_misc_governors.c |    2 +-
>  xen/drivers/cpufreq/cpufreq_ondemand.c       |    4 +-
>  xen/drivers/cpufreq/utility.c                |   13 +-
>  xen/drivers/pm/Kconfig                       |    3 +
>  xen/drivers/pm/Makefile                      |    1 +
>  xen/drivers/pm/stat.c                        |  538 ++++++++++++
>  xen/include/acpi/cpufreq/cpufreq.h           |  245 ------
>  xen/include/acpi/cpufreq/processor_perf.h    |   63 --
>  xen/include/asm-arm/device.h                 |    2 +
>  xen/include/asm-arm/processor.h              |    4 +
>  xen/include/public/platform.h                |    1 +
>  xen/include/xen/cpufreq.h                    |  254 ++++++
>  xen/include/xen/device_tree.h                |  158 ++++
>  xen/include/xen/pmstat.h                     |    2 +
>  xen/include/xen/processor_perf.h             |   69 ++
>  50 files changed, 4822 insertions(+), 862 deletions(-)
>  create mode 100644 xen/arch/arm/arm32/smc.S
>  create mode 100644 xen/arch/arm/arm64/smc.S
>  create mode 100644 xen/arch/arm/cpufreq/Makefile
>  create mode 100644 xen/arch/arm/cpufreq/arm-smc-mailbox.c
>  create mode 100644 xen/arch/arm/cpufreq/arm_scpi.c
>  create mode 100644 xen/arch/arm/cpufreq/cpufreq_if.c
>  create mode 100644 xen/arch/arm/cpufreq/mailbox.c
>  create mode 100644 xen/arch/arm/cpufreq/mailbox.h
>  create mode 100644 xen/arch/arm/cpufreq/mailbox_client.h
>  create mode 100644 xen/arch/arm/cpufreq/mailbox_controller.h
>  create mode 100644 xen/arch/arm/cpufreq/scpi_cpufreq.c
>  create mode 100644 xen/arch/arm/cpufreq/scpi_protocol.h
>  create mode 100644 xen/arch/arm/cpufreq/wrappers.h
>  delete mode 100644 xen/drivers/acpi/pmstat.c
>  create mode 100644 xen/drivers/pm/Kconfig
>  create mode 100644 xen/drivers/pm/Makefile
>  create mode 100644 xen/drivers/pm/stat.c
>  delete mode 100644 xen/include/acpi/cpufreq/cpufreq.h
>  delete mode 100644 xen/include/acpi/cpufreq/processor_perf.h
>  create mode 100644 xen/include/xen/cpufreq.h
>  create mode 100644 xen/include/xen/processor_perf.h
> 
