
Re: [Xen-devel] cpufreq implementation for OMAP under xen hypervisor.



On Tue, 16 Sep 2014, Oleksandr Dmytryshyn wrote:
> On Wed, Sep 10, 2014 at 10:31 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@xxxxxxxxxx> wrote:
> > On Wed, Sep 10, 2014 at 07:35:47PM +0100, Stefano Stabellini wrote:
> >> On Wed, 10 Sep 2014, Andrii Tseglytskyi wrote:
> >> > Hi,
> >> >
> >> > On Wed, Sep 10, 2014 at 12:42 PM, Ian Campbell <Ian.Campbell@xxxxxxxxxx> 
> >> > wrote:
> >> > >
> >> > > On Tue, 2014-09-09 at 22:41 +0100, Stefano Stabellini wrote:
> >> > > > On Tue, 9 Sep 2014, Ian Campbell wrote:
> >> > > > > On Thu, 2014-09-04 at 22:56 +0100, Stefano Stabellini wrote:
> >> > > > > > I am trying to think of an alternative, such as passing the
> >> > > > > > real cpu nodes to dom0 but then adding status = "disabled",
> >> > > > > > but I am not sure whether Linux checks the status for cpu
> >> > > > > > nodes.
> >> > > > >
> >> > > > > status = "disabled" is defined to have a specific (i.e.
> >> > > > > non-default) meaning for cpu nodes; Julien mentioned this when
> >> > > > > I tried to add a similar patch to Xen to ignore them. I think it
> >> > > > > basically means "present but not running, you should start
> >> > > > > them!".
> >> > > > >
> >> > > > > >  In addition this scheme
> >> > > > > > wouldn't support the case where dom0 has more vcpus than pcpus
> >> > > > > > on the system. Granted it is not very common and might even be
> >> > > > > > detrimental to performance, but we should be able to support
> >> > > > > > it.
> >> > > > >
> >> > > > > It's a bit of an edge case, for sure. I guess it wouldn't be
> >> > > > > totally unreasonable to say that if you use this sort of
> >> > > > > configuration you may not get cpufreq support.
> >> > > > >
> >> > > > > > Ian, what do you think about this?
> >> > > > >
> >> > > > > All the options suck in one way or another AFAICT. I think we are
> >> > > > > going to be looking for the least bad solution, not necessarily a
> >> > > > > good one.
> >> > > > >
> >> > > > > Fundamentally are we trying to avoid having to have an i2c
> >> > > > > subsystem etc in the hypervisor to be able to change the voltages
> >> > > > > before/after changing the frequency?
> >> > > > >
> >> > > > > We can't just say "that's part of the cpufreq driver" since
> >> > > > > different boards using the same SoC might use different voltage
> >> > > > > regulators, over i2c or some other bus etc, so we end up with a
> >> > > > > matrix.
> >> > > > >
> >> > > > > It's arguable that we shouldn't be letting dom0 poke at that
> >> > > > > regulator functionality anyway, at least not all of it. Taking
> >> > > > > that ability away would necessarily imply more platform specific
> >> > > > > functionality in the hypervisor.
> >> > > >
> >> > > > Right.
> >> > > > I am afraid that in order to avoid more code in Xen, we end up with
> >> > > > an unmaintainable interface and unupstreamable hacks in dom0.
> >> > >
> >> > > That's what I'm worried about too. Hence I'm wondering if we should just
> >> > > do this in the hypervisor.
> >> > >
> >> > > Although there are a myriad of them, the parts used to do voltage
> >> > > control tend to be fairly simple.
> >> > >
> >> > > One concern I have is that i2c busses also tend to have other things on
> >> > > them which dom0 might legitimately access (e.g. rtc), I'm not sure what
> >> > > to suggest here.
> >> >
> >> > I would try to avoid i2c transactions in Xen. The I2C driver is quite
> >> > complicated in the Linux kernel. It consists of several parts - common
> >> > core + platform specific. I'm pretty sure Xen should not handle this.
> >> > I think that establishing an event channel for frequency changes is a
> >> > good idea. It would be good to try to implement this. In the process of
> >> > implementation we will see what needs to be resolved.
> >>
> >> OK, that's reasonable.
> >>
> >>
> >> > The only question here is how to pass physical cpu to dom0.
> >>
> >> We can use a device tree based interface to pass the information to
> >> dom0, but requiring a number of dom0 vcpus equal to the number of
> >> physical cpus and in addition to that having to pin the vcpus each to a
> >> different pcpu is quite a stringent limitation. However I don't know the
> >> frequency changing interfaces in Linux well enough to know how hard
> >> it would be to lift it.
> >>
> >>
> >> > Regarding x86.
> >> > I'm not sure, but maybe the ACPI interface encapsulates voltage changes
> >> > as well?
> >>
> >> I think so (but I am not an expert on that).
> >
> > The usual states are P and C states. The P states are the closest to what
> > you are looking at:
> >
> > struct acpi_processor_px {
> >         u64 core_frequency;     /* megahertz */
> >         u64 power;      /* milliWatts */
> >         u64 transition_latency; /* microseconds */
> >         u64 bus_master_latency; /* microseconds */
> >         u64 control;    /* control value */
> >         u64 status;     /* success indicator */
> > };
> >
> >>
> >>
> >>
> >> > Regards,
> >> > Andrii
> >> >
> >> >
> >> > --
> >> >
> >> > Andrii Tseglytskyi | Embedded Dev
> >> > GlobalLogic
> >> > www.globallogic.com
> >> >
> >>
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel@xxxxxxxxxxxxx
> >> http://lists.xen.org/xen-devel
> 
> 
>                     Cpufreq driver implementation.
>                                  ____________
>                                 /            \
>                                 | xenpm tool |
>                                 \____________/
>  Dom0 kernel user-space
> ---------------------------------------------------------------------------
> 
>                           ________________               _____
>                          /                \             /     \  CPU
>                          | DevTree Parser |          /->| ARM | driver
>                          \________________/          |  \_____/
>  Dom0 kernel                                         |     |
> -----------------------------------------------------|-----|---------------
>                                                      |     |
>               _____________________________________  |     |
>              |     __________        ___________   | |     |
>              |    /          \      /           \  | |     |
>              |    | ondemand |      | userspace |  | |     |
>  Registered  |    \__________/      \___________/  | |     |
>   cpufreq    |   _____________       ___________   | |     |
>  governor    |  /             \     /           \  | |     |
>              |  | performance |     | powersave |  | |     |
>              |  \_____________/     \___________/  | |     |
>              |_____________________________________| |     |
>                                ^                     |     |
>                                |                     |     |
>                          ______|_______              |     |
>                         /              \             |     |  Change
>                         | cpufreq core |-------------/     | frequency
>                         \______________/ set/get freq      |
>                                          commands          |
>  Xen                                                       |
> -----------------------------------------------------------|--------------
>  Hardware                                                __V__
>                                                         |     |
>                                                         | CPU |
>                                                         |_____|
> 
> 
> Description of the implementation:
> The cpufreq core and the registered cpufreq governors are located in xen.
> Dom0 has a CPU driver which can only change the frequency of the physical
> CPUs. In addition, this driver can change the CPUs' regulator voltage.
> I'll reuse some ACPI-specific variables for ARM. Thus I can make minimal
> modifications to the xen cpufreq driver, and all utilities (such as xenpm)
> will work without modification of the xen code. In the first
> implementation the xenpm tool won't show information about C-states, but
> it can show information about P-states, change cpufreq parameters and
> change the governor.
> The DevTree parser is a part of the CPU driver in Dom0 and it will read
> information from the /cpus/cpu@0/private_data path instead of the original
> /cpus path.
> 
> Steps of the initialization:
> 1. Xen copies all cpu@xxxxxx@N nodes (from the input device tree) with
> their properties to the /cpus/cpu@0/private_data node (device tree for
> Dom0). Thus we can have any number of VCPUs in Dom0 and we give all
> information about all physical CPUs in the private_data node.
> 
> 2. The driver in Dom0 will parse the /cpus/cpu@0/private_data path instead
> of the /cpus path and give the information about the CPUs' parameters to
> the hypervisor via the XENPF_set_processor_pminfo hypercall. (Some
> parameters are calculated in the Dom0 driver and cannot be calculated in
> the hypervisor.)
> 
> 3. The cpufreq core driver in the hypervisor will communicate via some
> interface with Dom0 (an event channel can be used to notify Dom0) and give
> some commands to the CPU driver in Dom0. Those commands are set/get
> frequency, etc.
> 
> Can I implement cpufreq driver in this way?

The architecture looks sane to me. As Konrad pointed out, the difficulty
here is upstreaming the changes to the Linux driver in step 2, which you
later in the thread identified as drivers/cpufreq/cpufreq-cpu0.c.

If the changes are not invasive and you manage to upstream them in
Linux, I am all for this solution.
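
To make steps 2 and 3 a bit more concrete, here is a minimal sketch of the
Dom0 side, under stated assumptions. The P-state layout mirrors the struct
Konrad quoted above. Everything else (struct opp, opp_to_px, struct
cpufreq_cmd, cpu_driver_handle, the use of the table index as the "control"
value) is hypothetical and only for illustration. The sketch makes no real
hypercall and uses no real event channel; it only shows the data that would
travel in each direction.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the P-state layout quoted earlier in the thread. */
struct px_state {
    uint64_t core_frequency;     /* megahertz */
    uint64_t power;              /* milliWatts */
    uint64_t transition_latency; /* microseconds */
    uint64_t bus_master_latency; /* microseconds */
    uint64_t control;            /* control value */
    uint64_t status;             /* success indicator */
};

/* Hypothetical operating point as the Dom0 driver might read it from the
 * private_data node: frequency in kHz, voltage in microvolts. */
struct opp {
    uint32_t freq_khz;
    uint32_t volt_uv;
};

/* Step 2 (sketch): convert operating points into P-state entries that
 * could later be handed to the hypervisor. Assumption: "control" and
 * "status" simply carry the index of the operating point. */
size_t opp_to_px(const struct opp *opps, size_t n, struct px_state *px,
                 uint64_t latency_us)
{
    for (size_t i = 0; i < n; i++) {
        px[i].core_frequency = opps[i].freq_khz / 1000; /* kHz -> MHz */
        px[i].power = 0;                /* not known in this sketch */
        px[i].transition_latency = latency_us;
        px[i].bus_master_latency = 0;
        px[i].control = i;
        px[i].status = i;
    }
    return n;
}

/* Step 3 (sketch): hypothetical command sent from the cpufreq core in the
 * hypervisor to the CPU driver in Dom0. */
enum cpufreq_cmd_type { CPUFREQ_CMD_GET_FREQ, CPUFREQ_CMD_SET_FREQ };

struct cpufreq_cmd {
    enum cpufreq_cmd_type type;
    uint32_t px_index; /* requested P-state (SET only) */
    uint32_t freq_khz; /* current frequency, filled in on success */
    int result;        /* 0 on success, -1 on error */
};

/* Hypothetical driver state: the operating point table and current index. */
struct cpu_driver {
    const struct opp *opps;
    size_t nr_opps;
    size_t current;
};

/* Handle one command. A real driver would program the clock and the
 * regulator here, raising the voltage before a frequency increase and
 * lowering it after a decrease; this sketch only records the new index. */
void cpu_driver_handle(struct cpu_driver *drv, struct cpufreq_cmd *cmd)
{
    cmd->result = 0;
    switch (cmd->type) {
    case CPUFREQ_CMD_SET_FREQ:
        if (cmd->px_index >= drv->nr_opps) {
            cmd->result = -1; /* no such P-state */
            break;
        }
        drv->current = cmd->px_index;
        break;
    case CPUFREQ_CMD_GET_FREQ:
        break;
    default:
        cmd->result = -1; /* unknown command */
        break;
    }
    if (cmd->result == 0)
        cmd->freq_khz = drv->opps[drv->current].freq_khz;
}
```

In this shape the hypervisor only ever names a P-state index, so the
board-specific clock and regulator handling stays entirely in the Dom0
driver, which is the point of the proposed split.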
