Re: [Xen-devel] Modules support in Xen (WAS: Re: [ARM] Native application design and discussion (I hope))

[reordering slightly to make the response easier]

On Thu, May 11, 2017 at 7:13 PM, Volodymyr Babchuk
<vlad.babchuk@xxxxxxxxx> wrote:
>> Maybe I'm just not familiar with things, but it's hard for me to imagine
>> why you'd need proprietary blobs to disable cpus or scale frequency.
>> Are these really such complex activities that it's worth investing
>> thousands of hours of developer work into developing proprietary
>> solutions that you license?
> Okay, I don't know of any platform where you need a proprietary blob to
> scale frequency. And I hope I never will encounter one.
> But I can imagine it: some firmware binary that needs to be uploaded
> into PMIC. Can we store this firmware in the hypervisor? I don't know.
> I'm not a lawyer.

On x86, we do microcode updates, which are (as I understand it) binary
blobs that get passed through the hypervisor to the cpus.  This blob
isn't executed by Xen, so it doesn't seem like you would be able to
argue that passing a binary blob through the hypervisor creates a
derivative / combined work.  In that case the blobs are stored as
files on disk and passed to Xen at boot time (via grub), not compiled
into the Xen binary.  Whether compiling such things into the binary
constitutes a "derived work" is something you'd probably better ask a
lawyer. :-)

If configuring the bootloader to pass extra files to Xen isn't
suitable on ARM for some reason we can probably come up with some
other way of packaging things together which honors the GPL suitably.

>>>> ...some [things can't be included in hypervisor] because of code
>>>> size or complexity.
>> Sorry, just to be clear: below you mentioned modules as a solution, and
>> given the context this would be included.  So can you expand on what you
>> mean that there are things that 1) can't be included in the hypervisor
>> because of code size or complexity, but for which 2) loadable modules
>> would be a suitable solution?
> Well... Device drivers? Emulators? For example, if I write a bunch
> of good and neat GPL drivers for some SoC and promise to maintain
> them, will you include them upstream?
> Or if I write an emulator for some arcane device, will it be merged
> upstream?
> Real case: I will write an OP-TEE mediator for one client and a Google
> Trusty mediator for another client. Each will have, say, 2,000 lines of
> code. Are there chances that they both will be merged into the
> hypervisor?


> Anyways, I have taken your point. No proprietary code in modules. What
> about other parts of discussion? Are you against loadable modules in
> any fashion? What about native apps?

There are several different questions we're getting slightly mixed up here:
1. Should some bit of functionality (like a TEE mediator or device
emulation) live in the xen.git tree?
2. Should that functionality run in the hypervisor address space?
3. Should that functionality be loaded via a loadable module?
4. What place do proprietary components have in a Xen system?

Let me address #4 first.  There are lots of examples of proprietary
*components* of Xen systems.  XenClient used to have a proprietary
device model (a process running in dom0) for helping virtualize
graphics cards; a number of companies have proprietary drivers for
memory sharing or VM introspection.  But all of those are outside of
the Xen address space, interacting with Xen via hypercalls.  As long
as "native apps" (I think we probably need a better name here) are
analogous to a devicemodel stubdomain -- in a separate address space
and acting through a well-defined hypercall interface -- I don't have
any objection to having proprietary ones.

Regarding #1-2, let me first say that how specific a piece of
functionality is to a particular platform or use case isn't actually
important to any of these questions.  The considerations are partly
technical, and partly practical -- how much benefit does it give to
the project as a whole vs the cost?

For a long time there were only two functional schedulers in Xen --
the Credit scheduler (now called "credit1" to distinguish it from
"credit2"), and the ARINC653 scheduler, which is a real-time scheduler
targeted at a very specific use case and industry.  As far as I know
there is only one user.  But it was checked into the Xen tree because
it would obviously be useful to them (benefit) and have almost no impact on
anyone else (cost); and it ran inside the hypervisor because that's
the only place to run a scheduler.

So given your examples, I see no reason not to have several
implementations of different mediators or emulated devices in tree, or
in a XenProject-managed git repo (like mini-os.git).  I don't know the
particulars about mediators or the devices you have in mind, but if
you can show technical reasons why they need to be run in the
hypervisor rather than somewhere else (for performance or security
reasons, for instance), there's no reason in principle not to add them to
the hypervisor code; and if they're in the hypervisor, then they
should be in xen.git.

Regarding modules (#3): The problem that loadable modules were
primarily introduced to solve in Linux wasn't "How to deal with
proprietary drivers", or even "how to deal with out-of-tree drivers".
The problem was, "How do we allow software providers to 1) have a
single kernel binary, which 2) has drivers for all the different
systems on which it needs to run, but 3) not take a massive amount of
memory or space on systems, given that any given system will not need
the vast majority of drivers?"

Suppose hypothetically that we decided that the mediators you describe
need to run in the hypervisor.  As long as Kconfig is sufficient for
people to enable or disable what they need to make a functional and
efficient system, then there's no need to introduce modules.  If we
reached a point where people wanted a single binary that could do
either the OP-TEE mediator or the Google mediator, or both, or neither,
but didn't want to include all of them in the core binary (perhaps because
of memory constraints), then loadable modules would be a good solution
to consider.  But either way, if we decided they should run in the
hypervisor, then all things being equal it would still be better to
have both implementations in-tree.
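
To make the Kconfig option concrete, here is a minimal sketch of
compile-time selection.  Everything in it is an assumption made for
illustration: the option names (CONFIG_MEDIATOR_OPTEE,
CONFIG_MEDIATOR_TRUSTY), the struct, and the lookup function are
hypothetical and are not existing Xen code.  In a real build the
CONFIG_* symbols would come from the generated config header rather
than being defined by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical Kconfig results: OP-TEE enabled, Trusty disabled. */
#define CONFIG_MEDIATOR_OPTEE 1
/* #define CONFIG_MEDIATOR_TRUSTY 1 */

/* Hypothetical interface that every in-tree mediator would implement. */
struct tee_mediator {
    const char *name;
    int (*handle_call)(unsigned int func_id);
};

#ifdef CONFIG_MEDIATOR_OPTEE
/* Placeholder handler: a real mediator would forward the call to the TEE. */
static int optee_handle_call(unsigned int func_id)
{
    (void)func_id;
    return 0;
}
static const struct tee_mediator optee_mediator = {
    "optee", optee_handle_call
};
#endif

#ifdef CONFIG_MEDIATOR_TRUSTY
static int trusty_handle_call(unsigned int func_id)
{
    (void)func_id;
    return 0;
}
static const struct tee_mediator trusty_mediator = {
    "trusty", trusty_handle_call
};
#endif

/* Only mediators enabled at build time end up in the binary. */
static const struct tee_mediator *const mediators[] = {
#ifdef CONFIG_MEDIATOR_OPTEE
    &optee_mediator,
#endif
#ifdef CONFIG_MEDIATOR_TRUSTY
    &trusty_mediator,
#endif
    NULL /* sentinel; also keeps the array non-empty */
};

/* Return the named mediator, or NULL if it was not compiled in. */
const struct tee_mediator *find_mediator(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; mediators[i] != NULL; i++) {
        if (strcmp(mediators[i]->name, name) == 0)
            return mediators[i];
    }
    return NULL;
}
```

With this configuration, looking up "optee" returns a usable mediator
and looking up "trusty" returns NULL, because the disabled
implementation contributes no code or data to the binary.  Both
implementations still live in the same tree; loadable modules only
become interesting once a single binary has to serve users with
different needs.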

There are a couple of reasons for the push-back on loadable modules.
The first is the extra complication and infrastructure it adds.  But
the second is that people have a strong temptation to use them for
out-of-tree and proprietary code, both of which we'd like to avoid if
possible.  If there comes a point in time where loadable modules are
the only reasonable solution to the problem, I will support having
them; but until that time I will look for other solutions if I can.

Does that make sense?

BTW I've been saying "I" throughout this response; hopefully that
makes it clear that I'm mainly speaking for myself here.


