
Re: [Xen-devel] [ARM] Native application design and discussion (I hope)

Hi George,

On 12 May 2017 at 14:48, George Dunlap <george.dunlap@xxxxxxxxxx> wrote:
> [reordering slightly to make the response easier]

>> Okay, I don't know of any platform where you need a proprietary blob
>> to scale frequency. And I hope I never will encounter one.
>> But I can imagine it: some firmware binary that needs to be uploaded
>> into a PMIC. Can we store this firmware in the hypervisor? I don't know.
>> I'm not a lawyer.
> On x86, we do microcode updates, which are (as I understand it) binary
> blobs that get passed through the hypervisor to the cpus.  This blob
> isn't executed by Xen, so it doesn't seem like you would be able to
> argue that passing a binary blob through the hypervisor creates a
> derivative / combined work.  In that case the blobs are stored as
> files on disk and passed to Xen at boot time (via grub), not compiled
> into the Xen binary.  Whether compiling such things into the binary
> constitutes a "derived work" is something you'd probably better ask a
> lawyer. :-)
Yeah, there are always legal ways to do this.

>>> Sorry, just to be clear: below you mentioned modules as a solution, and
>>> given the context this would be included.  So can you expand on what you
>>> mean that there are things that 1) can't be included in the hypervisor
>>> because of code size or complexity, but for which 2) loadable modules
>>> would be a suitable solution?
>> Well... Device drivers? Emulators? For example, if I write a bunch
>> of good and neat GPL drivers for some SoC and promise to maintain
>> them, will you include them upstream?
>> Or if I write an emulator for some arcane device, will it be merged
>> upstream?
>> Real case: I will write an OP-TEE mediator for one client and a Google
>> Trusty mediator for another client. Each will have, say, 2,000 lines of
>> code. Are there chances that they both will be merged into the
>> hypervisor?
> [snip]
>> Anyways, I have taken your point. No proprietary code in modules. What
>> about other parts of discussion? Are you against loadable modules in
>> any fashion? What about native apps?
> There are several different questions we're getting slightly mixed up here:
> 1. Should some bit of functionality (like a TEE mediator or device
> emulation) live in the xen.git tree?
> 2. Should that functionality run in the hypervisor address space?
> 3. Should that functionality be loaded via a loadable module?
> 4. What place do proprietary components have in a Xen system?
> Let me address #4 first.  There are lots of examples of proprietary
> *components* of Xen systems.  XenClient used to have a proprietary
> device model (a process running in dom0) for helping virtualize
> graphics cards; a number of companies have proprietary drivers for
> memory sharing or VM introspection.  But all of those are outside of
> the Xen address space, interacting with Xen via hypercalls.  As long
> as "native apps" (I think we probably need a better name here) are
> analogous to a devicemodel stubdomain -- in a separate address space
> and acting through a well-defined hypercall interface -- I don't have
> any objection to having proprietary ones.
Yes, native apps will use almost the same mechanism (actually, it will
be syscalls instead of hypercalls, but the basic idea is the same). They
are not linked to the hypervisor in any way.

> Regarding #1-2, let me first say that how specific it is to a
> particular platform or use case isn't actually important to any of
> these questions.  The considerations are partly technical, and partly
> practical -- how much benefit does it give to the project as a whole
> vs the cost?
> For a long time there were only two functional schedulers in Xen --
> the Credit scheduler (now called "credit1" to distinguish it from
> "credit2"), and the ARINC653 scheduler, which is a real-time scheduler
> targeted at a very specific use case and industry.  As far as I know
> there is only one user.  But it was checked into the Xen tree because
> it would obviously be useful to them (benefit) and almost no impact on
> anyone else (cost); and it ran inside the hypervisor because that's
> the only place to run a scheduler.
> So given your examples, I see no reason not to have several
> implementations of different mediators or emulated devices in tree, or
> in a XenProject-managed git repo (like mini-os.git).  I don't know the
> particulars about mediators or the devices you have in mind, but if
> you can show technical reasons why they need to be run in the
> hypervisor rather than somewhere else (for performance or security
> sake, for instance), there's no reason in principle not to add them to
> the hypervisor code; and if they're in the hypervisor, then they
> should be in xen.git.
This is the question that bothered me. Thank you for the clarification.
As for specific use cases: yes, there are reasons why the OP-TEE mediator
should run in the hypervisor (or in a very privileged app).

> Regarding modules (#3): The problem that loadable modules were
> primarily introduced to solve in Linux wasn't "How to deal with
> proprietary drivers", or even "how to deal with out-of-tree drivers".
> The problem was, "How do we allow software providers to 1) have a
> single kernel binary, which 2) has drivers for all the different
> systems on which it needs to run, but 3) not take a massive amount of
> memory or space on systems, given that any given system will not need
> the vast majority of drivers?"
> Suppose hypothetically that we decided that the mediators you describe
> need to run in the hypervisor.  As long as Kconfig is sufficient for
> people to enable or disable what they need to make a functional and
> efficient system, then there's no need to introduce modules.  If we
> reached a point where people wanted a single binary that could do
> either the OP-TEE mediator or the Google mediator, or both, or neither,
> but didn't want to include all of them in the core binary (perhaps because
> of memory constraints), then loadable modules would be a good solution
> to consider.  But either way, if we decided they should run in the
> hypervisor, then all things being equal it would still be better to
> have both implementations in-tree.
> There are a couple of reasons for the push-back on loadable modules.
> The first is the extra complication and infrastructure it adds.  But
> the second is that people have a strong temptation to use them for
> out-of-tree and proprietary code, both of which we'd like to avoid if
> possible.  If there comes a point in time where loadable modules are
> the only reasonable solution to the problem, I will support having
> them; but until that time I will look for other solutions if I can.
> Does that make sense?
Yes, thank you. Legal questions are not my strong side. Looks like I was
too quick when I proposed modules as a solution to our needs. Sorry, I
should have investigated this topic further before talking about it.

So, let's get back to native apps. We had an internal discussion about
possible use cases and want to share our conclusions.

1. Emulators. As Stefano pointed out, this is an ideal use case for
small, fast native apps that are accounted to the calling vcpu's time
slice.

2. Virtual coprocessor backend/driver. This is the part that does the
actual job: it makes the coprocessor save or restore its context. It is
also a small, straightforward app, but it needs access to real HW.

3. TEE mediators. They need so many privileges that there is actually
no sense in putting them into native apps. For example, to work
properly the OP-TEE mediator needs to: pin guest pages, map guest pages
to perform IPA->MPA translation, send vIRQs to guests, issue real SMCs.

4. Any other uses?

So, as you can see, an emulator needs no privileges at all and can be
domain-bound (e.g. one emulator instance per guest). A vcoproc driver
needs privileges to work with certain MMIO regions (and possibly IRQs).
A TEE mediator really should work at the EL2 level; there are no
benefits in putting it into an EL0 app.

If there are no objections, I propose to put the TEE topic aside for now.
Just to be clear: I really like your idea to put TEE mediators into the
hypervisor tree and use Kconfig to choose the needed one.

So, there are emulators and vcoproc drivers left. An emulator is quite
a simple thing: it should handle MMIO reads/writes and occasionally
issue a vIRQ. It should also be configurable somehow. As Stefano said,
we should have multiple instances of the same emulator. One for each
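
To illustrate how small such an emulator could be, here is a rough
model in Python (not Xen code). Everything in it is an assumption made
for illustration: the class name ToyEmulator, the register offsets, the
vIRQ number 32 and the inject_virq callback are all hypothetical and do
not correspond to any existing Xen interface.

```python
from typing import Callable, Dict, List, Tuple


class ToyEmulator:
    """Hypothetical per-guest device model: two MMIO registers and one vIRQ."""

    REG_DATA = 0x0    # hypothetical data register offset
    REG_STATUS = 0x4  # hypothetical status register offset

    def __init__(self, domid: int,
                 inject_virq: Callable[[int, int], None],
                 virq: int = 32) -> None:
        self.domid = domid          # the guest this instance is bound to
        self.virq = virq            # configuration: which vIRQ to raise
        self._inject = inject_virq  # stand-in for "send vIRQ to guest"
        self._regs: Dict[int, int] = {self.REG_DATA: 0, self.REG_STATUS: 0}

    def mmio_read(self, offset: int) -> int:
        # Unknown registers read as zero.
        return self._regs.get(offset, 0)

    def mmio_write(self, offset: int, value: int) -> None:
        # Only the data register is writable; other writes are ignored.
        if offset == self.REG_DATA:
            self._regs[self.REG_DATA] = value & 0xFFFFFFFF
            self._regs[self.REG_STATUS] = 1
            self._inject(self.domid, self.virq)


# Record injected vIRQs as (domid, virq) pairs instead of delivering them.
raised: List[Tuple[int, int]] = []

# Separate instances of the same emulator, each with its own state.
emulators = {
    domid: ToyEmulator(domid, lambda d, irq: raised.append((d, irq)))
    for domid in (1, 2)
}

emulators[1].mmio_write(ToyEmulator.REG_DATA, 0xABCD)
print(raised)
```

The point of the sketch is only that the state is tiny and fully
private to one instance, so nothing here needs hypervisor privileges.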

A vcoproc driver should be able to work with real HW, and it will
probably handle real IRQs from the device, so we need one driver
instance per device, not per domain. Andrii can correct me, but the
vcoproc framework is not tied to vcpus, so it can work in the context
of any vcpu. Thus, its time will be accounted to whichever vcpu happens
to be executing at that moment. Probably, this is not fair.

Can we run a vcoproc driver in a stubdomain? Probably yes, if we can
guarantee latency (as in a real-time system). Just one example: 60 FPS
displays are standard these days. 1/60 s gives us about 16 ms to deliver
a frame to the display. 16 ms should be enough to render the next frame,
compose it, and deliver it to the display controller. Actually it is
plenty of time (most of the time). Now imagine that we want to share one
GPU between two domains. The actual render tasks can be very fast, let's
say 1 ms for each domain. But to render both of them, we need to switch
the GPU context at least twice per frame (once to render Guest A's task,
once to render Guest B's task). This gives us 8 ms between switches. If
we put the vcoproc driver into a stubdomain, we will be at the mercy of
the vCPU scheduler. It is a good scheduler, but I don't know if it suits
this use case. 8 ms is an upper bound. If there are three domains
sharing the GPU, the limit will be about 6 ms. And, actually, one slice
per domain is not enough, because a domain may want to render its own
portion later. So 1 ms is a more realistic requirement. I mean that the
stubdom with the coproc driver should be scheduled every 1 ms, no matter of
With native apps (or some light stubdomain) that are scheduled right
when they are needed, this is a much easier task.
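
To make the arithmetic above concrete, here is a small Python sketch of
the frame-budget calculation. The function name switch_budget_ms and
the slices_per_domain parameter are my own illustrative assumptions; it
only divides the frame period evenly and ignores the cost of the
context switch itself.

```python
def switch_budget_ms(refresh_hz: float, domains: int,
                     slices_per_domain: int = 1) -> float:
    """Upper bound (ms) on the interval between GPU context switches.

    Assumes the frame period is split evenly between all slices and
    that a context switch itself takes no time.
    """
    if refresh_hz <= 0 or domains < 1 or slices_per_domain < 1:
        raise ValueError("refresh rate and counts must be positive")
    frame_ms = 1000.0 / refresh_hz
    return frame_ms / (domains * slices_per_domain)


# Two and three domains sharing one GPU on a 60 Hz display,
# one slice per domain per frame.
for domains in (2, 3):
    print(domains, "domains:", round(switch_budget_ms(60, domains), 2), "ms")

# Several slices per domain per frame shrink the budget further.
print("2 domains, 4 slices each:",
      round(switch_budget_ms(60, 2, slices_per_domain=4), 2), "ms")
```

With more than one slice per domain the budget quickly approaches the
1 ms range, which is why I treat 1 ms as the realistic target.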

At least, this is my vision of the vcoproc driver problem. Andrii can
correct me if I'm terribly wrong.

> BTW I've been saying "I" throughout this response; hopefully that
> makes it clear that I'm mainly speaking for myself here.
Yeah, I understand this.

WBR Volodymyr Babchuk aka lorc [+380976646013]
mailto: vlad.babchuk@xxxxxxxxx
