
Re: [Xen-devel] EL0 app, stubdoms on ARM conf call



Hello Dario,

On 30 June 2017 at 00:26, Dario Faggioli <dario.faggioli@xxxxxxxxxx> wrote:
> On Thu, 2017-06-29 at 22:04 +0300, Volodymyr Babchuk wrote:
>> Hello all,
>>
> Hello,
>
>> 1. OP-TEE use case: DRM playback (secure data path).
>>
>> User wants to play a DRM-protected media file. Rights holders don't
>> want to give the user any means to get a DRM-free copy of that media
>> file. If you have ever heard about Widevine on Android - that's it.
>> Long story short, it is possible to decrypt, decode and display a
>> video frame in such a way that the decrypted data will never be
>> accessible to userspace, kernel or even to the hypervisor. This is
>> possible only when all data processing is done in secure mode, which
>> leads us to OP-TEE (or another TEE).
>> So, for each video frame the media player has to call OP-TEE with the
>> encrypted frame data.
>>
>> Good case: a 24 FPS movie, optimized data path: the media player
>> registers shared buffers in OP-TEE only once and then reuses them on
>> every invocation. That is one OP-TEE call per frame, or 24 calls per
>> second.
>> Worst case: a high frame rate movie (60 FPS), data path not
>> optimized. The media player registers a shared buffer in OP-TEE, asks
>> it to process the frame, then unregisters the buffer. That is
>> 60 * 3 = 180 calls per second.
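
To make the "good case" above concrete, here is roughly what the media
player side could look like, using the standard GlobalPlatform TEE
Client API (the libteec that OP-TEE ships). The TA UUID and the command
ID below are made up for illustration; the point is only that the
buffer is registered once and then every frame costs a single
invocation (i.e. one SMC):

/*
 * Sketch only: "good case" data path from a normal-world media player.
 * TA_DRM_UUID and TA_CMD_DECODE_FRAME are hypothetical; a real
 * secure-data-path TA defines its own.
 */
#include <stdint.h>
#include <string.h>
#include <tee_client_api.h>

#define TA_DRM_UUID { 0x12345678, 0x0000, 0x0000, \
                      { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01 } }
#define TA_CMD_DECODE_FRAME 1   /* hypothetical command ID */

int play_stream(void *frame_buf, size_t frame_sz, unsigned int nframes)
{
    TEEC_Context ctx;
    TEEC_Session sess;
    TEEC_SharedMemory shm;
    TEEC_Operation op;
    TEEC_UUID uuid = TA_DRM_UUID;
    uint32_t origin;
    TEEC_Result res;
    unsigned int i;

    res = TEEC_InitializeContext(NULL, &ctx);
    if (res != TEEC_SUCCESS)
        return -1;

    res = TEEC_OpenSession(&ctx, &sess, &uuid, TEEC_LOGIN_PUBLIC,
                           NULL, NULL, &origin);
    if (res != TEEC_SUCCESS)
        goto out_ctx;

    /* Register the frame buffer with OP-TEE only once... */
    shm.buffer = frame_buf;
    shm.size   = frame_sz;
    shm.flags  = TEEC_MEM_INPUT | TEEC_MEM_OUTPUT;
    res = TEEC_RegisterSharedMemory(&ctx, &shm);
    if (res != TEEC_SUCCESS)
        goto out_sess;

    memset(&op, 0, sizeof(op));
    op.paramTypes = TEEC_PARAM_TYPES(TEEC_MEMREF_WHOLE, TEEC_NONE,
                                     TEEC_NONE, TEEC_NONE);
    op.params[0].memref.parent = &shm;

    /* ...then exactly one invocation (one SMC) per frame. */
    for (i = 0; i < nframes; i++) {
        /* refill frame_buf with the next encrypted frame here */
        res = TEEC_InvokeCommand(&sess, TA_CMD_DECODE_FRAME, &op, &origin);
        if (res != TEEC_SUCCESS)
            break;
    }

    TEEC_ReleaseSharedMemory(&shm);
out_sess:
    TEEC_CloseSession(&sess);
out_ctx:
    TEEC_FinalizeContext(&ctx);
    return res == TEEC_SUCCESS ? 0 : -1;
}

In the worst case the TEEC_RegisterSharedMemory() /
TEEC_ReleaseSharedMemory() pair moves inside the loop, which is where
the 60 * 3 = 180 calls per second come from.
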
>>
>> A call is done using the SMC instruction. Let's assume that the
>> OP-TEE mediator lives in a Stubdom. Here is how the call sequence can
>> look:
>>
>> 1. DomU issues an SMC, which is trapped by the Hypervisor.
>> 2. The Hypervisor uses the standard approach with a ring buffer and
>> event mechanism to call the Stubdom. It also blocks the DomU vCPU
>> that caused the trap.
>> 3a. The Stubdom mangles the request and asks the Hypervisor to issue
>> the real SMC.
>> (3b. The Stubdom mangles the request and issues the SMC by itself -
>> potentially insecure.)
>> 4. After the real SMC, the Hypervisor returns control back to the
>> Stubdom.
>> 5. The Stubdom mangles the return value and returns the response to
>> the Hypervisor in a ring buffer.
>> 6. The Hypervisor unblocks the DomU vCPU and schedules it.
>>
>> As you can see, there are 6 context switches
>> (DomU->HYP->Stubdom->HYP->Stubdom->HYP->DomU) and 2 vCPU switches
>> (DomU->Stubdom->DomU). Both vCPU switches are governed by the
>> scheduler. When I say "governed by the scheduler" I mean that there
>> is no guarantee that the needed domain will be scheduled right away.
>> This is the sequence for one call. As you remember, there can be up
>> to 180 such calls per second in this use case, which gives us
>> 180 * 6 ~= 1000 context switches per second.
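
To show where the two scheduler-governed switches come in, here is a
pseudo-C sketch of the hypervisor side of the sequence above. None of
the helpers below are real Xen functions (the names are made up for
this mail); they only mark where the shared ring, the event channel
and the blocking happen:

#include <stdint.h>

struct vcpu;                        /* opaque, stands in for Xen's struct vcpu */

struct smc_request {
    uint64_t args[8];               /* x0-x7 from the trapped SMC  */
    uint64_t ret[4];                /* result to hand back to DomU */
};

/* hypothetical helpers */
void read_smc_args(struct vcpu *v, struct smc_request *req);
void write_smc_result(struct vcpu *v, const struct smc_request *req);
void ring_push_to_stubdom(struct vcpu *v, const struct smc_request *req);
void ring_pop_from_stubdom(struct vcpu *v, struct smc_request *req);
void notify_stubdom_evtchn(struct vcpu *v);
void block_vcpu(struct vcpu *v);    /* roughly Xen's vcpu_block()   */
void unblock_vcpu(struct vcpu *v);  /* roughly Xen's vcpu_unblock() */

/* Steps 1-2: DomU's SMC traps into Xen; forward it to the mediator. */
void handle_guest_smc(struct vcpu *v)
{
    struct smc_request req;

    read_smc_args(v, &req);
    ring_push_to_stubdom(v, &req);  /* request goes into a shared ring */
    notify_stubdom_evtchn(v);       /* event channel kick for Stubdom  */
    block_vcpu(v);                  /* DomU sleeps; whether Stubdom runs
                                     * right now is up to the scheduler */
}

/* Steps 5-6: Stubdom posted its answer; wake the caller back up. */
void complete_guest_smc(struct vcpu *v)
{
    struct smc_request req;

    ring_pop_from_stubdom(v, &req);
    write_smc_result(v, &req);
    unblock_vcpu(v);                /* again, when DomU actually runs
                                     * next is the scheduler's decision */
}
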
>>
> Ok. This is a quite detailed, well done, and useful description of the
> specific characteristics of your workflow.

> If possible, though, I'd like to know even more. Specifically, on a
> somewhat typical system:
> - how much pCPUs will you have?
Four on our target platform. Probably we can crank up four A53 cores
as well, which would give us 8 pCPUs in total and would ease things
up. But let's assume that we have 4 pCPUs for now.

> - how much vCPUs will Dom0 have?
Four for now. Probably it won't need that many; I think 2 will be enough.

> - what would Dom0 be doing (as in, what components of your overall
> platform would be running there), and how busy, at least roughly, do
> you expect it would be?
It runs all the hardware drivers and all the backends (display, input,
network, block, sound).

> - how many vCPUs will DomU have?
It depends on the DomU type. Let's say from 2 to 4.

> - how many vCPUs will Stubdom have? (I'm guessing one, as there's only
> 1 OP-TEE, does that make sense?)
Unfortunately, no. OP-TEE is SMP-capable. Every pCPU can call OP-TEE
at the same time. We want to preserve this feature.

> - how many other domains will there be? How many vCPUs will each one of
> them have?
There will be a third domain, which runs background jobs. I think
those are low-priority jobs (Artem can correct me), but they will
require all the computational power that is left.

>
> I understand it's a lot of questions, but it's quite important to have
> this info, IMO. It doesn't have to be super-precise and totally match
> the final look and setup of your final product, it "just" has to be a
> representative enough example.
>
> I'll try to explain why I think it would be useful to know all these
> things. So, for instance, in the scenario you describe above, if you:
> - have only 1 pCPUs
> - Dom0 has 1 vCPU, and it runs the standard backends. Which means,
> unless DomU is doing either disk or network I/O, it's mostly idle
No. It also does composition for all (2 or 3) displays, plays sound,
and so on.

> - DomU has 1 vCPU
One of the possible DomUs is Android. You know that Android is very
hungry for resources. I don't expect it to run smoothly on one vCPU.

> - Stubdom has 1 vCPU
> - there's no other domain
>
> What I think will happen most of the time will be something like this:
>
> [1]  DomU runs
>      .
>      .
> [2]  DomU calls SMC
>      Xen blocks DomU
>      Xen wakes Stubdom
> [3]  Stubdom runs, does SMC
>      .
>      SMC done, Stubdom blocks
>      Xen wakes DomU
> [4]  DomU runs
>      .
>      .
>
> At [1], Dom0 and Stubdom are idle, and DomU is the only running domain
> (or, to be precise, vCPU), and so it runs. When, at [2], it calls SMC,
> it also blocks. Therefore, at [3], it's Stubdom that is the only
> runnable domain, and in fact, the scheduler let it run. Finally, at
> [4], since Stubdom has blocked again, while DomU has been woken up, the
> only thing the scheduler can do is to run it (DomU).
>
> So, as you say, even with just 1 pCPU available, if the scenario is
> like I described above, there would not be the need for any fancy or
> advanced improvement in the scheduler. Actually, the scheduler does
> very little... It always chooses to run the only vCPU that is runnable.
>
> On the other hand if, with still only one pCPU, there are more domains
> (and hence more vCPUs) around, doing other things, and/or, if Dom0 runs
> some other workload, in addition to the backends for DomU, then indeed
> things may get more complicated. For example, at [4], the scheduler may
> choose a different vCPU than the one of DomU, and this would probably
> be a problem.
Yes, this is what we are afraid of.

> What I was saying during the call is that we have a lot of tweaks and
> mechanisms already in place to deal with situations like these.
>
> E.g., if you have a decent amount of pCPUs, we can use cpupool, to
> isolate, say, stubdomains from regular DomUs, or to isolate DomU-
> Stubdom couples. Also, something similar, with a smaller degree of
> isolation, but higher flexibility may be achieved with pinning. And,
> finally, we can differentiate the domains among each other, within the
> same pool or pinning mask, by using weights (and, with Credit2,
> starting from 4.10, hopefully, with caps & reservations :-D).
Yes, I was thinking about weights and how they could help. Some
experiments need to be done there.
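
For reference, changing weights does not even need a rebuild: it can be
done with xl (something like "xl sched-credit2 -d <domain> -w <weight>")
or scripted against libxl. Below is a rough sketch of the libxl
variant, giving the DomU/mediator pair a bigger share than the
background domain. I wrote the function and field names from memory,
so please double-check them against libxl.h of the release you use;
the domids are hypothetical:

/*
 * Rough sketch only: bump the scheduling weight of the DomU <-> mediator
 * Stubdom pair so that, when everything is runnable at once, they win
 * over the background domain.
 */
#include <stdio.h>
#include <stdint.h>
#include <libxl.h>

static int set_weight(libxl_ctx *ctx, uint32_t domid, int weight)
{
    libxl_domain_sched_params p;
    int rc;

    libxl_domain_sched_params_init(&p);
    p.weight = weight;              /* relative share; default is 256 */
    rc = libxl_domain_sched_params_set(ctx, domid, &p);
    libxl_domain_sched_params_dispose(&p);
    return rc;
}

int main(void)
{
    libxl_ctx *ctx = NULL;
    xentoollog_logger_stdiostream *lg;

    lg = xtl_createlogger_stdiostream(stderr, XTL_ERROR, 0);
    if (!lg || libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0,
                               (xentoollog_logger *)lg))
        return 1;

    /* hypothetical domids; look them up with libxl_name_to_domid() */
    set_weight(ctx, 1 /* DomU      */, 512);
    set_weight(ctx, 2 /* DomOP-TEE */, 512);
    set_weight(ctx, 3 /* DomBack   */, 128);

    libxl_ctx_free(ctx);
    return 0;
}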

> But to try to envision which one would be the best combination of all
> these mechanisms, I need the information I've asked about above. :-)
Thank you.  Feel free to ask anything you need.

Also, I want to describe high-level setup:

Imagine that we are aiming for a next-gen car PC/entertainment
system. This system will have at least two displays:
 - instrument cluster (you know, speedometer, odometer and so on)
 - navigation/entertainment display
 - optional display on rear seat
Multiple input devices (touchscreens, hw keys, joysticks and knobs)
Multiple audio input/output devices, GPS, modem, wifi AP, etc...

There can be different setups - with a separate driver domain, with a
separate instrument cluster domain, etc. But let's stick to the
simplest one:
Dom0 - works with the HW and runs the app for the instrument cluster
display. This is the highest-priority domain.
DomU (Android, AGL, or another OS) - runs navigation, plays music,
tweets and so on. This is a lower-priority domain, but users expect
the GUI to work smoothly :)
DomBack - runs background tasks, collects statistics, communicates
with the cloud, etc.
DomOP-TEE - stubdom, acts as the OP-TEE mediator
DomGPU - stubdom, runs the GPU virtualization driver
DomVID - stubdom, runs the video decoder/encoder virtualization driver
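
And, to connect this with the scheduling discussion above: if cpupools
turn out to be too coarse, even plain hard pinning could already give
the instrument cluster domain its own pCPUs. A rough sketch (again,
libxl names from memory, so please verify them; ctx is obtained the
same way as in the weights sketch earlier; the pCPU ranges, domids and
vCPU counts are hypothetical):

#include <stdint.h>
#include <libxl.h>
#include <libxl_utils.h>

/* Hard-pin every vCPU of a domain to the pCPU range [first, last]. */
static int pin_domain(libxl_ctx *ctx, uint32_t domid,
                      unsigned int nr_vcpus, int first_pcpu, int last_pcpu)
{
    libxl_bitmap cpus;
    unsigned int v;
    int cpu, rc = 0;

    libxl_bitmap_init(&cpus);
    if (libxl_cpu_bitmap_alloc(ctx, &cpus, 0))
        return -1;
    for (cpu = first_pcpu; cpu <= last_pcpu; cpu++)
        libxl_bitmap_set(&cpus, cpu);

    for (v = 0; v < nr_vcpus; v++)
        rc |= libxl_set_vcpuaffinity(ctx, domid, v, &cpus, NULL);

    libxl_bitmap_dispose(&cpus);
    return rc;
}

/*
 * E.g., on 4 pCPUs: Dom0 (instrument cluster) alone on pCPUs 0-1,
 * while DomU, DomBack and the stubdoms share pCPUs 2-3:
 *
 *     pin_domain(ctx, 0, 2, 0, 1);
 *     pin_domain(ctx, domu_id, 2, 2, 3);
 *     ...
 */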


-- 
WBR Volodymyr Babchuk aka lorc [+380976646013]
mailto: vlad.babchuk@xxxxxxxxx
