
Re: [Xen-devel] EL0 app, stubdoms on ARM conf call



On Thu, 2017-06-29 at 22:04 +0300, Volodymyr Babchuk wrote:
> Hello all,
> 
Hello,

> 1. OP-TEE use case: DRM playback (secure data path).
> 
> A user wants to play a DRM-protected media file. Rights holders don't
> want to give the user any means to get a DRM-free copy of that media
> file. If you have ever heard about Widevine on Android - that's it.
> Long story short, it is possible to decrypt, decode and display a
> video frame in such a way that the decrypted data is never accessible
> to userspace, the kernel or even the hypervisor. This is possible only
> when all data processing is done in secure mode, which leads us to
> OP-TEE (or another TEE).
> So, for each video frame the media player has to call OP-TEE with the
> encrypted frame data.
> 
> Good case: 24 FPS movie, optimized data path: the media player
> registers shared buffers in OP-TEE only once and then reuses them on
> every invocation. That would be one OP-TEE call per frame, or 24 calls
> per second.
> Worst case: high frame rate movie (60 FPS), data path not optimized.
> The media player registers a shared buffer in OP-TEE, then asks it to
> process the frame, then unregisters the buffer. 60 * 3 = 180 calls per
> second.
> 
> The call is done using the SMC instruction. Let's assume that the
> OP-TEE mediator lives in a Stubdom. Here is how the call sequence can
> look:
> 
> 1. DomU issues an SMC, which is trapped by the Hypervisor
> 2. The Hypervisor uses the standard approach with a ring buffer and
> event mechanism to call the Stubdom. It also blocks the DomU vCPU that
> caused this trap.
> 3a. The Stubdom mangles the request and asks the Hypervisor to issue
> the real SMC
> (3b. The Stubdom mangles the request and issues the SMC by itself -
> potentially insecure)
> 4. After the real SMC, the Hypervisor returns control back to the
> Stubdom
> 5. The Stubdom mangles the return value and returns the response to
> the Hypervisor in a ring buffer
> 6. The Hypervisor unblocks the DomU vCPU and schedules it.
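> 
> Just to make the blocking/wakeup pattern above concrete, here is a
> toy, user-space sketch of it: two POSIX threads stand in for the DomU
> and Stubdom vCPUs, and a mutex + condvar pair stands in for the ring
> buffer and event channel. None of this is real Xen code, all names
> are made up:
> 
>     /* Toy model of the 6-step mediated SMC sequence above. */
>     #include <pthread.h>
>     #include <stdbool.h>
>     #include <stdio.h>
> 
>     static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
>     static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
>     static bool request_pending, reply_ready;
> 
>     static void *stubdom_vcpu(void *arg)
>     {
>         (void)arg;
>         pthread_mutex_lock(&lock);
>         while (!request_pending)          /* wait for step 2 (event)   */
>             pthread_cond_wait(&cond, &lock);
>         printf("Stubdom: mangle request, do real SMC (steps 3-5)\n");
>         reply_ready = true;               /* step 5: reply in the ring */
>         pthread_cond_signal(&cond);
>         pthread_mutex_unlock(&lock);
>         return NULL;
>     }
> 
>     int main(void)
>     {
>         pthread_t stub;
>         pthread_create(&stub, NULL, stubdom_vcpu, NULL);
> 
>         pthread_mutex_lock(&lock);
>         printf("DomU: issue SMC, trapped by hypervisor (step 1)\n");
>         request_pending = true;           /* step 2: ring + event      */
>         pthread_cond_signal(&cond);
>         while (!reply_ready)              /* DomU vCPU is blocked      */
>             pthread_cond_wait(&cond, &lock);
>         printf("DomU: resumed with mediated result (step 6)\n");
>         pthread_mutex_unlock(&lock);
> 
>         pthread_join(stub, NULL);
>         return 0;
>     }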
> 
> As you can see, there are 6 context switches
> (DomU->HYP->Stubdom->HYP->Stubdom->HYP->DomU). There are 2 vCPU
> switches (DomU->Stubdom->DomU). Both vCPU switches are governed by the
> scheduler.
> When I say "governed by the scheduler" I mean that there is no
> guarantee that the needed domain will be scheduled right away.
> This is the sequence for one call. As you remember, there can be up to
> 180 such calls per second in this use case. That gives us 180 * 6 ~=
> 1000 context switches per second.
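> 
> A quick sketch of that arithmetic, just to spell it out (the figures
> are the ones from the good/worst cases above):
> 
>     /* Calls per second and context switches per second. */
>     #include <stdio.h>
> 
>     int main(void)
>     {
>         int good  = 24 * 1;  /* 24 FPS, buffers registered only once    */
>         int worst = 60 * 3;  /* 60 FPS, register + process + unregister */
>         int switches_per_call = 6;
> 
>         printf("good case : %d calls/s -> %d switches/s\n",
>                good, good * switches_per_call);
>         printf("worst case: %d calls/s -> %d switches/s\n",
>                worst, worst * switches_per_call);
>         return 0;
>     }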
> 
Ok. This is quite a detailed, well done, and useful description of the
specific characteristics of your workflow.

If possible, though, I'd like to know even more. Specifically, on a
somewhat typical system:
- how many pCPUs will you have?
- how many vCPUs will Dom0 have?
- what would Dom0 be doing (as in, what components of your overall
platform would be running there), and how busy, at least roughly, do
you expect it would be?
- how many vCPUs will DomU have?
- how many vCPUs will Stubdom have? (I'm guessing one, as there's only
one OP-TEE; does that make sense?)
- how many other domains will there be? How many vCPUs will each one of
them have?

I understand it's a lot of questions, but it's quite important to have
this info, IMO. It doesn't have to be super-precise or totally match
the setup of your final product; it "just" has to be a representative
enough example.

I'll try to explain why I think it would be useful to know all these
things. So, for instance, in the scenario you describe above, if you:
- have only 1 pCPU
- Dom0 has 1 vCPU, and it runs the standard backends. Which means,
except when DomU is doing either disk or network I/O, it's mostly idle
- DomU has 1 vCPU
- Stubdom has 1 vCPU
- there's no other domain

What I think will happen most of the time will be something like this:

[1]  DomU runs
     .
     .
[2]  DomU calls SMC
     Xen blocks DomU
     Xen wakes Stubdom
[3]  Stubdom runs, does SMC
     .
     SMC done, Stubdom blocks
     Xen wakes DomU
[4]  DomU runs
     .
     .

At [1], Dom0 and Stubdom are idle, and DomU is the only running domain
(or, to be precise, vCPU), and so it runs. When, at [2], it calls SMC,
it also blocks. Therefore, at [3], it's Stubdom that is the only
runnable domain, and in fact, the scheduler lets it run. Finally, at
[4], since Stubdom has blocked again, while DomU has been woken up, the
only thing the scheduler can do is to run it (DomU).

So, as you say, even with just 1 pCPU available, if the scenario is
like the one I described above, there would be no need for any fancy or
advanced improvement in the scheduler. Actually, the scheduler does
very little... It always chooses to run the only vCPU that is runnable.

On the other hand, if, with still only one pCPU, there are more domains
(and hence more vCPUs) around doing other things, and/or Dom0 runs
some other workload in addition to the backends for DomU, then indeed
things may get more complicated. For example, at [4], the scheduler may
choose a different vCPU than DomU's, and this would probably
be a problem.

What I was saying during the call is that we have a lot of tweaks and
mechanisms already in place to deal with situations like these.

E.g., if you have a decent number of pCPUs, we can use cpupools to
isolate, say, stubdomains from regular DomUs, or to isolate DomU-
Stubdom pairs. Also, something similar, with a smaller degree of
isolation but higher flexibility, may be achieved with pinning. And,
finally, we can differentiate the domains from each other, within the
same pool or pinning mask, by using weights (and, with Credit2,
starting from 4.10, hopefully, with caps & reservations :-D).
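
Just to give a rough idea of what I mean (a sketch only; the pool and
domain names are made up, and the actual layout would of course depend
on the answers to the questions above), isolating a DomU-Stubdom pair
on its own pCPUs could look something like this:

    # cpupool-optee.cfg -- a Credit2 pool owning pCPUs 2 and 3
    name  = "optee-pool"
    sched = "credit2"
    cpus  = "2,3"

    # xl cpupool-create cpupool-optee.cfg
    # xl cpupool-migrate DomU optee-pool
    # xl cpupool-migrate Stubdom optee-pool

    # Alternatively, with plain pinning instead of a separate pool:
    # xl vcpu-pin DomU    all 2
    # xl vcpu-pin Stubdom all 3

    # And/or weights, to favour the DomU/Stubdom pair over other
    # domains sharing the same pCPUs (Credit scheduler shown here):
    # xl sched-credit -d DomU    -w 512
    # xl sched-credit -d Stubdom -w 512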

But to try to envision which would be the best combination of all
these mechanisms, I need the information I've asked for above. :-)

Thanks and Regards,
Dario

PS. It's a bit late here now... So I'll read the other scenario --the
one about copro-- tomorrow. But I can anticipate that I'm going to ask
for the same kind of information :-)
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

