
Re: [Minios-devel] [UNIKRAFT PATCHv4 21/43] plat/kvm: Add Arm64 basic entry code





On 18/07/18 08:25, Wei Chen wrote:
Hi Julien,

Hi Wei,

-----Original Message-----
But then why use that for QEMU? You need to compile your ELF assuming

Can I ask why QEMU supports ELF format images? If QEMU supports it, why
can't I use it? While I was implementing the Arm64 enablement, the ELF format
was the simplest way for me to verify my code. I just need to place my code at
the entry point, and QEMU will load it at the CPU reset entry.

Most likely because it was supported on x86 and was easy to add support
for Arm.

But as I said earlier, I am not against using ELF. However, there needs
to be some documentation telling you how to boot. At the moment, it is
close to zero. So can you write down the expectations?

I think I can write down the expectations in another improvement patch series,
but not this one. While I was writing this basic entry code, I didn't
think that far ahead. I just wanted Unikraft to be enabled on Arm64 ASAP, even
if this code contains some bugs. Let's open a separate thread and patch
series to improve it.

It will be hard for me to review boot code without knowing the expectations. To be honest, I think it will be very close to the Image boot process. Unless you provide one for ELF, I will base my review on the Image boot process.
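For reference, the Image boot process starts from a fixed 64-byte header described in the kernel's Documentation/arm64/booting.txt. A hedged sketch in assembler directives follows; the symbol names (_head, _end, primary_entry) are placeholders, and only the field layout and magic value come from that document:

    _head:
            b       primary_entry           /* code0: executable code, branch to the real entry point */
            .long   0                       /* code1: executable code (unused in this sketch) */
            .quad   0x80000                 /* text_offset: image load offset from a 2MB-aligned RAM base */
            .quad   _end - _head            /* image_size: effective size of the kernel image */
            .quad   0                       /* flags: endianness, page size, placement hints */
            .quad   0                       /* res2: reserved */
            .quad   0                       /* res3: reserved */
            .quad   0                       /* res4: reserved */
            .long   0x644d5241              /* magic: "ARM\x64" */
            .long   0                       /* res5: reserved (also used for the PE/COFF offset) */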

[...]

Thanks for your explanation. About the memory attributes: I remember that last
year I asked a question about what would happen when the guest and host have
different memory attributes in Linux-eng. I remember the answer was to follow
the more restrictive attributes. KVM maps the memory as cacheable, but the
guest disables the cache through system registers. So I think guest memory
is non-cacheable.
Can I understand the "cacheable alias" as "data existing in the cache for guest
memory after KVM maps it, but before the VM starts"?

On Arm64, Linux maps all the RAM in its address space. This RAM will be mapped with cacheable attributes. So now you have two aliases (aka mappings) to the same region: one non-cacheable, the other cacheable, which means the attributes are mismatched. While Linux should never access that region directly through the cacheable mapping, the processor is still able to speculatively fetch anything in that region.


However, when you write the page-tables, you will write with Device-nGnRnE
attributes (because the MMU is disabled), so the cache will be
bypassed. The cache may still contain stale data that you will hit when
enabling the MMU and cache.

To prevent such an issue, you need to clean the cache, potentially both before
and after updating the page-table area. I mention "before" as well because it
looks like the page-tables will not be part of the kernel image (the region is
not populated), and therefore the cache state is unknown.
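As an illustration (not part of the patch series; the label name is made up), such maintenance could look roughly like the following Arm64 assembly, in the same spirit as Linux's own D-cache maintenance helpers. It cleans & invalidates every line covering [x0, x1) to the PoC, and could be run over the page-table region both before and after writing it with the MMU off:

    /* Hypothetical helper: clean & invalidate [x0, x1) to the PoC by VA. Clobbers x2-x4. */
    dcache_clean_inval_range:
            mrs     x2, ctr_el0
            ubfx    x2, x2, #16, #4         /* DminLine: log2(words) of the smallest D-cache line */
            mov     x3, #4
            lsl     x3, x3, x2              /* line size in bytes */
            sub     x4, x3, #1
            bic     x0, x0, x4              /* align the start address down to a line boundary */
    1:      dc      civac, x0               /* clean & invalidate this line by VA to the PoC */
            add     x0, x0, x3
            cmp     x0, x1
            b.lo    1b
            dsb     sy                      /* complete the maintenance before any further accesses */
            ret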


Hmm, I understand your concern now. But I have a question: should we do
such operations on bare metal? I have written code for several bare-metal
platforms before. On all of them the MMU is off at reset, and I didn't
do the operations above. If so, can I understand that QEMU-KVM and bare metal
have different behaviors for the same code? In that case, how can we run
unmodified code on QEMU-KVM?

Per my understanding, you would still need to do such operations on bare metal. The Image boot protocol only tells you the kernel and DTB will be clean to the PoC. They will not be invalidated, so you may still have a cache line present for the data you modify.

It does not mean the bootloader will not clean & invalidate the full cache. But that's not mandated by the protocol.
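To make the failure point concrete, here is a hedged sketch of the moment such a leftover line would be hit. The label and the assumption that MAIR_EL1, TCR_EL1 and TTBR0_EL1 have already been programmed are mine, not from the patch. Once SCTLR_EL1.M and SCTLR_EL1.C are set, subsequent reads go through the cacheable mapping and can return the stale line instead of what was written while the MMU was off:

    enable_mmu_and_caches:
            dsb     sy                      /* page-table writes and cache maintenance must be complete */
            isb
            mrs     x0, sctlr_el1
            orr     x0, x0, #(1 << 0)       /* M: enable the stage-1 MMU */
            orr     x0, x0, #(1 << 2)       /* C: enable the data cache */
            orr     x0, x0, #(1 << 12)      /* I: enable the instruction cache */
            msr     sctlr_el1, x0
            isb                             /* from this point, accesses use the cacheable mapping */
            ret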



As I understand it, while QEMU is creating a CPU, if it disables the cache
in SCTLR, it should also clean the cache; if not, it would be a bug in QEMU.

I have no idea how QEMU works without KVM.


How does Xen handle such a case?

All the RAM has been cleaned & invalidated to the PoC for security reasons. But the only thing we can promise is what is written in the Image protocol, i.e. the kernel and DTB have been cleaned to the PoC.

Cheers,

--
Julien Grall


 

