Re: RFC: arm64: Handling reserved memory nodes
On 20/09/2023 11:03, Leo Yan wrote:

On Mon, Sep 18, 2023 at 08:26:21PM +0100, Julien Grall wrote:

[...]

... from my understanding reserved-memory are just normal memory that are set aside for a specific purpose. So Xen has to create a 'memory' node *and* a 'reserved-memory' region.

To be clear, Xen passes the 'reserved-memory' regions as normal memory nodes, see [1].

The memory nodes need to be explicitly written because they are excluded in handle_node(). If a node is not excluded, then it should be created in the dom0 Device-Tree. AFAICT, the 'reserved-memory' node is not excluded and therefore should be copied to the dom0 DT.

[...]

Here the problem is that these reserved memory regions are passed as normal memory nodes to the Dom0 kernel, and the Dom0 kernel then allocates pages from these reserved memory regions. Apparently, this might lead to a conflict, e.g. the reserved memory is used by the Dom0 kernel while at the same time the memory is used for another purpose (e.g. by an MCU in the system).

See above. I think this is correct to pass both 'memory' and 'reserved-memory'. Now, it is possible that Xen may not create the device-tree correctly.

Agreed that Xen currently creates the DT binding for the 'reserved-memory' node wrongly; more specifically, the reserved memory nodes are wrongly passed as normal memory nodes (again, see [1]).

See above. You could dump the dom0 Device-Tree to confirm that 'reserved-memory' is created.

I would suggest to look how Linux is populating the memory and whether it actually skipped the regions.

The Linux kernel reserves the corresponding pages for all reserved memory regions, which means the kernel page management (buddy algorithm) doesn't allocate these pages at all. With the 'no-map' property, the memory range will not be mapped into the kernel identical address space.

Here I am a bit confused by "Xen doesn't have the capability to know the memory attribute".
I looked into the file arch/arm/guest_walk.c. IIUC, it walks through the stage-1 page tables for the virtual machine and gets the permission for the mapping, so we can also get to know the mapping attribute, right?

Most of the time, Xen will use the HW to translate the guest virtual address to an intermediate physical address. Looking at the specification, it looks like PAR_EL1 will contain the memory attribute, which I didn't know. We would then need to read MAIR_EL1 to find the attribute and also the memory attribute in the stage-2 to figure out the final memory attribute.

This is feasible but the Xen ABI mandates that regions passed to Xen have specific memory attributes (see the comment at the top of xen/include/public/arch-arm.h).

If you refer to the comment "All memory which is shared with other entities in the system ... which is mapped as Normal Inner Write-Back Outer Write-Back Inner-Shareable", I don't think it's relevant to the current issue. I will explain in detail below.

It is relevant if you intend to allocate a hypercall buffer in a non-reusable reserved-region. Anyway, in your case, Linux is using a buffer on the stack. So the region must have been mapped with the proper attribute.

I think you may misunderstand the issue. I would like to divide the issue into two parts:

- The first question is about how to pass the reserved memory node from the Xen hypervisor to the Dom0 Linux kernel. Currently, the Xen hypervisor converts the reserved memory ranges and adds them into the normal memory node. The Xen hypervisor should keep the reserved memory node and pass it to the Dom0 Linux kernel. With this change, the Dom0 kernel will only allocate pages from the normal memory node, and the data in these pages can be shared by the Xen hypervisor and the Dom0 Linux kernel.

This should be the case. See above.

[...]

I am under the impression that we have a different meaning for 'using' here.
I am referring to the fact that when 'no-map' is specified, then the kernel cannot use the region for other purposes (e.g. stack).

Another question is about the attribute for MMIO regions. For mapping MMIO regions, prepare_dtb_hwdom() sets the attribute 'p2m_mmio_direct_c' for the stage 2, but in the Linux kernel the MMIO's attribute can be one of the below variants:

- ioremap(): device type with nGnRE;
- ioremap_np(): device type with nGnRnE (strong-ordered);
- ioremap_wc(): normal non-cacheable.

The stage-2 memory attribute is used to restrict the final memory attribute. In this case, p2m_mmio_direct_c allows the domain to set pretty much any memory attribute.

Thanks for confirmation. If so, I think the Xen hypervisor should follow the same attribute to map the reserved regions with attribute p2m_mmio_direct_c.

If the Xen hypervisor can handle these MMIO types in stage 2, then we should be able to use the same way to map stage 2 tables for the reserved memory. A difference for the reserved memory is that it can be mapped as normal cacheable memory.

I am a bit confused. I read this as you think the region is not mapped in the P2M (aka stage-2 page-tables for Arm). But from the logs you provided, the regions are already mapped (you have an MFN in hand).

You are right. The reserved memory regions have been mapped in P2M.

So to me the error is most likely in how we create the Device-Tree.

Yeah, let's firstly focus on the DT binding for reserved memory nodes.

The DT binding is something like (I tweaked a bit for readability):

Just to confirm this is the host device tree, right? If so...

Yes.

    memory@20000000 {
        #address-cells = <0x02>;
        #size-cells = <0x02>;
        device_type = "memory";
        reg = <0x00 0x20000000 0x00 0xa0000000>,
              <0x01 0xa0000000 0x01 0x60000000>;
    };

... you can see the reserved-regions are described in the normal memory.
In fact...

    reserved-memory {
        #address-cells = <0x02>;
        #size-cells = <0x02>;
        ranges;

        reserved_mem1 {
            reg = <0x00 0x20000000 0x00 0x00010000>;
            no-map;
        };

        reserved_mem2 {
            reg = <0x00 0x40000000 0x00 0x20000000>;
            no-map;
        };

        reserved_mem3 {
            reg = <0x01 0xa0000000 0x00 0x20000000>;
            no-map;
        };

... no-map should tell the kernel to not use the memory at all. So I am a bit puzzled why it is trying to use it.

No, 'no-map' doesn't mean the Linux kernel doesn't use it, I quote from the kernel documentation

So the fact that the stack seems to reside in a reserved-region implies that Linux didn't detect the 'no-map'.

Cheers,

--
Julien Grall
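
For readers following the device-tree snippets in this thread: the remark that the reserved regions are described in the normal memory can be checked by hand from the 'reg' cells. Below is a minimal sketch of that arithmetic, assuming two address cells and two size cells as in the quoted nodes. It is not Xen or Linux code; the helper names (cells_to_int, contained_in, check_reserved) are hypothetical and exist only for this illustration, and the values are copied from the snippets above.

```python
# Illustration only: not Xen or Linux code. Helper names are hypothetical.

def cells_to_int(hi, lo):
    """Combine two 32-bit device-tree cells into one 64-bit value."""
    return (hi << 32) | lo

# (base, size) pairs from the 'memory@20000000' node quoted in the thread.
MEMORY = [
    (cells_to_int(0x00, 0x20000000), cells_to_int(0x00, 0xa0000000)),
    (cells_to_int(0x01, 0xa0000000), cells_to_int(0x01, 0x60000000)),
]

# (base, size) pairs from the 'reserved-memory' children quoted in the thread.
RESERVED = {
    "reserved_mem1": (cells_to_int(0x00, 0x20000000), cells_to_int(0x00, 0x00010000)),
    "reserved_mem2": (cells_to_int(0x00, 0x40000000), cells_to_int(0x00, 0x20000000)),
    "reserved_mem3": (cells_to_int(0x01, 0xa0000000), cells_to_int(0x00, 0x20000000)),
}

def contained_in(base, size, banks):
    """Return True if [base, base + size) lies entirely inside one bank."""
    end = base + size
    return any(b <= base and end <= b + s for b, s in banks)

def check_reserved(reserved, banks):
    """Map each reserved region name to whether a memory bank covers it."""
    return {name: contained_in(base, size, banks)
            for name, (base, size) in reserved.items()}

if __name__ == "__main__":
    for name, covered in check_reserved(RESERVED, MEMORY).items():
        base, size = RESERVED[name]
        print(f"{name}: {base:#x}-{base + size:#x} covered by 'memory': {covered}")
```

With the values above, all three regions are reported as covered by the 'memory' node, which matches the observation in the thread. The same check could be run on a dump of the dom0 device tree to see whether the 'reserved-memory' node is present there as well.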