Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen
On 02/03/16 10:47, Konrad Rzeszutek Wilk wrote:
> > > > > Open: It seems no system call/ioctl is provided by the Linux kernel to
> > > > >       get the physical address from a virtual address.
> > > > >       /proc/<qemu_pid>/pagemap provides information on the mapping from
> > > > >       VA to PA. Is it an acceptable solution to let QEMU parse this
> > > > >       file to get the physical address?
> > > >
> > > > Does it work in a non-root scenario?
> > > >
> > >
> > > Seemingly no, according to Documentation/vm/pagemap.txt in the Linux kernel:
> > > | Since Linux 4.0 only users with the CAP_SYS_ADMIN capability can get PFNs.
> > > | In 4.0 and 4.1 opens by unprivileged fail with -EPERM. Starting from
> > > | 4.2 the PFN field is zeroed if the user does not have CAP_SYS_ADMIN.
> > > | Reason: information about PFNs helps in exploiting Rowhammer vulnerability.
> >
> > Ah right.
> >
> > > A possible alternative is to add a new hypercall similar to
> > > XEN_DOMCTL_memory_mapping but receiving a virtual address as the address
> > > parameter and translating it to a machine address in the hypervisor.
> >
> > That might work.
>
> That won't work.
>
> This is a userspace VMA - which means that once the ioctl is done we swap
> to kernel virtual addresses. Now we may know that the prior cr3 has the
> userspace virtual address and walk it down - but what if the domain
> that is doing this is PVH (or HVM)? The cr3 of userspace is tucked somewhere
> inside the kernel.
>
> Which means this hypercall would need to know the Linux kernel task structure
> to find this.
>

Thanks for pointing this out. It's really not a workable solution.

> May I propose another solution - a stacking driver (similar to loop). You
> set it up (ioctl /dev/pmem0/guest.img, get some /dev/mapper/guest.img
> created).
> Then mmap the /dev/mapper/guest.img - all of the operations are the same -
> except it may have an extra ioctl - get_pfns - which would provide the data
> in a similar form to pagemap.txt.
>

I'll have a look at this, thanks! (A hypothetical sketch of what such a
get_pfns ioctl could look like is appended below.)

> But folks will then ask - why don't you just use pagemap? Could the pagemap
> have an extra security capability check? One that can be set for QEMU?
>

Basically because of the concern about whether non-root QEMU could work, as
raised in Stefano's comments. (A minimal sketch of the pagemap lookup and its
CAP_SYS_ADMIN limitation is appended below.)

> > > > > Open: For a large pmem, mmap(2) may well not map all the SPA
> > > > >       occupied by pmem at the beginning, i.e. QEMU may not be able to
> > > > >       get all SPA of pmem from buf (in virtual address space) when
> > > > >       calling XEN_DOMCTL_memory_mapping.
> > > > >       Can the mmap flag MAP_LOCKED or mlock(2) be used to enforce the
> > > > >       entire pmem being mmapped?
> > > >
> > > > Ditto
> > > >
> > >
> > > No. If I take the above alternative for the first open, maybe the new
> > > hypercall above can inject page faults into dom0 for the unmapped
> > > virtual addresses so as to force dom0 Linux to create the page
> > > mapping.
>
> Ugh. That sounds hacky. And you wouldn't necessarily be safe.
> Imagine that the system admin decides to defrag the /dev/pmem filesystem,
> or move the files (disk images) around. If they do that, we may
> still have the guest mapped to system addresses which may now contain
> filesystem metadata, or a different guest image. We MUST mlock or lock the
> file for the duration of the guest.
>

So mlocking or locking the mmapped file, or some other way to 'pin' the
mmapped file on pmem, is a necessity. (See the MAP_LOCKED/mlock sketch
appended below.)

Thanks,
Haozhong
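
For reference, a minimal sketch of the pagemap lookup discussed above, i.e.
roughly what QEMU would have to do to translate one of its virtual addresses
into a PFN. It is an illustration only, not part of the proposal, and as the
quoted pagemap.txt text says, on Linux >= 4.2 the PFN field simply reads back
as zero for a process without CAP_SYS_ADMIN - exactly the non-root QEMU
problem.

/* Sketch only: translate a user virtual address to a PFN via
 * /proc/self/pagemap.  Each pagemap entry is 64 bits; bits 0-54 hold the
 * PFN and bit 63 is the "page present" flag (Documentation/vm/pagemap.txt).
 * Without CAP_SYS_ADMIN the PFN field is zeroed (Linux >= 4.2), so this
 * does not work for a non-root QEMU. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

static uint64_t va_to_pfn(const void *va)
{
    long page_size = sysconf(_SC_PAGESIZE);
    off_t offset = ((uintptr_t)va / page_size) * sizeof(uint64_t);
    uint64_t entry = 0;

    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0) {
        perror("open /proc/self/pagemap");   /* -EPERM on 4.0/4.1 without CAP_SYS_ADMIN */
        return 0;
    }
    if (pread(fd, &entry, sizeof(entry), offset) != (ssize_t)sizeof(entry)) {
        perror("pread");
        close(fd);
        return 0;
    }
    close(fd);

    if (!(entry & (1ULL << 63)))             /* page not present (not faulted in yet) */
        return 0;
    return entry & ((1ULL << 55) - 1);       /* PFN; zero if caller lacks CAP_SYS_ADMIN */
}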
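
Also, a sketch of the shape the proposed stacking driver's get_pfns ioctl
might take. Everything here (the structure, the ioctl name and number, the
device path) is hypothetical - nothing like it exists in Linux today; it only
illustrates returning "data in a similar form to pagemap.txt" for a range of
the mapped image.

/* HYPOTHETICAL interface - nothing below exists in the kernel; it only
 * sketches the proposed get_pfns ioctl on a stacked /dev/mapper/guest.img
 * device, returning pagemap-style 64-bit entries for a range of the image. */
#include <stdint.h>
#include <sys/ioctl.h>

struct pmem_get_pfns {
    uint64_t offset;    /* in: byte offset into the image */
    uint64_t npages;    /* in: number of pages to translate */
    uint64_t pfns[];    /* out: one pagemap-style entry per page */
};

#define PMEM_GET_PFNS _IOWR('P', 0x01, struct pmem_get_pfns)   /* hypothetical */

/* Hypothetical usage:
 *   int fd = open("/dev/mapper/guest.img", O_RDWR);
 *   struct pmem_get_pfns *req = malloc(sizeof(*req) + npages * sizeof(uint64_t));
 *   req->offset = 0; req->npages = npages;
 *   ioctl(fd, PMEM_GET_PFNS, req);
 */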
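
Finally, a minimal sketch of the MAP_LOCKED/mlock(2) pinning mentioned for
the second open question. The image path is just an example, and this only
covers the memory-locking side (keeping the whole mapping populated and
resident); preventing the filesystem from relocating the file's blocks on the
pmem device (the defrag concern above) is a separate problem.

/* Sketch only: map a pmem-backed guest image and lock the whole mapping up
 * front, so every page is populated before the XEN_DOMCTL_memory_mapping-style
 * setup runs.  The path is just an example. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/mnt/pmem0/guest.img";   /* example image on a pmem/DAX fs */
    int fd = open(path, O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    void *buf = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_LOCKED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    /* MAP_LOCKED does not guarantee that locking succeeded; mlock(2) does
     * report failure, so calling it explicitly gives a hard error path. */
    if (mlock(buf, st.st_size) < 0) { perror("mlock"); return 1; }

    /* ... the VA range could now be handed to the toolstack/hypercall ... */

    munlock(buf, st.st_size);
    munmap(buf, st.st_size);
    close(fd);
    return 0;
}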