
Re: [Xen-devel] Bug: privcmd failure in mmaping 160+ pages



Hi Andres,

Thanks for your reply. However, I ran into some problems when applying
your diff, and also when compiling the resulting privcmd.c (after
patching it manually).

Which kernel version are you using? Could you provide the complete privcmd.c file?

Thanks,
Guanglin

2013/7/31 Andres Lagar-Cavilla <andreslc@xxxxxxxxxxxxxx>:
> Hi Konrad, David and everybody else,
>
> I have found a privcmd bug: no more than 160 pages can be mmapped via
> IOCTL_PRIVCMD_MMAPBATCH and IOCTL_PRIVCMD_MMAPBATCH_V2.
>
> I have started a thread (see below) at the xen-user list, and then
> discussed with Mr. Ian Campbell, who finally recommended talking to
> the xen-devel list and CC'ing you.
>
> I appreciate your comments.
>
>
> You know, this might have been me. Or in the process of being fixed by a
> patch I just posted.
>
> As you may know by now, xc_map_foreign* first sets up a region of virtual
> memory with an mmap call, and then issues the mapping ioctl to populate the
> region.
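
For reference, here is a minimal userspace sketch of that two-step flow. It assumes the privcmd_mmapbatch_v2_t layout and header paths from the Xen/Linux privcmd headers; the helper name and the 4 KiB page size are illustrative only, not part of libxc.

#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <xenctrl.h>            /* xen_pfn_t, domid_t (via the Xen public headers) */
#include <xen/sys/privcmd.h>    /* privcmd_mmapbatch_v2_t, IOCTL_PRIVCMD_MMAPBATCH_V2 */

#define PAGE_SIZE_4K 4096       /* assumes 4 KiB pages, as on x86 */

/* Two-step flow: mmap() a region on the privcmd fd, then ask the batch
 * ioctl to populate it with the guest's frames.  fd is an open handle on
 * /dev/xen/privcmd (or /proc/xen/privcmd on older systems). */
static void *map_foreign_sketch(int fd, domid_t dom,
                                const xen_pfn_t *gfns, unsigned int num,
                                int *err /* one slot per page */)
{
    size_t len = (size_t)num * PAGE_SIZE_4K;

    /* Step 1: reserve virtual address space backed by the privcmd device. */
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED)
        return NULL;

    /* Step 2: populate that region with mappings of the foreign frames. */
    privcmd_mmapbatch_v2_t ioctlx = {
        .num  = num,
        .dom  = dom,
        .addr = (uint64_t)(uintptr_t)addr,
        .arr  = gfns,
        .err  = err,            /* filled with 0 or -errno per page */
    };
    if (ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2, &ioctlx) < 0) {
        munmap(addr, len);
        return NULL;
    }
    return addr;
}
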
>
> Let's say for the sake of argument that the mapping operation is broken up
> into pieces. Current checks in the implementation of PRIVCMD_MMAPBATCH*
> prevent you from populating the memory region piece-meal.
>
> So: can you try the patch I pasted below?
>
> If that doesn't fix it, then you'll need to give us the output of those
> printks.
>
> Thanks
> Andres
>
> [PATCH] Xen: Fix retry calls into PRIVCMD_MMAPBATCH*.
>
> When a foreign mapper attempts to map guest frames that are paged out,
> the mapper receives an ENOENT response and will have to try again
> while a helper process pages the target frame back in.
>
> Gating checks on PRIVCMD_MMAPBATCH* ioctl args were preventing retries
> of mapping calls.
>
> Signed-off-by: Andres Lagar-Cavilla <andres@xxxxxxxxxxxxxxxx>
> ---
> drivers/xen/privcmd.c |   32 +++++++++++++++++++++++++++-----
> 1 files changed, 27 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
> index f8e5dd7..44a26c6 100644
> --- a/drivers/xen/privcmd.c
> +++ b/drivers/xen/privcmd.c
> @@ -43,9 +43,12 @@ MODULE_LICENSE("GPL");
>
> #define PRIV_VMA_LOCKED ((void *)1)
>
> -#ifndef HAVE_ARCH_PRIVCMD_MMAP
> static int privcmd_enforce_singleshot_mapping(struct vm_area_struct *vma);
> -#endif
> +
> +static int privcmd_enforce_singleshot_mapping_granular(
> +               struct vm_area_struct *vma,
> +               unsigned long addr,
> +               unsigned long nr_pages);
>
> static long privcmd_ioctl_hypercall(void __user *udata)
> {
> @@ -422,9 +425,9 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version)
> vma = find_vma(mm, m.addr);
> if (!vma ||
>     vma->vm_ops != &privcmd_vm_ops ||
> -     (m.addr != vma->vm_start) ||
> -     ((m.addr + (nr_pages << PAGE_SHIFT)) != vma->vm_end) ||
> -     !privcmd_enforce_singleshot_mapping(vma)) {
> +     (m.addr < vma->vm_start) ||
> +     ((m.addr + (nr_pages << PAGE_SHIFT)) > vma->vm_end) ||
> +     !privcmd_enforce_singleshot_mapping_granular(vma, m.addr, nr_pages)) {
> up_write(&mm->mmap_sem);
> ret = -EINVAL;
> goto out;
> @@ -540,11 +543,30 @@ static int privcmd_mmap(struct file *file, struct vm_area_struct *vma)
> return 0;
> }
>
> +/* For MMAP */
> static int privcmd_enforce_singleshot_mapping(struct vm_area_struct *vma)
> {
> return !cmpxchg(&vma->vm_private_data, NULL, PRIV_VMA_LOCKED);
> }
>
> +/* For MMAPBATCH*. This allows asserting the singleshot mapping
> + * on a per pfn/pte basis. Mapping calls that fail with ENOENT
> + * can be then retried until success. */
> +static int enforce_singleshot_mapping_fn(pte_t *pte, struct page *pmd_page,
> +                 unsigned long addr, void *data)
> +{
> + return pte_none(*pte) ? 0 : -EBUSY;
> +}
> +
> +static int privcmd_enforce_singleshot_mapping_granular(
> +            struct vm_area_struct *vma,
> +            unsigned long addr,
> +            unsigned long nr_pages)
> +{
> + return apply_to_page_range(vma->vm_mm, addr, nr_pages << PAGE_SHIFT,
> +               enforce_singleshot_mapping_fn, NULL) == 0;
> +}
> +
> const struct file_operations xen_privcmd_fops = {
> .owner = THIS_MODULE,
> .unlocked_ioctl = privcmd_ioctl,
> --
> 1.7.1
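
To illustrate what the patch is meant to permit, here is a rough sketch of a piece-meal retry from the mapper's side: only the pages whose err[] slot came back -ENOENT are re-submitted against the same pre-mmap()ed region. The helper name, the 4 KiB page size and the sleep interval are assumptions, not part of the patch.

#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <xenctrl.h>
#include <xen/sys/privcmd.h>

/* Re-submit, one page at a time, every slot that reported -ENOENT
 * (frame paged out).  Already-populated pages are left untouched, which
 * is exactly what the granular singleshot check above allows. */
static int retry_paged_out(int fd, domid_t dom, uint64_t base_addr,
                           const xen_pfn_t *gfns, int *err, unsigned int num)
{
    for (unsigned int i = 0; i < num; i++) {
        while (err[i] == -ENOENT) {
            privcmd_mmapbatch_v2_t one = {
                .num  = 1,
                .dom  = dom,
                .addr = base_addr + (uint64_t)i * 4096,   /* assumes 4 KiB pages */
                .arr  = &gfns[i],
                .err  = &err[i],
            };
            usleep(100);   /* give the paging helper a chance to bring the frame in */
            if (ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2, &one) < 0 && errno != ENOENT)
                return -1; /* hard failure, not just "still paged out" */
        }
    }
    return 0;
}
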
>
>
> Thanks,
>
> Guanglin
>
>
> ---------- Forwarded message ----------
> From: Ian Campbell <Ian.Campbell@xxxxxxxxx>
> Date: 2013/7/30
> Subject: Re: [Xen-users] BUG : privcmd mmap 160+ pages
> To: Guanglin Xu <mzguanglin@xxxxxxxxx>
> Cc: "xen-users@xxxxxxxxxxxxx" <xen-users@xxxxxxxxxxxxx>
>
>
> On Tue, 2013-07-30 at 07:55 -0400, Guanglin Xu wrote:
>
> 2013/7/30 Ian Campbell <Ian.Campbell@xxxxxxxxx>:
>
> On Mon, 2013-07-29 at 16:16 -0400, Guanglin Xu wrote:
>
> Hi all,
>
> I just find a xc_map_foreign_range() problem in Xen.
>
> xc_map_foreign_range(), a libxc API backed by privcmd (a Xen kernel
> module), can be used to mmap guest VM memory into dom0. However, if we
> mmap more than 160 pages, the mapping fails.
>
> Inside xc_map_foreign_range(), ioctl is used to communicate with
> privcmd. There are two ioctl commands: IOCTL_PRIVCMD_MMAPBATCH
> (legacy) and IOCTL_PRIVCMD_MMAPBATCH_V2 (newer). Both of them
> consistently return 0 (success), but the mappings fail once 160 pages
> have been mapped.
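
A note that may help with the debugging described below: a zero return from the ioctl does not by itself mean every page was mapped; the two variants report per-page failures differently. A rough sketch of how a caller might inspect them, assuming the PRIVCMD_MMAPBATCH_MFN_ERROR / PRIVCMD_MMAPBATCH_PAGED_ERROR bits from the Linux privcmd uapi header; the exact encoding can differ between kernel versions.

#include <xenctrl.h>
#include <xen/sys/privcmd.h>   /* PRIVCMD_MMAPBATCH_* bits, if your header provides them */

/* V1 (IOCTL_PRIVCMD_MMAPBATCH): the kernel rewrites arr[i] in place on
 * failure, setting bits in the top nibble and clobbering the original
 * frame number; PRIVCMD_MMAPBATCH_PAGED_ERROR marks "paged out, retry". */
static int count_v1_failures(const xen_pfn_t *arr, unsigned int num)
{
    int failed = 0;
    for (unsigned int i = 0; i < num; i++)
        if (arr[i] & PRIVCMD_MMAPBATCH_MFN_ERROR)
            failed++;
    return failed;
}

/* V2 (IOCTL_PRIVCMD_MMAPBATCH_V2): arr[] is left alone and a separate
 * err[] array carries 0 or a negative errno for each page. */
static int count_v2_failures(const int *err, unsigned int num)
{
    int failed = 0;
    for (unsigned int i = 0; i < num; i++)
        if (err[i] != 0)
            failed++;
    return failed;
}
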
>
> Firstly, when my Linux kernel version was 3.5, IOCTL_PRIVCMD_MMAPBATCH
> was the only available ioctl command: rc = ioctl(fd,
> IOCTL_PRIVCMD_MMAPBATCH, &ioctlx). After mapping 160 pages, subsequent
> ioctl calls would set the ioctlx.arr[] items to 140737344202616,
> overwriting the original pfn numbers.
>
> I then updated my Linux kernel to 3.8 so as to test
> IOCTL_PRIVCMD_MMAPBATCH_V2: rc = ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2,
> &ioctlx). This time, the ioctl calls made after the first 160 pages
> would set the ioctlx.err[] items to EINVAL.
>
> Although I have inserted printk() calls in privcmd.c to track its
> execution flow, the result showed what looks like a complete path,
> which is quite weird. I have no idea what is happening inside privcmd.
>
> Can anyone figure out this problem?
>
>
> I wouldn't be all that surprised if there was a hardcoded batch size
> limit somewhere in either libxc or the privcmd driver.
>
>
> Hi Ian,
>
> Thank you very much for your reply.
>
> I can confirm that the problem is in privcmd, because I have debugged
> libxc and narrowed the problem down to ioctl(), where privcmd sets
> ioctlx.arr[] (IOCTL_PRIVCMD_MMAPBATCH) or ioctlx.err[]
> (IOCTL_PRIVCMD_MMAPBATCH_V2) to indicate the limit.
>
>
> If you map your memory in batches of e.g. 128 pages, does it all work OK?
>
>
> Yes. For example, [0-127] succeeds. However, the subsequent [128-255]
> batch fails, because the size of the whole region has by then exceeded
> 160 pages.
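
For completeness, one way to run that experiment is a chunked loop over libxc's xc_map_foreign_pages(), so each chunk gets its own mapping region; the chunk size and helper name here are illustrative only.

#include <stdio.h>
#include <sys/mman.h>
#include <xenctrl.h>

#define CHUNK 128   /* pages per call, per Ian's suggestion */

static int map_in_chunks(xc_interface *xch, uint32_t dom,
                         const xen_pfn_t *gfns, int total)
{
    for (int off = 0; off < total; off += CHUNK) {
        int n = (total - off < CHUNK) ? (total - off) : CHUNK;
        void *p = xc_map_foreign_pages(xch, dom, PROT_READ, gfns + off, n);
        if (!p) {
            fprintf(stderr, "chunk starting at page %d (%d pages) failed\n", off, n);
            return -1;
        }
        /* ... read the chunk here ... */
        munmap(p, (size_t)n * XC_PAGE_SIZE);
    }
    return 0;
}
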
>
>
> That's quite interesting.
>
>
> If you want to get to the bottom of the 160 page limit you'll probably
> have to trace through the code looking for hardcoded sizes or limits on
> sizes (e.g. of arrays) or perhaps integer indexes etc which are too
> small and are overflowing.
>
> 160 * the size of the various structs involved doesn't look to be all
> that interesting (i.e. just over a page size boundary or something), but
> that's the sort of direction I would start by looking in.
>
> If you can't spot it by eye then you'll likely have to instrument the
> code paths with prints to try and track the progress of the initially
> supplied buffer through to the hypercall etc.
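
One concrete way to do that instrumentation is a hypothetical debug helper dropped into drivers/xen/privcmd.c and called just before the gating checks in privcmd_ioctl_mmap_batch() (the function shown in the patch above); its name, placement and format string are assumptions, not part of any existing patch.

/* Hypothetical debug helper for drivers/xen/privcmd.c; call it right
 * before the vma/range checks in privcmd_ioctl_mmap_batch() to see
 * which check rejects the post-160-page call. */
static void privcmd_debug_mmap_batch(int version, u64 addr,
				     unsigned long nr_pages,
				     struct vm_area_struct *vma)
{
	pr_info("privcmd: mmapbatch v%d addr=%#llx nr_pages=%lu vma=[%#lx,%#lx) priv=%p\n",
		version, (unsigned long long)addr, nr_pages,
		vma ? vma->vm_start : 0UL, vma ? vma->vm_end : 0UL,
		vma ? vma->vm_private_data : NULL);
}
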
>
>
> Yes. I have been debugging libxc and instrumenting privcmd, but I
> couldn't find any hardcoded limit or other limiting code.
>
> The code in libxc is easier to trace, but privcmd is hard for me, as I
> lack kernel dev experience. I couldn't even find where privcmd uses
> copy_to_user() or put_user() to write back the ioctlx.err[] or
> ioctlx.arr[] items, even though the execution path through
> privcmd_ioctl() looks complete according to my printk() output.
>
> Do you have any idea how privcmd might set ioctlx.err[] some other way?
>
>
>
> I'm not familiar with this code. It sounds like this is most probably
> a kernel bug; I'd recommend taking it to the xen-devel list, perhaps
> CC'ing Konrad Wilk and David Vrabel.
>
> Ian.
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

