
Re: [Xen-devel] ARM fixes for my improved privcmd patch.



New patch attached.

I haven't done the relevant spelling fixes etc., as I had a little mishap with git and need to fix up a few things. I thought you might want to have a look over the ARM side meanwhile, and if this is OK, I will post an "official" V5 patch to the list.

Further comments below.

--
Mats
On 19/12/12 10:59, Ian Campbell wrote:
On Tue, 2012-12-18 at 19:34 +0000, Mats Petersson wrote:

I think I'll do the minimal patch first, then, if I find some spare
time, work on the "batching" variant.
OK. The batching is IMHO just using the multicall variant.
The XENMEM_add_to_physmap_range is itself batched (it contains ranges of
mfns etc), so we don't just want to batch the actual hypercalls; we
really want to make sure each hypercall processes a batch, as this will
let us optimise the flushes in the hypervisor.

I don't know if the multicall infrastructure allows for that, but it
would be pretty trivial to do it explicitly.
Yes, I'm sure it is. I would prefer to do that AFTER my x86 patch has gone in, if that's possible. (Or that someone who actually understands the ARM architecture and can test it on actual ARM does it...)
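For illustration: since XENMEM_add_to_physmap_range already takes arrays of idxs/gpfns, the batched helper would presumably look roughly like this (untested sketch only; map_foreign_batch is a hypothetical name, and the accumulation of the two arrays in remap_pte_fn is omitted):

static int map_foreign_batch(xen_pfn_t *gpfns, xen_ulong_t *idxs,
                             int nr, unsigned int domid)
{
        struct xen_add_to_physmap_range xatp = {
                .domid = DOMID_SELF,
                .foreign_domid = domid,
                .size = nr,
                .space = XENMAPSPACE_gmfn_foreign,
        };

        set_xen_guest_handle(xatp.idxs, idxs);
        set_xen_guest_handle(xatp.gpfns, gpfns);

        /* One trap into the hypervisor covers the whole batch, which
         * is what lets the hypervisor coalesce its flushes. */
        return HYPERVISOR_memory_op(XENMEM_add_to_physmap_range, &xatp);
}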

I expect these patches will need to be folded into one to avoid a
bisectability hazard?
Yes, that is certainly my plan. I just made two patches for ease of "what is ARM and what isn't" - but the final submission should be a single patch.

xen-privcmd-remap-array-arm.patch:
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 7a32976..dc50a53 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -73,7 +73,7 @@ static int map_foreign_page(unsigned long lpfn, unsigned long fgmfn,
  }
struct remap_data {
-       xen_pfn_t fgmfn; /* foreign domain's gmfn */
+       xen_pfn_t *fgmfn; /* foreign domain's gmfn */
         pgprot_t prot;
         domid_t  domid;
         struct vm_area_struct *vma;
@@ -90,38 +90,120 @@ static int remap_pte_fn(pte_t *ptep, pgtable_t token, unsigned long addr,
         unsigned long pfn = page_to_pfn(page);
         pte_t pte = pfn_pte(pfn, info->prot);
-       if (map_foreign_page(pfn, info->fgmfn, info->domid))
+       // TODO: We should really batch these updates
+       if (map_foreign_page(pfn, *info->fgmfn, info->domid))
                 return -EFAULT;
         set_pte_at(info->vma->vm_mm, addr, ptep, pte);
+       info->fgmfn++;
Looks good.
         return 0;
  }
-int xen_remap_domain_mfn_range(struct vm_area_struct *vma,
-                              unsigned long addr,
-                              xen_pfn_t mfn, int nr,
-                              pgprot_t prot, unsigned domid,
-                              struct page **pages)
+/* do_remap_mfn() - helper function to map foreign pages
+ * @vma:     the vma for the pages to be mapped into
+ * @addr:    the address at which to map the pages
+ * @mfn:     pointer to array of MFNs to map
+ * @nr:      the number of entries in the MFN array
+ * @prot:    page protection mask
+ * @domid:   id of the domain that we are mapping from
+ * @pages:   page information (for PVH, not used currently)
+ *
+ * This function takes an array of mfns and maps nr pages from that into
+ * this kernel's memory. The owner of the pages is defined by domid. Where the
+ * pages are mapped is determined by addr, and vma is used for "accounting" of
+ * the pages.
+ *
+ * Return value is zero for success, negative for failure.
+ */
+static int do_remap_mfn(struct vm_area_struct *vma,
+                       unsigned long addr,
+                       xen_pfn_t *mfn, int nr,
+                       pgprot_t prot, unsigned domid,
+                       struct page **pages)
Since xen_remap_domain_mfn_range isn't implemented on ARM the only
caller is xen_remap_domain_mfn_array so you might as well just call this
function ..._array.
Yes, could do. As I stated in the commentary text, I tried to keep the code similar in structure to x86. [Actually, one iteration of my code had two API functions, one of which called the other, but it was considered a better solution to make one common function, and have the two x86 functions call that one].
  {
         int err;
         struct remap_data data;
-       /* TBD: Batching, current sole caller only does page at a time */
-       if (nr > 1)
-               return -EINVAL;
+       BUG_ON(!((vma->vm_flags & (VM_PFNMAP | VM_IO)) == (VM_PFNMAP | VM_IO)));
Where does this restriction come from?

I think it is true of X86 PV MMU (which has certain requirements about
how and when PTEs are changed) but I don't think ARM or PVH need it
because they use two level paging so the PTEs are just normal native
ones with no extra restrictions.

Maybe it is useful to enforce similarity between PV and PVH/ARM though?
I don't know if it's useful or not. I'm sure removing it would make little difference, but since the VM flags are set by the calling code, privcmd.c or the higher-level code would have to be correct for whatever architecture it is on. I'm not sure it is "helpful" to allow certain combinations on one architecture when they are not allowed on another. My choice would be to keep the restriction until there is a good reason to remove it, but if you feel it is beneficial to remove it, feel free to say so. [Perhaps the ARM code should have a comment to the effect of "not needed on PVH or ARM, kept for compatibility with PVOPS on x86", so that when PVOPS is "retired" in some years' time, it can be removed.]
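For reference, the flags the BUG_ON tests are set by privcmd itself at mmap time; from memory, drivers/xen/privcmd.c does roughly the following, so any vma reaching do_remap_mfn() through privcmd should already satisfy the check:

static int privcmd_mmap(struct file *file, struct vm_area_struct *vma)
{
        /* DONTCOPY is essential for Xen because copy_page_range
         * doesn't know how to recreate these mappings */
        vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTCOPY |
                         VM_DONTEXPAND | VM_DONTDUMP;
        vma->vm_ops = &privcmd_vm_ops;
        vma->vm_private_data = NULL;

        return 0;
}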


         data.fgmfn = mfn;
-       data.prot = prot;
+       data.prot  = prot;
         data.domid = domid;
-       data.vma = vma;
-       data.index = 0;
+       data.vma   = vma;


         data.pages = pages;
-       err = apply_to_page_range(vma->vm_mm, addr, nr << PAGE_SHIFT,
-                                 remap_pte_fn, &data);
-       return err;
+       data.index = 0;
+
+       while (nr) {
+               unsigned long range = 1 << PAGE_SHIFT;
+
+               err = apply_to_page_range(vma->vm_mm, addr, range,
+                                         remap_pte_fn, &data);
+               /* Warning: We do probably need to care about what error we
+                  get here. However, currently, the remap_area_mfn_pte_fn is
                                                        ^ this isn't the name of the fn
Fixed.
+                  only likely to return EFAULT or some other "things are very
+                  bad" error code, which the rest of the calling code won't
+                  be able to fix up. So we just exit with the error we got.
I expect it is more important to accumulate the individual errors from
remap_pte_fn into err_ptr.
Yes, but since that exits on error with EFAULT, the calling code won't "accept" the errors, and thus the whole house of cards falls apart at this point.

There should probably be a task to fix this up properly, hence the comment. But right now, any error besides ENOENT is "unacceptable" to the callers of this code. If you want me to add this to the comment, I'm happy to. But as long as remap_pte_fn returns EFAULT on error, nothing will work after an error.
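If/when someone does fix it up properly, I'd expect remap_pte_fn to end up shaped roughly like this (purely illustrative sketch: it assumes map_foreign_page is changed to return the hypervisor's per-page error code rather than just success/failure, and that an err_ptr cursor is added to struct remap_data):

static int remap_pte_fn(pte_t *ptep, pgtable_t token, unsigned long addr,
                        void *data)
{
        struct remap_data *info = data;
        struct page *page = info->pages[info->index++];
        unsigned long pfn = page_to_pfn(page);
        pte_t pte = pfn_pte(pfn, info->prot);
        int rc = map_foreign_page(pfn, *info->fgmfn++, info->domid);

        if (rc == 0)
                set_pte_at(info->vma->vm_mm, addr, ptep, pte);
        /* Record the per-page result so the caller can retry the
         * -ENOENT entries, rather than aborting the whole range on
         * the first failure. */
        *info->err_ptr++ = rc;

        return 0;       /* keep walking the rest of the range */
}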

+               */
+               if (err)
+                       return err;
+
+               nr--;
+               addr += range;
+       }

This while loop (and therefore this change) is unnecessary. The single
call to apply_to_page_range is sufficient, and as your TODO notes, any
batching should happen in remap_pte_fn (which can handle both
accumulation and flushing when the batch is large enough).
Ah, I see how you mean. I have updated the code accordingly.
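That is, the loop collapses back to the single call the old code had, with remap_pte_fn stepping through the MFN array itself:

        err = apply_to_page_range(vma->vm_mm, addr, nr << PAGE_SHIFT,
                                  remap_pte_fn, &data);
        return err;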

+       return 0;
+}
+
+/* xen_remap_domain_mfn_array() - Used to map an array of foreign pages
+ * @vma:     the vma for the pages to be mapped into
+ * @addr:    the address at which to map the pages
physical address, right?
Virtual, at least if we assume that in "st->va & PAGE_MASK" the 'va' actually means virtual address - it would be rather devious to keep using the name va for a physical address - although I have seen such things in the past.

I shall amend the comment to say "virtual address" to be clearer. [Not done in the attached patch; I only realized, when re-reading my comments, that I probably should...]

+ * @mfn:     pointer to array of MFNs to map
+ * @nr:      the number of entries in the MFN array
+ * @err_ptr: pointer to array of integers, one per MFN, for an error
+ *           value for each page. The err_ptr must not be NULL.
Nothing seems to be filling this in?
As discussed above (and below).

+ * @prot:    page protection mask
+ * @domid:   id of the domain that we are mapping from
+ * @pages:   page information (for PVH, not used currently)
No such thing as PVH on ARM. Also pages is used by this code.
Removed the part in parentheses.

+ *
+ * This function takes an array of mfns and maps nr pages from that into this
+ * kernel's memory. The owner of the pages is defined by domid. Where the pages
+ * are mapped is determined by addr, and vma is used for "accounting" of the
+ * pages. The err_ptr array is filled in on any page that is not sucessfully

                                                                   successfully
Thanks...

+ * mapped in.
+ *
+ * Return value is zero for success, negative ERRNO value for failure.
+ * Note that the error value -ENOENT is considered a "retry", so when this
+ * error code is seen, another call should be made with the list of pages that
+ * are marked as -ENOENT in the err_ptr array.
+ */
+int xen_remap_domain_mfn_array(struct vm_area_struct *vma,
+                              unsigned long addr,
+                              xen_pfn_t *mfn, int nr,
+                              int *err_ptr, pgprot_t prot,
+                              unsigned domid,
+                              struct page **pages)
+{
+       /* We BUG_ON because it's a programmer error to pass a NULL err_ptr,
+        * and it is otherwise quite hard, later on, to detect that the
+        * actual cause of a problem was "wrong memory was mapped in".
+        * Note: This variant doesn't actually use err_ptr at the moment.
True ;-)
Do you prefer the "not used" comment moved up a bit?

--
Mats

+        */
+       BUG_ON(err_ptr == NULL);
+       return do_remap_mfn(vma, addr, mfn, nr, prot, domid, pages);
+}
+EXPORT_SYMBOL_GPL(xen_remap_domain_mfn_array);
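As an aside, the retry contract described in the comment above means a caller is expected to do something along these lines (hypothetical sketch; the repacking of the still-pending entries between attempts is elided):

        int rc;

        do {
                rc = xen_remap_domain_mfn_array(vma, addr, mfns, nr,
                                                err_array, prot,
                                                domid, pages);
                /* On -ENOENT, repack mfns (and addr/nr) to cover only
                 * the entries still marked -ENOENT in err_array, then
                 * try those again. */
        } while (rc == -ENOENT);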
+
+/* Not used in ARM. Use xen_remap_domain_mfn_array(). */
+int xen_remap_domain_mfn_range(struct vm_area_struct *vma,
+                              unsigned long addr,
+                              xen_pfn_t mfn, int nr,
+                              pgprot_t prot, unsigned domid,
+                              struct page **pages)
+{
+       return -ENOSYS;
  }
  EXPORT_SYMBOL_GPL(xen_remap_domain_mfn_range);
+
  int xen_unmap_domain_mfn_range(struct vm_area_struct *vma,
                                int nr, struct page **pages)
  {


Attachment: 0001-Fixed-up-after-IanC-s-comments.patch
Description: Text Data
