
Re: [Xen-devel] [PATCH] xen/setup: Remap Xen Identity Mapped RAM

On 07/08/14 08:57, Konrad Rzeszutek Wilk wrote:
Responding :-)

@@ -797,10 +794,9 @@ unsigned long __init set_phys_range_identity(unsigned long 
                if (!__set_phys_to_machine(pfn, IDENTITY_FRAME(pfn)))
-       if (!WARN((pfn - pfn_s) != (pfn_e - pfn_s),
+       WARN((pfn - pfn_s) != (pfn_e - pfn_s),
                "Identity mapping failed. We are %ld short of 1-1 mappings!\n",
-               (pfn_e - pfn_s) - (pfn - pfn_s)))
-               printk(KERN_DEBUG "1-1 mapping on %lx->%lx\n", pfn_s, pfn);
+               (pfn_e - pfn_s) - (pfn - pfn_s));
I would leave that as is. If you really want to modify it - spin another
patch for it.
The problem is that because we now call set_phys_range_identity() on small
blocks, the number of these messages printed is large. I can split it out
into a separate patch if you'd like.
Please do. Also you would have to rebase all this code on the latest Linus's
tree as there are some changes there.

I posted a v2 which split this patch out, addressed feedback and rebased on Linus's tree.

        return pfn - pfn_s;
diff --git a/arch/x86/xen/p2m.h b/arch/x86/xen/p2m.h
new file mode 100644
index 0000000..9ce4d51
--- /dev/null
+++ b/arch/x86/xen/p2m.h
@@ -0,0 +1,15 @@
+#ifndef _XEN_P2M_H
+#define _XEN_P2M_H
+#define P2M_PER_PAGE        (PAGE_SIZE / sizeof(unsigned long))
+#define P2M_MID_PER_PAGE    (PAGE_SIZE / sizeof(unsigned long *))
+#define P2M_TOP_PER_PAGE    (PAGE_SIZE / sizeof(unsigned long **))
+#define MAX_P2M_PFN         (P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE)
+#define MAX_REMAP_RANGES    10
Could you mention why 10 please?
10 is simply what I thought a reasonable max could ever be for the number of
remapped ranges. A range here is defined as the combination of a contiguous
e820 I/O range and a contiguous e820 RAM range where the underlying I/O
memory will be remapped. I'm not really excited about this but I don't have
a better solution. Any ideas? I believe the original implementation has a
similar issue that it appears to ignore.
Is that based on a worst case scenario? As in could you write in the comment
an example of a worst case E820 and scenario and when would 10 be more than
enough to cover it?

I changed the reserved brk space slightly in my updated patch. Really, MAX_REMAP_RANGES in the worst case should be E820MAX, but that seems like an unrealistic worst case and a lot of space to reserve. I'm fairly certain the existing code has a similar worst case which it doesn't take into account; it currently seems to assume there is only one E820 I/O region that will be identity mapped. I think it works because in practice we don't come close to the worst case. I'm still not sure how best to handle this.

+       mod = remap_pfn % P2M_PER_PAGE;
+       remap_start_pfn_align =
+               mod ? (remap_pfn - mod + P2M_PER_PAGE) : remap_pfn;
+       mod = (start_pfn + size) % P2M_PER_PAGE;
+       ident_end_pfn_align = start_pfn + size - mod;
+       mod = (remap_pfn + size) % P2M_PER_PAGE;
+       remap_end_pfn_align = remap_pfn + size - mod;
Should you do a check to make sure that 'ident_[start|end]_pfn_align'
don't overlap with 'remap_[start|end]_pfn_align' ?

+       /* Iterate over each p2m leaf node in each range */
+       for (ident_pfn_iter = ident_start_pfn_align, remap_pfn_iter = remap_start_pfn_align;
+            ident_pfn_iter < ident_end_pfn_align && remap_pfn_iter < remap_end_pfn_align;
+            ident_pfn_iter += P2M_PER_PAGE, remap_pfn_iter += P2M_PER_PAGE) {
+               /* Check we aren't past the end */
+               BUG_ON(ident_pfn_iter + P2M_PER_PAGE > start_pfn + size);
+               BUG_ON(remap_pfn_iter + P2M_PER_PAGE > remap_pfn + size);
Ah, here you check for it. But wouldn't it be easier to adjust the
remap_[start|end]_pfn_align above to check for this and make changes?
The ident_* and remap_* address ranges will never overlap each other unless
the e820 map is broken. I'm treating both ranges as independent and
iterating over each separately at the same time. Each range will have
different boundary conditions potentially. Does that make sense? Am I
understanding the comment correctly?
I was thinking that if the E820 is broken we should try our best (and stop
trying to remap) and continue. The E820 is guaranteed to be sane (no
overlapping regions), but the sizes could be bogus (zero size, for example).

I think the current code does handle the zero case ok.

+               /* Save p2m mappings */
+               for (i = 0; i < P2M_PER_PAGE; i++)
+                       xen_remap_buf[i] = pfn_to_mfn(ident_pfn_iter + i);
+               /* Set identity map which also frees a p2m leaf */
                                                ^- might
Under what conditions is this not true? The goal of processing these in
blocks is to guarantee that a leaf is freed.
Perhaps change it 'also frees' to 'MUST free'
and then have (debug code) to double-check that? Naturally
after calling the early_set_phys_identity?

        for (i = 0; i < P2M_PER_PAGE; i++) {
                unsigned int pfn = ident_pfn_iter + i;
                BUG_ON(pfn_to_mfn(pfn) != INVALID_P2M_ENTRY);
        }
I added a debug check. It must check for the identity frame rather than INVALID_P2M_ENTRY, however.

+               ident_cnt += set_phys_range_identity(ident_pfn_iter,
+                       ident_pfn_iter + P2M_PER_PAGE);
+               /* Now remap memory */
+               for (i = 0; i < P2M_PER_PAGE; i++) {
+                       unsigned long mfn = xen_remap_buf[i];
+                       struct mmu_update update = {
+                               .ptr = (mfn << PAGE_SHIFT) |
+                                       MMU_MACHPHYS_UPDATE,
+                               .val = remap_pfn_iter + i
+                       };
+                       /* Update p2m, will use the page freed above */
What page freed above?
See above comment.
Perhaps replace 'use the page freed above' with 'use the P2M leaf freed above'.
