
Re: [Xen-devel] [PATCH v9 03/18] arm/xen,arm64/xen: introduce p2m



On Thu, 7 Nov 2013, Ian Campbell wrote:
> On Fri, 2013-10-25 at 11:51 +0100, Stefano Stabellini wrote:
> > Introduce physical to machine and machine to physical tracking
> > mechanisms based on rbtrees for arm/xen and arm64/xen.
> > 
> > We need it because all guests on ARM are autotranslated guests,
> > so a physical address is potentially different from a machine
> > address. When programming a device to do DMA, we need to be
> > extra careful to use machine addresses rather than physical
> > addresses to program the device. Therefore we need to know the
> > physical to machine mappings.
> > 
> > For the moment we assume that dom0 starts with a 1:1 physical to machine
> > mapping, in other words physical addresses correspond to machine
> > addresses. However when mapping a foreign grant reference, obviously the
> > 1:1 model doesn't work anymore. So at the very least we need to be able
> > to track grant mappings.
> > 
> > We need locking to protect accesses to the two trees.
> > 
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
> > 
> > Changes in v8:
> > - move pfn_to_mfn and mfn_to_pfn to page.h as static inline functions;
> > - no need to walk the tree if phys_to_mach.rb_node is NULL;
> > - correctly handle multipage p2m entries;
> > - substitute the spin_lock with a rwlock.
> > ---
> >  arch/arm/include/asm/xen/page.h |   49 ++++++++--
> >  arch/arm/xen/Makefile           |    2 +-
> >  arch/arm/xen/p2m.c              |  208 +++++++++++++++++++++++++++++++++++++++
> >  arch/arm64/xen/Makefile         |    2 +-
> >  4 files changed, 252 insertions(+), 9 deletions(-)
> >  create mode 100644 arch/arm/xen/p2m.c
> > 
> > diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
> > index 359a7b5..d1b5dd5 100644
> > --- a/arch/arm/include/asm/xen/page.h
> > +++ b/arch/arm/include/asm/xen/page.h
> > @@ -7,11 +7,10 @@
> >  #include <linux/pfn.h>
> >  #include <linux/types.h>
> >  
> > +#include <xen/xen.h>
> >  #include <xen/interface/grant_table.h>
> >  
> > -#define pfn_to_mfn(pfn)                    (pfn)
> >  #define phys_to_machine_mapping_valid(pfn) (1)
> > -#define mfn_to_pfn(mfn)                    (mfn)
> >  #define mfn_to_virt(m)                     (__va(mfn_to_pfn(m) << PAGE_SHIFT))
> >  
> >  #define pte_mfn        pte_pfn
> > @@ -32,6 +31,44 @@ typedef struct xpaddr {
> >  
> >  #define INVALID_P2M_ENTRY      (~0UL)
> >  
> > +unsigned long __pfn_to_mfn(unsigned long pfn);
> > +unsigned long __mfn_to_pfn(unsigned long mfn);
> > +extern struct rb_root phys_to_mach;
> > +
> > +static inline unsigned long pfn_to_mfn(unsigned long pfn)
> > +{
> > +   unsigned long mfn;
> > +   
> > +   if (phys_to_mach.rb_node != NULL) {
> > +           mfn = __pfn_to_mfn(pfn);
> > +           if (mfn != INVALID_P2M_ENTRY)
> > +                   return mfn;
> > +   }
> > +
> > +   if (xen_initial_domain())
> > +           return pfn;
> > +   else
> > +           return INVALID_P2M_ENTRY;
> 
> This breaks domU ballooning (for some reason only on 64-bit, I've no
> clue why not 32-bit).
> 
> decrease_reservation does pfn_to_mfn in order to release the page, and
> this ends up passing INVALID_P2M_ENTRY, which the hypervisor rightly
> rejects.
>
> > +static inline unsigned long mfn_to_pfn(unsigned long mfn)
> > +{
> > +   unsigned long pfn;
> > +
> > +   if (phys_to_mach.rb_node != NULL) {
> > +           pfn = __mfn_to_pfn(mfn);
> > +           if (pfn != INVALID_P2M_ENTRY)
> > +                   return pfn;
> > +   }
> > +
> > +   if (xen_initial_domain())
> > +           return mfn;
> > +   else
> > +           return INVALID_P2M_ENTRY;
> 
> Same here I think.
> 
> Both of these should unconditionally return their inputs for
> !xen_initial_domain() I think.
 
The swiotlb code doesn't actually need pfn_to_mfn or mfn_to_pfn to
return invalid. I did that to be able to spot cases where the generic
code might do something that doesn't work properly with autotranslate
guests. Although I still believe in the original concept, I guess it
might have been a bit premature. I'll fix the two functions to return
their input whenever the lookup fails and come up with a better patch
later on.
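[For illustration, a minimal standalone sketch of the agreed fix — the tree lookup is stubbed out here (an empty tree, so every lookup misses), and the function names mirror the patch but this is not the actual kernel code:]

```c
#include <assert.h>

#define INVALID_P2M_ENTRY (~0UL)

/* Stand-in for the rbtree-backed lookup in the patch: with no grant
 * mappings inserted yet, every lookup misses. */
static unsigned long __pfn_to_mfn(unsigned long pfn)
{
    (void)pfn;
    return INVALID_P2M_ENTRY;
}

/* The fix under discussion: when the tree lookup misses, fall back to
 * the 1:1 identity mapping unconditionally (no xen_initial_domain()
 * check), so domU ballooning keeps working. */
static unsigned long pfn_to_mfn(unsigned long pfn)
{
    unsigned long mfn = __pfn_to_mfn(pfn);

    if (mfn != INVALID_P2M_ENTRY)
        return mfn;
    return pfn;
}
```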


> I presume phys_to_mach.rb_node != NULL should never trigger for a domU?

Theoretically it shouldn't, but I would like to keep phys_to_mach.rb_node
decoupled from concepts like dom0 and domU. I know that today we
wouldn't use it unless we need to map foreign grants, and that would
happen only in dom0, but I would rather keep the code more flexible by
retaining the phys_to_mach.rb_node != NULL check.
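[When the tree is consulted, the containment test in __pfn_to_mfn handles multipage entries by mapping each page at the same offset from the entry's base mfn. A standalone sketch of just that arithmetic, with hypothetical names rather than the kernel's:]

```c
#include <assert.h>

#define INVALID_P2M_ENTRY (~0UL)

/* An entry covering pfns [pfn, pfn + nr_pages) maps each page to
 * mfn + (its offset from pfn), as in the patch's rbtree walk. */
struct range_entry {
    unsigned long pfn;
    unsigned long mfn;
    unsigned long nr_pages;
};

static unsigned long range_lookup(const struct range_entry *e,
                                  unsigned long pfn)
{
    if (e->pfn <= pfn && pfn < e->pfn + e->nr_pages)
        return e->mfn + (pfn - e->pfn);
    return INVALID_P2M_ENTRY;
}
```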


> > +static int xen_add_phys_to_mach_entry(struct xen_p2m_entry *new)
> > +{
> > +   struct rb_node **link = &phys_to_mach.rb_node;
> > +   struct rb_node *parent = NULL;
> > +   struct xen_p2m_entry *entry;
> > +   int rc = 0;
> > +
> > +   while (*link) {
> > +           parent = *link;
> > +           entry = rb_entry(parent, struct xen_p2m_entry, rbnode_phys);
> > +
> > +           if (new->mfn == entry->mfn)
> > +                   goto err_out;
> > +           if (new->pfn == entry->pfn)
> > +                   goto err_out;
> > +
> > +           if (new->pfn < entry->pfn)
> > +                   link = &(*link)->rb_left;
> > +           else
> > +                   link = &(*link)->rb_right;
> > +   }
> 
> Are there really no helpers for walking an rbtree?

None that I could find.
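[For what it's worth, the two insert walks later in the patch differ only in the comparison key, so they could share one loop keyed on a boolean, as suggested below. A sketch of that shape using a plain (unbalanced) binary tree and hypothetical names — not kernel code, and without the rb_link_node/rb_insert_color rebalancing step:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct p2m_entry {
    unsigned long pfn;
    unsigned long mfn;
    struct p2m_entry *left, *right;
};

/* One walk for both trees: key on mfn when lookup_mfn is set, on pfn
 * otherwise. Duplicates on either key are rejected, as in the patch. */
static int p2m_insert(struct p2m_entry **root, struct p2m_entry *new,
                      bool lookup_mfn)
{
    struct p2m_entry **link = root;

    while (*link) {
        struct p2m_entry *entry = *link;

        if (new->mfn == entry->mfn || new->pfn == entry->pfn)
            return -1;

        if ((lookup_mfn ? new->mfn : new->pfn) <
            (lookup_mfn ? entry->mfn : entry->pfn))
            link = &entry->left;
        else
            link = &entry->right;
    }
    *link = new;
    return 0;
}
```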


> > +   rb_link_node(&new->rbnode_phys, parent, link);
> > +   rb_insert_color(&new->rbnode_phys, &phys_to_mach);
> > +   goto out;
> > +
> > +err_out:
> > +   rc = -EINVAL;
> > +   pr_warn("%s: cannot add pfn=%pa -> mfn=%pa: pfn=%pa -> mfn=%pa already exists\n",
> > +                   __func__, &new->pfn, &new->mfn, &entry->pfn, &entry->mfn);
> > +out:
> > +   return rc;
> > +}
> > +
> > +unsigned long __pfn_to_mfn(unsigned long pfn)
> > +{
> > +   struct rb_node *n = phys_to_mach.rb_node;
> > +   struct xen_p2m_entry *entry;
> > +   unsigned long irqflags;
> > +
> > +   read_lock_irqsave(&p2m_lock, irqflags);
> > +   while (n) {
> > +           entry = rb_entry(n, struct xen_p2m_entry, rbnode_phys);
> > +           if (entry->pfn <= pfn &&
> > +                           entry->pfn + entry->nr_pages > pfn) {
> > +                   read_unlock_irqrestore(&p2m_lock, irqflags);
> > +                   return entry->mfn + (pfn - entry->pfn);
> > +           }
> > +           if (pfn < entry->pfn)
> > +                   n = n->rb_left;
> > +           else
> > +                   n = n->rb_right;
> > +   }
> > +   read_unlock_irqrestore(&p2m_lock, irqflags);
> > +
> > +   return INVALID_P2M_ENTRY;
> > +}
> > +EXPORT_SYMBOL_GPL(__pfn_to_mfn);
> > +
> > +static int xen_add_mach_to_phys_entry(struct xen_p2m_entry *new)
> > +{
> > +   struct rb_node **link = &mach_to_phys.rb_node;
> > +   struct rb_node *parent = NULL;
> > +   struct xen_p2m_entry *entry;
> > +   int rc = 0;
> > +
> > +   while (*link) {
> > +           parent = *link;
> > +           entry = rb_entry(parent, struct xen_p2m_entry, rbnode_mach);
> > +
> > +           if (new->mfn == entry->mfn)
> > +                   goto err_out;
> > +           if (new->pfn == entry->pfn)
> > +                   goto err_out;
> > +
> > +           if (new->mfn < entry->mfn)
> > +                   link = &(*link)->rb_left;
> > +           else
> > +                   link = &(*link)->rb_right;
> > +   }
> 
> This looks close to identical to the one in xen_add_phys_to_mach_entry.
> You could combine them with a simple "lookup mfn" boolean.
> 
> 
> > +   rb_link_node(&new->rbnode_mach, parent, link);
> > +   rb_insert_color(&new->rbnode_mach, &mach_to_phys);
> > +   goto out;
> > +
> > +err_out:
> > +   rc = -EINVAL;
> > +   pr_warn("%s: cannot add pfn=%pa -> mfn=%pa: pfn=%pa -> mfn=%pa already exists\n",
> > +                   __func__, &new->pfn, &new->mfn, &entry->pfn, &entry->mfn);
> > +out:
> > +   return rc;
> > +}
> > +
> > +unsigned long __mfn_to_pfn(unsigned long mfn)
> > +{
> > +   struct rb_node *n = mach_to_phys.rb_node;
> > +   struct xen_p2m_entry *entry;
> > +   unsigned long irqflags;
> > +
> > +   read_lock_irqsave(&p2m_lock, irqflags);
> > +   while (n) {
> > +           entry = rb_entry(n, struct xen_p2m_entry, rbnode_mach);
> > +           if (entry->mfn <= mfn &&
> > +                           entry->mfn + entry->nr_pages > mfn) {
> > +                   read_unlock_irqrestore(&p2m_lock, irqflags);
> > +                   return entry->pfn + (mfn - entry->mfn);
> > +           }
> > +           if (mfn < entry->mfn)
> > +                   n = n->rb_left;
> > +           else
> > +                   n = n->rb_right;
> > +   }
> 
> and this looks basically identical to __pfn_to_mfn in the same way.
> 
> > +   read_unlock_irqrestore(&p2m_lock, irqflags);
> > +
> > +   return INVALID_P2M_ENTRY;
> > +}
> > +EXPORT_SYMBOL_GPL(__mfn_to_pfn);
> > +
> 
> Ian.
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

