[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 18/22] xen/arm: p2m: Rework the context switch to another VTTBR in flush_tlb_domain



Hi Stefano,

On 27/07/16 02:12, Stefano Stabellini wrote:
On Wed, 20 Jul 2016, Julien Grall wrote:
The current implementation of flush_tlb_domain is relying on the domain
to have a single p2m. With the upcoming feature altp2m, a single domain
may have different p2m. So we would need to switch to the correct p2m in
order to flush the TLBs.

Rather than checking whether the domain is not the current domain, check
whether the VTTBR is different. The resulting assembly code is much
smaller: from 38 instructions (+ 2 functions call) to 22 instructions.

That's true but SYSREG reads are more expensive than regular
instructions.

This argument is not really true. The ARM ARM (D7-1879 in ARM DDI 0487A.j) says: "Reads of the System registers can occur out of order with respect to earlier instructions executed on the same PE, provided that any data dependencies between the instructions are respected". So It will depend on how the micro-architecture implemented access to SYSREG.

However, the current code already contains plenty of SYSREG read access (via the macro current using TPIDR_EL2). So the number of SYSREG accesses stay exactly the same.

I also forgot to mention that the number of instructions in the function call (10 instructions). So we are down from 58 instructions to 22 instructions.

Therefore, smaller code and likely better performance.



Signed-off-by: Julien Grall <julien.grall@xxxxxxx>
---
 xen/arch/arm/p2m.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index d1b6009..015c1e8 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -151,24 +151,28 @@ void p2m_restore_state(struct vcpu *n)

 void flush_tlb_domain(struct domain *d)
 {
+    struct p2m_domain *p2m = &d->arch.p2m;
     unsigned long flags = 0;
+    uint64_t ovttbr;

     /*
-     * Update the VTTBR if necessary with the domain d. In this case,
-     * it's only necessary to flush TLBs on every CPUs with the current VMID
-     * (our domain).
+     * ARM only provides an instruction to flush TLBs for the current
+     * VMID. So switch to the VTTBR of a given P2M if different.
      */
-    if ( d != current->domain )
+    ovttbr = READ_SYSREG64(VTTBR_EL2);
+    if ( ovttbr != p2m->vttbr )
     {
         local_irq_save(flags);
-        p2m_load_VTTBR(d);
+        WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
+        isb();
     }

     flush_tlb();

-    if ( d != current->domain )
+    if ( ovttbr != READ_SYSREG64(VTTBR_EL2) )

You should be able to remove this second SYSREG read and optimize the
code further.

I should be able, however I think it will not bring much more optimization here but obfuscating a bit more the code.

Regards,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.