| [Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
 Re: [XEN v5] xen/arm64: io: Decode ldr/str post-indexing instructions
 
 
Hi,
On 02/02/2022 13:05, Ayan Kumar Halder wrote:
 We should not need to decode the instruction twice. Instead, we could 
add the instruction details in vcpu_io as there can only be one I/O 
inflight per vCPU.
On 31/01/2022 19:37, Ayan Kumar Halder wrote:
 
diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
index 308650b400..f19fb46f72 100644
--- a/xen/arch/arm/ioreq.c
+++ b/xen/arch/arm/ioreq.c
@@ -23,16 +23,35 @@
    #include <public/hvm/ioreq.h>
  +#include "decode.h"
+
enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu 
*v) 
  {
      const union hsr hsr = { .bits = regs->hsr };
-    const struct hsr_dabt dabt = hsr.dabt;
+    struct hsr_dabt dabt = hsr.dabt;
+
      /* Code is similar to handle_read */
      register_t r = v->io.req.data;
        /* We are done with the IO */
      v->io.req.state = STATE_IOREQ_NONE;
  +    /*
+     * Note that we have already decoded the instruction in 
try_fwd_ioserv().
+     * We decode the instruction again to obtain rn and imm9. This 
will be used
+     * to do the post increment.
+     * Also there is no need to check whether the instruction can be 
decoded or
+     * was successfully decoded. The reason being if there was an 
error, then
+     * try_fwd_ioserv() would have returned error and this function 
would not
+     * have been called. Thus, there is an assumption that 
handle_iosev() is 
+     * invoked when try_fwd_ioserv() has returned successfully.
 
I am afraid this is not a correct assumption. Another vCPU can modify 
the instruction between the two decoding. So the right solution is to 
stash the information for latter consumption. 
 
+     */
+    if ( !dabt.valid )
+    {
+        decode_instruction(regs, &dabt);
+        post_increment_register(&dabt.dabt_instr);
+    }
+
      if ( dabt.write )
          return IO_HANDLED;
@@ -65,6 +84,8 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs 
*regs,
      };
      struct ioreq_server *s = NULL;
      enum io_state rc;
+    bool instr_decoded = false;
+    const union hsr hsr = { .bits = regs->hsr };
        if ( vio->req.state != STATE_IOREQ_NONE )
      {
@@ -76,8 +97,18 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs 
*regs,
      if ( !s )
          return IO_UNHANDLED;
  +    /*
+     * Armv8 processor does not provide a valid syndrome for 
decoding some
+     * instructions (for eg post-indexing ldr/str instructions). So 
in order to 
+     * process these instructions, Xen must decode them.
+     */
      if ( !info->dabt.valid )
-        return IO_ABORT;
+    {
+        rc = try_decode_instruction_invalid_iss(regs, &info->dabt);
+
+        if ( rc != IO_HANDLED)
+            return rc;
+    }
 
As you pointed out previously, the field SAS (Syndrome Access Size) is 
invalid when the ISV=0. So the decoding needs to be done *before* we 
select the IOREQ server. 
But as I said, this would result to decode the instruciton when this 
is not necessary. This is where Stefano's suggestion in [1] is useful. 
For ISV=0, it will be a lot more common to trap because of a P2M 
translation fault (of the MMIO is not mapped). So we should call that 
first and then, if it still not resolved, try to decode the instruction. 
With that in place, you are avoiding the issue in try_fwd_ioserv().
Can you please coordinate with Stefano?
 
I am a bit confused regarding where we need to handle to post increment 
of Rn in case of ioreq. 
I can see the following two places where PC gets incremented :-
1. handle_ioserv() returns IO_HANDLED via try_handle_mmio(). And then in 
"case IO_HANDLED:", PC is incremented. 
2. leave_hypervisor_to_guest() ---> check_for_vcpu_work() --> 
vcpu_ioreq_handle_completion() --> arch_ioreq_complete_mmio(). Here PC 
is incremented as well. 
So, do I need to update Rn in both the above places.
And if I understood your previous comment "Another vCPU can modify the 
instruction between the two decoding....", you are suggesting to save 
the instruction opcode (from PC) before invoking try_fwd_ioserv(). So, 
that it can be decoded again in arch_ioreq_complete_mmio() without 
reading PC.
 
Note that vcpu_io is an arch-agnostic structure. So you would want to 
embedded an arch specific structure (e.g. arch_vcpu_io) that would 
defined in arch/arm/include/asm/ioreq.h. 
For x86, you could define a dummy structure in arch/x86/include/asm/ioreq.h.
Cheers,
--
Julien Grall
 
 |