[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [XEN v3] xen/arm64: io: Decode ldr/str post-indexing instructions
On Thu, 20 Jan 2022 21:55:27 +0000 Ayan Kumar Halder <ayan.kumar.halder@xxxxxxxxxx> wrote: Hi, > At the moment, Xen is only handling data abort with valid syndrome (i.e. > ISV=0). Unfortunately, this doesn't cover all the instructions a domain > could use to access MMIO regions. > > For instance, a baremetal OS can use any of the following instructions, where > x1 contains the address of the MMIO region: > > 1. ldr x2, [x1], #4 That looks dodgy, since is misaligns the pointer afterwards. MMIO access typically go to device memory, which must be naturally aligned. Just don't give a bad example here and change that to a multiple of 8. > 2. ldr w2, [x1], #-4 (this one is fine, btw, because it's a 32-bit read) > 3. ldr x2, [x1], #-8 > 4. ldr w2, [x1], #4 > 5. ldrh w2, [x1], #8 > 6. ldrb w2, [x1], #16 More naturally I'd use the data size of the postindex value ... ldr x2 ... #-8 ldr w2 ... #4 ldrh w2 ... #2 ldrb w2 ... #1 > 7. str x2, [x1], #4 This is a again problematic, because x1 is not 8-byte aligned anymore after that. > 8. str w2, [x1], #-4 > 9. strh w2, [x1], #8 > 10. strb w2, [x1], #16 > > In the following two instructions, sp contains the address of the MMIO > region:- Really? I don't think you should give people funny ideas, just mention that the Rn register could theoretically be the stack pointer. > 11. ldrb w2, [sp], #16 > 12. ldrb wzr, [sp], #16 > > In order to handle post-indexing store/load instructions (like those mentioned > above), Xen will need to fetch and decode the instruction. > > This patch only cover post-index store/load instructions from AArch64 mode. > For now, this is left unimplemented for trap from AArch32 mode. > > Signed-off-by: Ayan Kumar Halder <ayankuma@xxxxxxxxxx> > --- > > Changelog :- > v2 - 1. Updated the rn register after reading from it. (Pointed by Julien, > Stefano) > 2. Used a union to represent the instruction opcode (Suggestd by > Bertrand) > 3. Fixed coding style issues (Pointed by Julien) > 4. In the previous patch, I was updating dabt->sign based on the > signedness > of imm9. This was incorrect. As mentioned in ARMv8 ARM DDI 0487G.b, > Page 3221, SSE indicates the signedness of the data item loaded. In > our > case, the data item loaded is always unsigned. > > v3- 1. Handled all the variants of ldr/str (ie 64, 32, 16, 8 bit variants). > Thus, I have removed the check for "instr->code.opc == 0" (Suggested by > Andre) > 2. Handled the scenario when rn = SP, rt = XZR (Suggested by Jan, Andre) > 3. Added restriction for "rt != rn" (Suggested by Andre) > 4. Moved union ldr_str_instr_class {} to decode.h. This is the header > included > by io.c and decode.c (where the union is referred). (Suggested by Jan) > 5. Indentation and typo fixes (Suggested by Jan) > > Changes suggested but could not be considered due to reasons :- > 1. Using accessor macros instead of bitfields for "ldr_str_instr_class". > (Andre) > Reason - I could not find a simple way to represent 9 bit signed > integer > (ie imm9) without using bitfields. If I use accessor macros, then I > need > to manually calculate two's complement to obtain the value when signed > bit is present. > > 2. I/D cache cohenerncy (Andre) > Reason :- I could not see any instruction to flush the I cache. First, please try to avoid the term "flush", because it is somewhat overloaded. The architecture speaks of "clean" and "invalidate", which are more precise. Assuming you mean "clean" here: conceptually there is no such thing for the I cache, because it's always clean. The I$ will only be read from the CPU side - from the instruction fetcher - there is nothing written back through it. Every store goes through the data path - always. That is the problem that I tried to sketch you previously: you don't have a guarantee that the instruction you read from memory is the same that the CPU executed. The guest could have changed the instruction after the I$ fetched that. So the CPU will execute (and trap) on instruction X, but you will read Y. I leave it up to your imagination if that could be exploited. > Refer > https://developer.arm.com/documentation/ddi0596/2021-12/Base-Instructions/IC--Instruction-Cache-operation--an-alias-of-SYS-?lang=en#sa_ic_op > So, this patch assumes that the I/D caches are coherent. Bold. ;-) Cheers, Andre > > xen/arch/arm/decode.c | 78 ++++++++++++++++++++++++++++++++++++++++++- > xen/arch/arm/decode.h | 29 +++++++++++++++- > xen/arch/arm/io.c | 66 ++++++++++++++++++++++++++++++++---- > 3 files changed, 165 insertions(+), 8 deletions(-) > > diff --git a/xen/arch/arm/decode.c b/xen/arch/arm/decode.c > index 792c2e92a7..f1c59ddd1a 100644 > --- a/xen/arch/arm/decode.c > +++ b/xen/arch/arm/decode.c > @@ -84,6 +84,76 @@ bad_thumb2: > return 1; > } > > +static int decode_loadstore_postindexing(register_t pc, > + struct hsr_dabt *dabt, > + union ldr_str_instr_class *instr) > +{ > + if ( raw_copy_from_guest(&instr->value, (void * __user)pc, sizeof > (instr)) ) > + return -EFAULT; > + > + /* > + * Rn -ne Rt for ldr/str instruction. > + * Check https://developer.arm.com/documentation/dui0802/a/CIHGJHED > + * (Register restrictions) > + * > + * The only exception for this is when rn = 31. It denotes SP ("Use of > SP") > + * > + * And when rt = 31, it denotes wzr/xzr. (Refer > + * > https://developer.arm.com/documentation/den0024/a/ARMv8-Registers/AArch64-special-registers > + * "There is no register called X31 or W31. Many instructions are encoded > + * such that the number 31 represents the zero register, ZR (WZR/XZR)." > + */ > + if ( (instr->code.rn == instr->code.rt) && (instr->code.rn != 31) ) > + return -EINVAL; > + > + /* First, let's check for the fixed values */ > + if ( !((instr->code.fixed1 == 1) && (instr->code.fixed2 == 0) && > + (instr->code.fixed3 == 0) && (instr->code.fixed4 == 7)) ) > + { > + gprintk(XENLOG_ERR, "Cannot decode instruction 0x%x",instr->value); > + gprintk(XENLOG_ERR, "Decoding not supported for instructions other > than" > + " ldr/str post indexing\n"); > + goto bad_32bit_loadstore; > + } > + > + if ( instr->code.v != 0 ) > + { > + gprintk(XENLOG_ERR, > + "ldr/str post indexing for vector types are not supported\n"); > + goto bad_32bit_loadstore; > + } > + > + /* Check for STR (immediate) - 32 bit variant */ > + if ( instr->code.opc == 0 ) > + { > + dabt->write = 1; > + } > + /* Check for LDR (immediate) - 32 bit variant */ > + else if ( instr->code.opc == 1 ) > + { > + dabt->write = 0; > + } > + else > + { > + gprintk(XENLOG_ERR, > + "Decoding ldr/str post indexing is not supported for this > variant\n"); > + goto bad_32bit_loadstore; > + } > + > + gprintk(XENLOG_INFO, > + "instr->code.rt = 0x%x, instr->code.size = 0x%x, instr->code.imm9 = > %d\n", > + instr->code.rt, instr->code.size, instr->code.imm9); > + > + update_dabt(dabt, instr->code.rt, instr->code.size, false); > + dabt->valid = 1; > + > + return 0; > + > + bad_32bit_loadstore: > + gprintk(XENLOG_ERR, "unhandled 32bit Arm instruction 0x%x\n", > instr->value); > + return 1; > +} > + > static int decode_thumb(register_t pc, struct hsr_dabt *dabt) > { > uint16_t instr; > @@ -150,11 +220,17 @@ bad_thumb: > return 1; > } > > -int decode_instruction(const struct cpu_user_regs *regs, struct hsr_dabt > *dabt) > +int decode_instruction(const struct cpu_user_regs *regs, struct hsr_dabt > *dabt, > + union ldr_str_instr_class *instr) > { > if ( is_32bit_domain(current->domain) && regs->cpsr & PSR_THUMB ) > return decode_thumb(regs->pc, dabt); > > + if ( (is_64bit_domain(current->domain) && !psr_mode_is_32bit(regs)) ) > + { > + return decode_loadstore_postindexing(regs->pc, dabt, instr); > + } > + > /* TODO: Handle ARM instruction */ > gprintk(XENLOG_ERR, "unhandled ARM instruction\n"); > > diff --git a/xen/arch/arm/decode.h b/xen/arch/arm/decode.h > index 4613763bdb..5c918c9bed 100644 > --- a/xen/arch/arm/decode.h > +++ b/xen/arch/arm/decode.h > @@ -23,6 +23,32 @@ > #include <asm/regs.h> > #include <asm/processor.h> > > +/* > + * Refer to the ARMv8 ARM (DDI 0487G.b), Section C4.1.4 Loads and Stores > + * Page 318 specifies the following bit pattern for > + * "load/store register (immediate post-indexed)". > + * > + * 31 30 29 27 26 25 23 21 20 11 9 4 0 > + * ___________________________________________________________________ > + * |size|1 1 1 |V |0 0 |opc |0 | imm9 |0 1 | Rn | Rt | > + * |____|______|__|____|____|__|_______________|____|_________|_______| > + */ > +union ldr_str_instr_class { > + uint32_t value; > + struct ldr_str { > + unsigned int rt:5; /* Rt register */ > + unsigned int rn:5; /* Rn register */ > + unsigned int fixed1:2; /* value == 01b */ > + signed int imm9:9; /* imm9 */ > + unsigned int fixed2:1; /* value == 0b */ > + unsigned int opc:2; /* opc */ > + unsigned int fixed3:2; /* value == 00b */ > + unsigned int v:1; /* vector */ > + unsigned int fixed4:3; /* value == 111b */ > + unsigned int size:2; /* size */ > + } code; > +}; > + > /** > * Decode an instruction from pc > * /!\ This function is not intended to fully decode an instruction. It > @@ -35,7 +61,8 @@ > */ > > int decode_instruction(const struct cpu_user_regs *regs, > - struct hsr_dabt *dabt); > + struct hsr_dabt *dabt, > + union ldr_str_instr_class *instr); > > #endif /* __ARCH_ARM_DECODE_H_ */ > > diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c > index 729287e37c..acb483f235 100644 > --- a/xen/arch/arm/io.c > +++ b/xen/arch/arm/io.c > @@ -65,6 +65,39 @@ static enum io_state handle_write(const struct > mmio_handler *handler, > return ret ? IO_HANDLED : IO_ABORT; > } > > +static void post_increment_register(union ldr_str_instr_class *instr) > +{ > + struct cpu_user_regs *regs = guest_cpu_user_regs(); > + unsigned int val; > + > + /* handle when rn = SP */ > + if ( instr->code.rn == 31 ) > + { > + if ( (regs->cpsr & PSR_MODE_MASK) == PSR_MODE_EL1h ) > + { > + val = regs->sp_el1; > + } > + else > + { > + BUG(); > + } > + } > + else > + { > + val = get_user_reg(regs, instr->code.rn); > + } > + val += instr->code.imm9; > + > + if ( instr->code.rn == 31 ) > + { > + regs->sp_el1 = val; > + } > + else > + { > + set_user_reg(regs, instr->code.rn, val); > + } > +} > + > /* This function assumes that mmio regions are not overlapped */ > static int cmp_mmio_handler(const void *key, const void *elem) > { > @@ -106,14 +139,29 @@ enum io_state try_handle_mmio(struct cpu_user_regs > *regs, > .gpa = gpa, > .dabt = dabt > }; > + int rc; > + union ldr_str_instr_class instr = {0}; > > ASSERT(hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL); > > + /* > + * Armv8 processor does not provide a valid syndrome for post-indexing > + * ldr/str instructions. So in order to process these instructions, > + * Xen must decode them. > + */ > + if ( !info.dabt.valid ) > + { > + rc = decode_instruction(regs, &info.dabt, &instr); > + if ( rc ) > + { > + gprintk(XENLOG_DEBUG, "Unable to decode instruction\n"); > + return IO_ABORT; > + } > + } > + > handler = find_mmio_handler(v->domain, info.gpa); > if ( !handler ) > { > - int rc; > - > rc = try_fwd_ioserv(regs, v, &info); > if ( rc == IO_HANDLED ) > return handle_ioserv(regs, v); > @@ -122,7 +170,7 @@ enum io_state try_handle_mmio(struct cpu_user_regs *regs, > } > > /* All the instructions used on emulated MMIO region should be valid */ > - if ( !dabt.valid ) > + if ( !info.dabt.valid ) > return IO_ABORT; > > /* > @@ -134,7 +182,7 @@ enum io_state try_handle_mmio(struct cpu_user_regs *regs, > { > int rc; > > - rc = decode_instruction(regs, &info.dabt); > + rc = decode_instruction(regs, &info.dabt, NULL); > if ( rc ) > { > gprintk(XENLOG_DEBUG, "Unable to decode instruction\n"); > @@ -143,9 +191,15 @@ enum io_state try_handle_mmio(struct cpu_user_regs *regs, > } > > if ( info.dabt.write ) > - return handle_write(handler, v, &info); > + rc = handle_write(handler, v, &info); > else > - return handle_read(handler, v, &info); > + rc = handle_read(handler, v, &info); > + > + if ( instr.value != 0 ) > + { > + post_increment_register(&instr); > + } > + return rc; > } > > void register_mmio_handler(struct domain *d,
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |