[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: GICv3: Aarch32: Need guidance on the atomic access of "union host_lpi" or if ITS is supported on R52
On 27/10/2022 15:36, Andre Przywara wrote: On Thu, 27 Oct 2022 14:38:52 +0100 Ayan Kumar Halder <ayankuma@xxxxxxx> wrote: Hi Ayan, Hi Andre / Julien, On 25/10/2022 14:55, Andre Przywara wrote:On Tue, 25 Oct 2022 13:25:52 +0100 Ayan Kumar Halder <ayankuma@xxxxxxx> wrote: Hi,Hi Andre, Many thanks for the explanation. I need a clarification on the issue of atomic access to 64bit normal memory on R52.Hi Andre/All, This came up while porting Xen on R52. Refer "ARM DDI 0568A.cID110520", B1.3.1 "A Generic Interrupt Controller (GIC) implemented with an Armv8-R PE must not implement LPI support. " Does this mean ITS is not supported on R52 ? I am assuming yes, please correct me if mistaken.An ITS relies on LPIs, so yes: no ITS on a v8-R32 system. I cannot find this restriction anymore in the v8-R64 supplement, so it would only apply to the R52/AArch32.If the answer is no, then my next query is follows :-Answering to that anyway ...diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c index 9ca74bc321..dea60aac0d 100644 --- a/xen/arch/arm/gic-v3-lpi.c +++ b/xen/arch/arm/gic-v3-lpi.c @@ -423,7 +423,7 @@ int gicv3_lpi_init_host_lpis(unsigned int host_lpi_bits) int rc; /* We rely on the data structure being atomically accessible. */ - BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(unsigned long)); + BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(uint64_t)); "unsigned long" on Aarch32 is 32 bits. So this bug gets triggered. Is it fine to change it as above ? Reading"ARM DDI 0487G.bID07202", E2.2.1, "Requirements for single-copy atomicity". "LDRD and STRD accesses to 64-bit aligned locations are 64-bit single-copy atomic as seen by translation table walks and accesses to translation tables"This (and the explaining paragraph) indeed suggests that this should work architecturally, if you use normal system memory (where you would hold page tables). It would be confined to ARMv8 AArch32 and ARMv7 w/ LPAE, which matches Xen's requirements.Does it mean that ldrd/strd will not work atomically on AArch32-v8R as it uses MPU (not MMU, so no page tables) ?No, this mentioning of page tables is more an example or a rationale, than a requirement. What this means (in the ARMv7-A/ARMv8-A AArch32 context) it: Because on v7A-LPAE and v8-AArch32 PTEs are 64 bits wide, it's too painful to use explicit locking to make sure just writing one PTE is atomic. So the architecture demands that 64-bit aligned accesses using ldrd/strd are single-copy atomic, so software can update just one PTE easily. But this is only required for locations where page tables typically reside, so system memory. This avoids this 64-bit atomicity requirement for *every* part of the system, for instance separate buses, SRAM or flash on smaller buses, or MMIO in general. I don't find anything in the v8-R32 supplement that would step back from this requirement, although indeed the original reason (atomic PTE writes) would not apply to v8-R32. Both the LDRD/STRD description and the section listing differences in the system memory architecture do not mention anything, so I'd say that the ldrd atomicity requirement still holds. Please note that this only applies to ARMv7 *LPAE* systems, but Xen requires LPAE, and R52 is v8, so we are good, and the Xen code can rely on this. So for Xen on ARMv8-R32: *LDRD/STRD* accesses to *64-bit aligned* addresses in *RAM* would be atomic. You need to satisfy all three requirements: - You must use ldrd/strd. Just dereferencing a uint64_t pointer in C does not guarantee that, but read_atomic()/write_atomic() does. - It must be 64-bit aligned. Shouldn't be a problem if the data type is 64 bits wide. Please note the slight nastiness that ldrd would silently work on non-aligned addresses, but would lose the atomicity guarantee. ldrexd would always fault if the address is not aligned. We might want to check the alignment of data we access (assert?), if not done already. - It must be in system RAM, so not MMIO. Also I think TCM might be a different story, but I would hope Xen would not use that directly. Many thanks for the nice explanation.I am trying to compare this with the atomicity requirement for AArch64 (ARM DDI 0487G.b ID072021, B2.2.1 Requirements for single-copy atomicit ) I seethat the alignment requirement is the same as for ARMv8-R32."-A read that is generated by a load instruction that loads a single general-purpose register and is aligned to the size of the read in the instruction is single-copy atomic. -A write that is generated by a store instruction that stores a single general-purpose register and is aligned to the size of the write in the instruction is single-copy atomic" I think the following code change should help us to confirm the correct behavior of atomic read/write on both AArch64 and AArch32 (including R52). diff --git a/xen/arch/arm/include/asm/atomic.h b/xen/arch/arm/include/asm/atomic.h index ac2798d095..f22c65a853 100644 --- a/xen/arch/arm/include/asm/atomic.h +++ b/xen/arch/arm/include/asm/atomic.h@@ -78,6 +78,7 @@ static always_inline void read_atomic_size(const volatile void *p, void *res, unsigned int size) { + ASSERT(IS_ALIGNED((unsigned long)p, size)); switch ( size ) { case 1:@@ -102,6 +103,7 @@ static always_inline void write_atomic_size(volatile void *p, void *val, unsigned int size) { + ASSERT(IS_ALIGNED((unsigned long)p, size)); switch ( size ) { case 1: Please let me know if I misunderstand something. - Ayan If so, then is using ldrexd/strexd the solution for this ?As mentioned above, you would not need that, just read_atomic()/write_atomic() would do. Hope that clears that up. Cheers, Andre P.S. This above is my reading of the ARM ARM and the R32 supplement. I can double check with the architects, but this might take a while.IIUC "Memory accesses caused by LDREXD and STREXD instructions to doubleword-aligned locations.", then the answer seems yes. - AyanBut it's only atomic if you are using ldrd/strd, which you cannot know for sure in C, because it's up to the compiler to generate the instructions. This is why we have that test. Changing the unsigned long to uint64_t would make the check pointless, since the data structure is 64-bits long, so it would always be true. So given that you don't seem to need it, right now, it would leave the test alone. If you need that on AArch32 anyway, you would need to replace accesses to the host_lpis array with inline assembly accessors, to ensure ldrd/strd instructions. This seems doable (there are only so many places which directly access the array members), but would need a good use case. Cheers, AndreDoes this imply that atomicity will be retained (with the above change) ? Os will this require ldrexd/strexd as R52 supports MPU (not MMU, so translation tables are irrelevant). Kind regards, Ayan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |