[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: GICv3: Aarch32: Need guidance on the atomic access of "union host_lpi" or if ITS is supported on R52



On Fri, 28 Oct 2022 12:44:08 +0100
Ayan Kumar Halder <ayankuma@xxxxxxx> wrote:

> On 27/10/2022 15:36, Andre Przywara wrote:
> > On Thu, 27 Oct 2022 14:38:52 +0100
> > Ayan Kumar Halder <ayankuma@xxxxxxx> wrote:
> >
> > Hi Ayan,  
> Hi Andre / Julien,
> >  
> >> On 25/10/2022 14:55, Andre Przywara wrote:  
> >>> On Tue, 25 Oct 2022 13:25:52 +0100
> >>> Ayan Kumar Halder <ayankuma@xxxxxxx> wrote:
> >>>
> >>> Hi,  
> >> Hi Andre,
> >>
> >> Many thanks for the explanation.
> >>
> >> I need a clarification on the issue of atomic access to 64bit normal
> >> memory on R52.
> >>  
> >>>     
> >>>> Hi Andre/All,
> >>>>
> >>>> This came up while porting Xen on R52.
> >>>>
> >>>> Refer "ARM DDI 0568A.cID110520", B1.3.1
> >>>>
> >>>> "A Generic Interrupt Controller (GIC) implemented with an Armv8-R PE
> >>>> must not implement LPI support. "
> >>>>
> >>>> Does this mean ITS is not supported on R52 ? I am assuming yes, please
> >>>> correct me if mistaken.  
> >>> An ITS relies on LPIs, so yes: no ITS on a v8-R32 system. I cannot find
> >>> this restriction anymore in the v8-R64 supplement, so it would only apply
> >>> to the R52/AArch32.
> >>>     
> >>>> If the answer is no, then my next query is follows :-  
> >>> Answering to that anyway ...
> >>>     
> >>>> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> >>>> index 9ca74bc321..dea60aac0d 100644
> >>>> --- a/xen/arch/arm/gic-v3-lpi.c
> >>>> +++ b/xen/arch/arm/gic-v3-lpi.c
> >>>> @@ -423,7 +423,7 @@ int gicv3_lpi_init_host_lpis(unsigned int 
> >>>> host_lpi_bits)
> >>>>         int rc;
> >>>>
> >>>>         /* We rely on the data structure being atomically accessible. */
> >>>> -    BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(unsigned long));
> >>>> +    BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(uint64_t));
> >>>>
> >>>> "unsigned long" on Aarch32 is 32 bits. So this bug gets triggered.
> >>>>
> >>>> Is it fine to change it as above ?
> >>>>
> >>>> Reading"ARM DDI 0487G.bID07202", E2.2.1, "Requirements for single-copy
> >>>> atomicity".
> >>>>
> >>>> "LDRD and STRD accesses to 64-bit aligned locations are 64-bit
> >>>> single-copy atomic as seen by translation table walks and accesses to
> >>>> translation tables"  
> >>> This (and the explaining paragraph) indeed suggests that this should
> >>> work architecturally, if you use normal system memory (where you would
> >>> hold page tables). It would be confined to ARMv8 AArch32 and ARMv7 w/
> >>> LPAE, which matches Xen's requirements.  
> >> Does it mean that ldrd/strd will not work atomically on AArch32-v8R as
> >> it uses MPU (not MMU, so no page tables) ?  
> > No, this mentioning of page tables is more an example or a rationale, than
> > a requirement.
> > What this means (in the ARMv7-A/ARMv8-A AArch32 context) it:
> > Because on v7A-LPAE and v8-AArch32 PTEs are 64 bits wide, it's too painful
> > to use explicit locking to make sure just writing one PTE is atomic. So
> > the architecture demands that 64-bit aligned accesses using ldrd/strd
> > are single-copy atomic, so software can update just one PTE easily. But
> > this is only required for locations where page tables typically reside, so
> > system memory. This avoids this 64-bit atomicity requirement for *every*
> > part of the system, for instance separate buses, SRAM or flash on smaller
> > buses, or MMIO in general.
> >
> > I don't find anything in the v8-R32 supplement that would step back from
> > this requirement, although indeed the original reason (atomic PTE writes)
> > would not apply to v8-R32. Both the LDRD/STRD description and the section
> > listing differences in the system memory architecture do not mention
> > anything, so I'd say that the ldrd atomicity requirement still holds.
> >
> > Please note that this only applies to ARMv7 *LPAE* systems, but Xen
> > requires LPAE, and R52 is v8, so we are good, and the Xen code can rely on
> > this.
> >
> > So for Xen on ARMv8-R32:
> > *LDRD/STRD* accesses to *64-bit aligned* addresses in *RAM* would be
> > atomic. You need to satisfy all three requirements:
> > - You must use ldrd/strd. Just dereferencing a uint64_t pointer in C does
> > not guarantee that, but read_atomic()/write_atomic() does.
> > - It must be 64-bit aligned. Shouldn't be a problem if the data type is
> > 64 bits wide. Please note the slight nastiness that ldrd would silently
> > work on non-aligned addresses, but would lose the atomicity guarantee.
> > ldrexd would always fault if the address is not aligned.
> > We might want to check the alignment of data we access (assert?), if not
> > done already.
> > - It must be in system RAM, so not MMIO. Also I think TCM might be a
> > different story, but I would hope Xen would not use that directly.
> >  
> Many thanks for the nice explanation.
> 
> I am trying to compare this with the atomicity requirement for AArch64 
> (ARM DDI 0487G.b ID072021, B2.2.1 Requirements for single-copy atomicit )
> 
> I seethat the alignment requirement is the same as for ARMv8-R32.
> 
> "-A read that is generated by a load instruction that loads a single 
> general-purpose register and is aligned to the size of the read in the 
> instruction is single-copy atomic.
> 
> -A write that is generated by a store instruction that stores a single 
> general-purpose register and is aligned to the size of the write in the 
> instruction is single-copy atomic"
> 
> I think the following code change should help us to confirm the correct 
> behavior of atomic read/write on both AArch64 and AArch32 (including R52).
> 
> diff --git a/xen/arch/arm/include/asm/atomic.h 
> b/xen/arch/arm/include/asm/atomic.h
> index ac2798d095..f22c65a853 100644
> --- a/xen/arch/arm/include/asm/atomic.h
> +++ b/xen/arch/arm/include/asm/atomic.h
> @@ -78,6 +78,7 @@ static always_inline void read_atomic_size(const 
> volatile void *p,
>                                              void *res,
>                                              unsigned int size)
>   {
> +    ASSERT(IS_ALIGNED((unsigned long)p, size));
>       switch ( size )
>       {
>       case 1:
> @@ -102,6 +103,7 @@ static always_inline void write_atomic_size(volatile 
> void *p,
>                                               void *val,
>                                               unsigned int size)
>   {
> +    ASSERT(IS_ALIGNED((unsigned long)p, size));
>       switch ( size )
>       {
>       case 1:
> 
> Please let me know if I misunderstand something.

Yes, that looks correct. Even the more "simple" atomic accesses always
require alignment, so if you do an unaligned 32-bit read in AArch64, it
wouldn't be single-copy atomic either.

Cheers,
Andre

> >> If so, then is using ldrexd/strexd the solution for this ?  
> > As mentioned above, you would not need that, just
> > read_atomic()/write_atomic() would do.
> >
> > Hope that clears that up.
> >
> > Cheers,
> > Andre
> >
> > P.S. This above is my reading of the ARM ARM and the R32 supplement. I can
> > double check with the architects, but this might take a while.
> >  
> >> IIUC "Memory accesses caused by LDREXD and STREXD instructions to
> >> doubleword-aligned locations.", then the answer seems yes.
> >>
> >> - Ayan
> >>  
> >>> But it's only atomic if you are using ldrd/strd, which you cannot know for
> >>> sure in C, because it's up to the compiler to generate the instructions.
> >>>
> >>> This is why we have that test. Changing the unsigned long to uint64_t
> >>> would make the check pointless, since the data structure is 64-bits long,
> >>> so it would always be true.
> >>>
> >>> So given that you don't seem to need it, right now, it would leave the
> >>> test alone.
> >>>
> >>> If you need that on AArch32 anyway, you would need to replace accesses to
> >>> the host_lpis array with inline assembly accessors, to ensure ldrd/strd
> >>> instructions. This seems doable (there are only so many places which
> >>> directly access the array members), but would need a good use case.
> >>>
> >>> Cheers,
> >>> Andre
> >>>     
> >>>> Does this imply that atomicity will be retained (with the above change)
> >>>> ? Os will this require ldrexd/strexd as R52 supports MPU (not MMU, so
> >>>> translation tables are irrelevant).
> >>>> Kind regards,
> >>>> Ayan
> >>>>     




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.