RE: [PATCH v2 2/2] xen/arm: Add defensive barrier in get_cycles for Arm64
Hi Julien,

> -----Original Message-----
> From: Julien Grall <julien@xxxxxxx>
> Sent: 8 January 2021 19:56
> To: Wei Chen <Wei.Chen@xxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx; sstabellini@xxxxxxxxxx
> Cc: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>; Penny Zheng <Penny.Zheng@xxxxxxx>; Jiamei Xie <Jiamei.Xie@xxxxxxx>; nd <nd@xxxxxxx>
> Subject: Re: [PATCH v2 2/2] xen/arm: Add defensive barrier in get_cycles for Arm64
>
> Hi Wei,
>
> On 08/01/2021 06:21, Wei Chen wrote:
> > Per the discussion [1] on the mailing list, we'd better have a
> > barrier after reading CNTPCT in get_cycles. Without a barrier there,
> > if get_cycles is used in some seqlock critical context in the
> > future, the seqlock can potentially be speculated.
> >
> > We import Linux commit 75a19a0202db21638a1c2b424afb867e1f9a2376:
> >
> > arm64: arch_timer: Ensure counter register reads occur with seqlock held
> >
> > When executing clock_gettime(), either in the vDSO or via a system call,
> > we need to ensure that the read of the counter register occurs within
> > the seqlock reader critical section. This ensures that updates to the
> > clocksource parameters (e.g. the multiplier) are consistent with the
> > counter value and therefore avoids the situation where time appears to
> > go backwards across multiple reads.
> >
> > Extend the vDSO logic so that the seqlock critical section covers the
> > read of the counter register as well as accesses to the data page. Since
> > reads of the counter system registers are not ordered by memory barrier
> > instructions, introduce dependency ordering from the counter read to a
> > subsequent memory access so that the seqlock memory barriers apply to
> > the counter access in both the vDSO and the system call paths.
> >
> > Cc: <stable@xxxxxxxxxxxxxxx>
> > Cc: Marc Zyngier <marc.zyngier@xxxxxxx>
> > Tested-by: Vincenzo Frascino <vincenzo.frascino@xxxxxxx>
> > Link: https://lore.kernel.org/linux-arm-kernel/alpine.DEB.2.21.1902081950260.1662@xxxxxxxxxxxxxxxxxxxxxxx/
> > Reported-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Signed-off-by: Will Deacon <will.deacon@xxxxxxx>
> >
> > While we are not aware of such a use in Xen, it is best to add the
> > barrier to avoid any surprise.
> >
> > In order to reduce the impact of the new barrier, we prefer to
> > enforce ordering rather than use an ISB [2].
> >
> > Currently, enforced ordering is not applied to arm32 as this is
> > not done in Linux at the date of this patch. If this is done
> > in Linux it will need to be done in Xen as well.
> >
> > [1] https://lists.xenproject.org/archives/html/xen-devel/2020-12/msg00181.html
> > [2] https://lkml.org/lkml/2020/3/13/645
> >
> > Signed-off-by: Wei Chen <wei.chen@xxxxxxx>
> > ---
> > v1 -> v2:
> > 1. Update commit message to refer to the Linux commit.
> > 2. Change u64 to uint64_t.
> > ---
> >  xen/include/asm-arm/time.h | 43 ++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 41 insertions(+), 2 deletions(-)
> >
> > diff --git a/xen/include/asm-arm/time.h b/xen/include/asm-arm/time.h
> > index 5c4529ebb5..6b8fd839dd 100644
> > --- a/xen/include/asm-arm/time.h
> > +++ b/xen/include/asm-arm/time.h
> > @@ -11,9 +11,26 @@
> >
> >  typedef uint64_t cycles_t;
> >
> > -static inline cycles_t get_cycles(void)
> > +/*
> > + * Ensure that reads of the counter are treated the same as memory reads
> > + * for the purposes of ordering by subsequent memory barriers.
> > + */
> > +#if defined(CONFIG_ARM_64)
> > +#define read_cntpct_enforce_ordering(val) do { \
> > +    uint64_t tmp, _val = (val); \
> > +    \
> > +    asm volatile( \
> > +    "eor %0, %1, %1\n" \
> > +    "add %0, sp, %0\n" \
> > +    "ldr xzr, [%0]" \
> > +    : "=r" (tmp) : "r" (_val)); \
> > +} while (0)
> > +#else
> > +#define read_cntpct_enforce_ordering(val) do {} while (0)
> > +#endif
> > +
> > +static inline cycles_t read_cntpct_stable(void)
>
> OOI, is there any particular reason to create a new helper?
>

Yes, I am trying to reduce the number of #if defined(CONFIG_ARM_64)
chunks. If we introduce an empty helper for Arm32, we can avoid the
other chunk inside get_cycles. In addition, if we need to do the same
work for Arm32 in the future, we will not need to modify get_cycles.

> >  {
> > -    isb();
> >      /*
> >       * ARM_WORKAROUND_858921: Cortex-A73 (all versions) counter read
> >       * can return a wrong value when the counter crosses a 32bit boundary.
> > @@ -36,6 +53,28 @@ static inline cycles_t get_cycles(void)
> >      }
> >  }
> >
> > +static inline cycles_t get_cycles(void)
> > +{
> > +    cycles_t cnt;
> > +
> > +    isb();
> > +    cnt = read_cntpct_stable();
> > +
> > +    /*
> > +     * Without a barrier here, if get_cycles is used in some seqlock
> > +     * critical context in the future, the seqlock can potentially be
> > +     * speculated.
> > +     *
> > +     * To prevent the seqlock from being speculated silently, we add a
> > +     * barrier here defensively. Normally an ISB would be enough, but
> > +     * to keep the performance cost minimal, we prefer to enforce
> > +     * ordering instead.
> > +     */
>
> I thought we agreed to remove the comment. Did I miss anything?
>
> I can fix this one on commit if there is no need for a new revision.
>

Ah, sorry, I forgot that. If we don't need a new revision, please help
me remove it. Thanks.

> Cheers,
>
> > +    read_cntpct_enforce_ordering(cnt);
> > +
> > +    return cnt;
> > +}
> > +
> >  /* List of timer's IRQ */
> >  enum timer_ppi
> >  {
> >
>
> --
> Julien Grall
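For readers following the thread, the hazard being guarded against is the
classic seqlock time-read pattern the Linux commit message describes. Below
is a minimal sketch of such a reader; it is not part of the patch, and it
assumes Linux-style seqlock primitives (seqlock_t, read_seqbegin(),
read_seqretry()) plus a hypothetical clock_data structure:

    /*
     * Illustrative sketch only -- not part of the patch. The seqlock
     * primitives are assumed Linux-style; the structure and function
     * names are hypothetical, and the mult/shift conversion is
     * simplified.
     */
    struct clock_data {
        seqlock_t lock;
        uint32_t mult;      /* cycles -> ns multiplier */
        uint32_t shift;
    };

    static uint64_t read_time_ns(struct clock_data *cd)
    {
        unsigned int seq;
        uint64_t cycles, ns;

        do {
            seq = read_seqbegin(&cd->lock);
            /*
             * Without the dependency barrier in get_cycles(), the CPU
             * may speculate the counter read outside this critical
             * section and pair a stale counter value with an updated
             * mult/shift, making time appear to go backwards across
             * reads.
             */
            cycles = get_cycles();
            ns = (cycles * cd->mult) >> cd->shift;
        } while ( read_seqretry(&cd->lock, seq) );

        return ns;
    }

The eor/add/ldr sequence in the patch achieves the required ordering
cheaply: "eor %0, %1, %1" always yields zero but creates a data dependency
on the counter value, "add %0, sp, %0" turns that into a valid stack
address, and "ldr xzr, [%0]" performs a dummy load from it. Subsequent
memory barriers therefore order the counter read as if it were a memory
read, without paying the cost of a full ISB.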