Re: [Xen-devel] [PATCH RFC] arm64: 32-bit tolerant sync bitops
On Tue, Apr 22, 2014 at 9:56 PM, Vladimir Murzin <murzin.v@xxxxxxxxx> wrote:
> On Wed, Apr 23, 2014 at 09:31:29AM +0100, Ian Campbell wrote:
>> On Thu, 2014-04-17 at 11:42 +0100, David Vrabel wrote:
>> > On 17/04/14 09:38, Vladimir Murzin wrote:
>> > > Xen assumes that bit operations are able to operate on 32-bit size and
>> > > alignment [1]. For arm64 bitops are based on atomic exclusive load/store
>> > > instructions to guarantee that changes are made atomically. However,
>> > > these instructions require that address to be aligned to the data size.
>> > > Because, by default, bitops operates on 64-bit size it implies that
>> > > address should be aligned appropriately. All these lead to breakage of
>> > > Xen assumption for bitops properties.
>> > >
>> > > With this patch 32-bit sized/aligned bitops is implemented.
>> > >
>> > > [1] http://www.gossamer-threads.com/lists/xen/devel/325613
>> > >
>> > > Signed-off-by: Vladimir Murzin <murzin.v@xxxxxxxxx>
>> > > ---
>> > > Apart this patch other approaches were implemented:
>> > > 1. turn bitops to be 32-bit size/align tolerant.
>> > >    the changes are minimal, but I'm not sure how broad side effect might be
>> > > 2. separate 32-bit size/aligned operations.
>> > >    it exports new API, which might not be good
>> >
>> > I've never been particularly happy with the way the events_fifo.c uses
>> > casts for the sync_*_bit() calls and I think we should do option 2.
>> >
>> > A generic implementation could be something like:
>>
>> It seems like there isn't currently much call for/interest in a generic
>> 32-bit set of bitops. Since this is a regression on arm64 right now how
>> about we simply fix it up in the Xen layer now and if in the future a
>> common need is found we can rework to use it.
>>
>> We already have evtchn_fifo_{clear,set,is}_pending (although
>> evtchn_fifo_unmask and consume_one_event need fixing to use is_pending)
>> and evtchn_fifo_{test_and_set_,}mask, plus we would need
>> evtchn_fifo_set_mask for consume_one_event to use instead of open coding.
>>
>> With that then those helpers can become something like:
>>
>>   #define BM(w) (unsigned long *)(w & ~0x7)
>>   #define EVTCHN_FIFO_BIT(b, w) (BUG_ON(w&0x3), w & 0x4 ?
>>           EVTCHN_FIFO_#b + 32 : EVTCHN_FIFO_#b)
>>
>>   static void evtchn_fifo_clear_pending(unsigned port)
>>   {
>>           event_word_t *word = event_word_from_port(port);
>>           sync_clear_bit(EVTCHN_FIFO_BIT(PENDING, word), BM(word));
>>   }
>>
>> similarly for the others.
>>
>> If that is undesirable in the common code then wrap each existing helper
>> in #ifndef evtchn_fifo_clean_pending and put something like the above in
>> arch/arm64/include/asm/xen/events.h with a #define (and/or make the
>> helpers themselves macros, or #define HAVE_XEN_FIFO_EVTCHN_ACCESSORS etc etc)
>>
>> We still need your patch from
>> http://article.gmane.org/gmane.comp.emulators.xen.devel/196067 for the
>> other unaligned bitmap case. Note that this also removes the use of BM
>> for non-PENDING/MASKED bits, which is why it was safe for me to change
>> it above (also it would be safe to & ~0x7 after that change anyway).
>>
>> Ian.
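
To make the intent of that sketch concrete, here is a stand-alone, user-space
rendering of the BM()/EVTCHN_FIFO_BIT() idea. It is only one reading of Ian's
snippet, not code from events_fifo.c: the ## token pasting, the uintptr_t
casts and the little-endian/64-bit-long assumptions are additions, and the
EVTCHN_FIFO_PENDING value is assumed from the FIFO ABI header. Ian's version
also BUG_ON()s a word that is not 4-byte aligned; that check is omitted here.

/*
 * Illustrative only: round the 32-bit event word's address down to an
 * 8-byte boundary and bump the bit number by 32 when the word is the
 * upper half, so sync_*_bit() always sees a naturally aligned long.
 * Assumes little-endian and 64-bit unsigned long, as on arm64 Linux.
 */
#include <assert.h>
#include <stdint.h>

#define EVTCHN_FIFO_PENDING 31          /* assumed bit position from the FIFO ABI */

typedef uint32_t event_word_t;

#define BM(w)   ((unsigned long *)((uintptr_t)(w) & ~(uintptr_t)0x7))
#define EVTCHN_FIFO_BIT(b, w) \
        (((uintptr_t)(w) & 0x4) ? EVTCHN_FIFO_##b + 32 : EVTCHN_FIFO_##b)

int main(void)
{
        union {
                uint64_t dword;
                event_word_t w[2];      /* w[1] is 4-byte but not 8-byte aligned */
        } q = { 0 };

        /* Setting PENDING in the odd word lands in bit 32 + 31 of the aligned dword. */
        *BM(&q.w[1]) |= 1UL << EVTCHN_FIFO_BIT(PENDING, &q.w[1]);

        assert(q.w[1] == 1u << EVTCHN_FIFO_PENDING);
        assert(q.w[0] == 0);
        return 0;
}

The point of the & ~0x7 rounding is that the sync bitop then always operates
on a naturally aligned unsigned long, which is what the arm64 exclusive
load/store instructions require.
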
> Might be dealing with it on Xen side is only way unless arm64 folks say what
> they think... because reports related to unaligned access still appear I'll
> put the minimally tested patch for making arm64 bitops 32-bits friendly ;)
>
> ---
>  arch/arm64/lib/bitops.S | 20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/lib/bitops.S b/arch/arm64/lib/bitops.S
> index 7dac371..7c3b058 100644
> --- a/arch/arm64/lib/bitops.S
> +++ b/arch/arm64/lib/bitops.S
> @@ -26,14 +26,14 @@
>   */
>  .macro bitop, name, instr
>  ENTRY( \name )
> -     and     w3, w0, #63             // Get bit offset
> +     and     w3, w0, #31             // Get bit offset
>       eor     w0, w0, w3              // Clear low bits
>       mov     x2, #1
> -     add     x1, x1, x0, lsr #3      // Get word offset
> +     add     x1, x1, x0, lsr #2      // Get word offset
>       lsl     x3, x2, x3              // Create mask
> -1:   ldxr    x2, [x1]
> -     \instr  x2, x2, x3
> -     stxr    w0, x2, [x1]
> +1:   ldxr    w2, [x1]
> +     \instr  w2, w2, w3
> +     stxr    w0, w2, [x1]
>       cbnz    w0, 1b
>       ret
>  ENDPROC(\name )
> @@ -41,15 +41,15 @@ ENDPROC(\name )
>
>  .macro testop, name, instr
>  ENTRY( \name )
> -     and     w3, w0, #63             // Get bit offset
> +     and     w3, w0, #31             // Get bit offset
>       eor     w0, w0, w3              // Clear low bits
>       mov     x2, #1
> -     add     x1, x1, x0, lsr #3      // Get word offset
> +     add     x1, x1, x0, lsr #2      // Get word offset
>       lsl     x4, x2, x3              // Create mask
> -1:   ldxr    x2, [x1]
> +1:   ldxr    w2, [x1]
>       lsr     x0, x2, x3              // Save old value of bit
> -     \instr  x2, x2, x4              // toggle bit
> -     stlxr   w5, x2, [x1]
> +     \instr  w2, w2, w4              // toggle bit
> +     stlxr   w5, w2, [x1]
>       cbnz    w5, 1b
>       dmb     ish
>       and     x0, x0, #1
> --
> 1.8.3
>

After revisiting the arm32 implementation I think there is a bug in the
proposed changes.. :(

The changes like

-     add     x1, x1, x0, lsr #3      // Get word offset
+     add     x1, x1, x0, lsr #2      // Get word offset

are not necessary. The arm32 implementation does

      mov     r0, r0, lsr #5
      add     r1, r1, r0, lsl #2      @ Get word offset

to calculate the word offset, which is equivalent to "x0, lsr #3" here, since
the low bits of x0 have already been cleared.

I'll send v2 if someone provides feedback for that ;)
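
Since the disagreement is purely about that offset arithmetic, a small
self-contained check may help. This is only an illustration put together for
this note, not anything from the kernel tree: it confirms that once the low
five bits of the bit number have been cleared (the eor w0, w0, w3 step), a
right shift by 3 produces the same byte offset as arm32's lsr #5 / lsl #2
pair, whereas a shift by 2 would land twice as far into the bitmap.

#include <assert.h>

int main(void)
{
        for (unsigned int nr = 0; nr < (1u << 16); nr++) {
                unsigned int cleared  = nr & ~31u;           /* low 5 bits gone, as after the eor */
                unsigned int off_lsr3 = cleared >> 3;        /* arm64: keep "lsr #3" */
                unsigned int off_arm32 = (nr >> 5) << 2;     /* arm32: "lsr #5" then "lsl #2" */

                assert(off_lsr3 == off_arm32);               /* same byte offset */
                assert(off_lsr3 % 4 == 0);                   /* always 32-bit aligned */
                assert((cleared >> 2) == 2 * off_lsr3);      /* "lsr #2" would be twice too far */
        }
        return 0;
}
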