Xen project Mailing List

Re: [PATCH v7 3/5] lib/arm: Add I/O memory copy helpers

From: Oleksii Moisieiev <Oleksii_Moisieiev@xxxxxxxx>

Date: Thu, 15 Jan 2026 15:42:25 +0000

Accept-language: en-US

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=nBBnCnh2Fafd9vf3Iw4VPpFY5LszCz06xBd2Kkjfj5M=; b=dd/fle+8yY+gdUI13yJG1JUd0lmqUgjNal9HH5DLdbq3UZKXSLAoQXE4K5PRHdgOcAcVqDRPw/0lzzuhacrIlHpJ82i2hXAaMwnuEK3k7VGmhqudJdpd58ifHaL6VZBcpOzVphp0rKXmD8GycIG4VWHiHzK2VcwCW6Tewa53Y7+/e61u7WtMM/IpWQORhbTlpMw/WqJO7mwLYHs4Q40XRrTYElP255Qt2VuiN+Rkpx+z1+j21OEOWzk6/HaJvcfnyw+TLvF+VpjrCvjI41MKJxUvE1S9OxqRvBEtCgAGcQIawbvycqw2Ns+Kpo88Vqu9SCKSsU+3eKfUEvmYxssubA==

Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=PPjENYKW+cbYfY3HCgb6RHtxGx5dex45nPSaDr9hGO7xqUbaS7QwwiLUJ0e3VuEvH8dlkhEqAuNrnjBNHh+XqdX/RuxiI3DTNn95dtT1OIjpZx2nSEMDN+tYxd+8QHbHTvoPbAqw+CISDyL8P7zFB/DyV/qx5ivl89oFwSQtbGVrB+ZiNJMc7YJgN7ORh1stAXZFPAxXIwhmqFAOzUNpGOf2xMh4dhAuJy7m3fsFoEyTRv0ro4n/Y22MhOHUpJiWIjbGvKzm09cZCKkMqdqbkc9gRcf0IFWJkG12j+qOa84DBGrnriW3qtAaYxaA8hzeiOO4bzOPcAplDyk7U23r5Q==

Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=epam.com;

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Grygorii Strashko <grygorii_strashko@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Thu, 15 Jan 2026 15:42:35 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thread-index: AQHchYPD+fwQGyuKXEKqeHI8xFGbqLVS9u+AgAAq8ACAADvNAIAAAX0AgAAA5QA=

Thread-topic: [PATCH v7 3/5] lib/arm: Add I/O memory copy helpers

On 15/01/2026 17:39, Jan Beulich wrote: > On 15.01.2026 16:34, Oleksii Moisieiev wrote: >> On 15/01/2026 13:59, Jan Beulich wrote: >>> On 15.01.2026 10:26, Jan Beulich wrote: >>>> On 14.01.2026 19:29, Oleksii Moisieiev wrote: >>>>> --- /dev/null >>>>> +++ b/xen/lib/arm/memcpy_fromio.c >>>>> @@ -0,0 +1,48 @@ >>>>> +/* SPDX-License-Identifier: GPL-2.0-only */ >>>>> +#include <asm/io.h> >>>>> +#include <xen/lib/io.h> >>>>> + >>>>> +/* >>>>> + * Use 32-bit raw IO operations for portability across ARM32/ARM64 where >>>>> + * 64-bit accessors may not be atomic and some devices only support >>>>> 32-bit >>>>> + * aligned accesses. >>>>> + */ >>>>> + >>>>> +void memcpy_fromio(void *to, const volatile void __iomem *from, >>>>> + size_t count) >>>>> +{ >>>>> + while ( count && (!IS_ALIGNED((unsigned long)from, 4) || >>>>> + !IS_ALIGNED((unsigned long)to, 4)) ) >>>> Nit: Xen style indentation (no hard tabs) please throughout. >>>> >>>>> + { >>>>> + *(uint8_t *)to = __raw_readb(from); >>>>> + from++; >>>>> + to++; >>>>> + count--; >>>>> + } >>>>> + >>>>> + while ( count >= 4 ) >>>>> + { >>>>> + *(uint32_t *)to = __raw_readl(from); >>>>> + from += 4; >>>>> + to += 4; >>>>> + count -= 4; >>>>> + } >>>>> + >>>>> + while ( count ) >>>>> + { >>>>> + *(uint8_t *)to = __raw_readb(from); >>>>> + from++; >>>>> + to++; >>>>> + count--; >>>>> + } >>>>> +} >>>> Barrier requirements on Arm aren't quite clear to me here: Is it really >>>> correct >>>> to use __raw_read{b,w,l}() here, rather than read{b,w,l}()? If it was, >>>> wouldn't >>>> a barrier then be needed at the end of the function? >>> Thinking about it, as the order of MMIO accesses needs to be guaranteed >>> (unless you have extra information about the area's properties, like it >>> being a video frame buffer), I'm pretty sure now that read{b,w,l}() needs >>> using here. In fact the comment in the header says that it would handle >>> "Memory ordering and barriers" when it doesn't look as if it did. >>> >>> Note how Linux looks to have grown multiple flavors: Besides >>> memcpy_fromio() I can also spot at least fb_memcpy_fromio() and >>> sdio_memcpy_fromio(). >>> >>>> And then, if it was read{b,w,l}() that is to be used here, what about all >>>> of >>>> this would then still be Arm-specific? Hmm, I guess the IS_ALIGNED() on >>>> "to" is, >>>> but that's Arm32-specific, with Arm64 not needing it? Plus then it's again >>>> not >>>> exactly Arm-specific, but specific to all architectures where misaligned >>>> accesses may fault. >>> There's a bigger issue here, with access granularity (despite the header >>> claiming "Implement alignment handling for devices requiring specific >>> access sizes"). MMIO can behave in interesting ways. The header comment >>> says nothing as to restrictions, i.e. when these functions may not be >>> used. Yet consider a device registers of which must be accessed in 32-bit >>> chunks. As long as the other pointer is suitably aligned, all would be >>> fine. But you handle the case where it isn't, and hence that case then >>> also needs to function correctly. At the same time accesses to a devices >>> requiring 16- or 64bit granularity wouldn't work at all, which for >>> required 8-bit granularity it would again only work partly. >>> >>> How much of the above requires code adjustments and how much should be >>> dealt with by updating commentary I don't know, as I know nothing about >>> your particular use case, nor about possible future ones. >>> >>> Also note that the header comment still references the ..._relaxed() >>> functions, when then implementation doesn't use those anymore. >>> >>> Jan >> Hi Jan, >> >> Thank you very much for your valuable input and involvement. >> >> After further research and reconsidering all the points you raised, I >> have decided >> to switch to using the ordered MMIO accessors (readb/readl and >> writeb/writel). >> >> Here is my reasoning: >> >> The concern you mentioned was valid: the use of __raw_read*/__raw_write* >> in the >> helpers did not provide the ordering guarantees expected for MMIO >> copies, and the >> header still referenced *_relaxed, even though the implementation no >> longer used >> them. This could allow reordering around the copy and misrepresent the >> intended >> semantics. >> >> A few additional thoughts regarding your other questions: >> >> Is it still Arm-specific? >> Functionally, the logic could work on any architecture that supports >> 8/32-bit MMIO >> accesses and uses the same accessor semantics. However, this implementation >> relies on Arm’s read*/write* ordering guarantees, as well as the >> Arm-specific >> IS_ALIGNED check to avoid misaligned faults. Therefore, it remains >> Arm-specific >> as written; other architectures would need their own implementations if they >> have different alignment or granularity requirements. >> >> Ordered accessors vs. raw accessors + trailing barrier: >> I chose to use ordered accessors instead of raw accessors with a >> trailing barrier >> because readb/readl and writeb/writel already provide the required >> device ordering >> and barrier semantics, making an additional barrier unnecessary and >> solving potential >> ordering issues. >> >> Use for register access: >> These helpers are suitable for MMIO buffers, shared memory, and >> registers that >> tolerate 8-/32-bit accesses. They are not appropriate for registers that >> require 16- or 64-bit accesses, or where side effects depend on the exact >> access width. I'll update the header to document this limitation; >> drivers needing strict >> register semantics should continue to use readl/writel >> (or readw/writew/readq/writeq) directly, rather than memcpy_*io(). >> >> Summary: >> >> I have made the following changes to the helper functions: >> >> - switched to ordered accessors to address the ordering and barrier >> concerns. >> - updated the documentation to match the implementation and explicitly state >> the supported access sizes and granularity. >> >> If this approach sounds good to you, I will proceed with the fix and >> submit a new version. > At the first glance it looks okay, but please allow Arm folks to give their > opinion. And of course my final take will be available only once I see the > next version. > > Jan Sure. I will post the new version once I sort out all documentation question. Thank you again for deep review! Oleksii.

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.