[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] xen/arm: smmuv3: Add cache maintenance for non-coherent SMMU queues
Hi Michal, On 22.08.25 11:12, Orzel, Michal wrote:
First of all, thank you very much for review! I will consider usingOn 06/08/2025 16:58, Dmytro Firsov wrote:According to the Arm SMMUv3 spec (ARM IHI 0070), a system may have SMMU(s) that is/are non-coherent to the PE (processing element). In such cases, memory accesses from the PE should be either non-cached or be augmented with manual cache maintenance. SMMU cache coherency is reported by bit 4 (COHACC) of the SMMU_IDR0 register and is already present in the Xen driver. However, the current implementation is not aware of cache maintenance for memory that is shared between the PE and non-coherent SMMUs. It contains dmam_alloc_coherent() function, that is added during Linux driver porting. But it is actually a wrapper for _xzalloc(), that returns normal writeback memory (which is OK for coherent SMMUs). During Xen bring-up on a system with non-coherent SMMUs, the driver did not work properly - the SMMU was not functional and halted initialization at the very beginning due to a timeout while waiting for CMD_SYNC completion: (XEN) SMMUv3: /soc/iommu@fa000000: CMD_SYNC timeout (XEN) SMMUv3: /soc/iommu@fa000000: CMD_SYNC timeout To properly handle such scenarios, add the non_coherent flag to the arm_smmu_queue struct. It is initialized using features reported by the SMMU HW and will be used for triggering cache clean/invalidate operations. This flag is not queue-specific (it is applicable to the whole SMMU), but adding it to arm_smmu_queue allows us to not change function signatures and simplify the patch (smmu->features, which contains the required flag, are not available in code parts that require cache maintenance).There are already a few places advertising the SMMU coherency: 1) smmu->features 2) d->iommu->features 3) platform_features All of them are better places than queue struct (that as you pointed out is not specific to coherency). I'd suggest maybe to use 3) and removing ro_after_init if you don't have access to 1) and 2). All in all, providing yet another place for coherency flag seems a bit too much. "platform_features" in next patch version, it looks more appropriate and should be available within the whole driver. Also, I believe "ro_after_init" is also OK, since I have no need to change it (only check if cache maintenance should be performed). Yes, using "platform_features" for the coherency check willSigned-off-by: Dmytro Firsov <dmytro_firsov@xxxxxxxx> --- xen/drivers/passthrough/arm/smmu-v3.c | 27 +++++++++++++++++++++++---- xen/drivers/passthrough/arm/smmu-v3.h | 7 +++++++ 2 files changed, 30 insertions(+), 4 deletions(-) diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c index 5e9e3e048e..bf153227db 100644 --- a/xen/drivers/passthrough/arm/smmu-v3.c +++ b/xen/drivers/passthrough/arm/smmu-v3.c @@ -346,10 +346,14 @@ static void queue_write(__le64 *dst, u64 *src, size_t n_dwords) static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent) { + __le64 *q_addr = Q_ENT(q, q->llq.prod); + if (queue_full(&q->llq)) return -ENOSPC; - queue_write(Q_ENT(q, q->llq.prod), ent, q->ent_dwords); + queue_write(q_addr, ent, q->ent_dwords); + if (q->non_coherent) + clean_dcache_va_range(q_addr, q->ent_dwords * sizeof(*q_addr));I think it would be better to move the cache operation to queue_{write,read} to avoid having to repeat them at each occurence of the helpers. significantly reduce the patch size, I will rework this in V2. Unfortunately, we are currently experiencing difficulties with the test board on which we originally detected this issue. Thus, the V2 testing and publication may take some time. ~Michal
Best regards, Dmytro.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |