
Re: [PATCH v2] xen/arm: smmuv3: Add cache maintenance for non-coherent SMMU queues


  • To: Dmytro Firsov <dmytro_firsov@xxxxxxxx>
  • From: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
  • Date: Thu, 4 Sep 2025 06:47:52 +0000
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Rahul Singh <Rahul.Singh@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Julien Grall <jgrall@xxxxxxxxxx>, Mykola Kvach <Mykola_Kvach@xxxxxxxx>
  • Delivery-date: Thu, 04 Sep 2025 06:48:37 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Dmytro,

> On 3 Sep 2025, at 20:40, Dmytro Firsov <dmytro_firsov@xxxxxxxx> wrote:
> 
> According to the Arm SMMUv3 spec (ARM IHI 0070), a system may have
> SMMU(s) that is/are non-coherent to the PE (processing element). In such
> cases, memory accesses from the PE should be either non-cached or be
> augmented with manual cache maintenance. SMMU cache coherency is reported
> by bit 4 (COHACC) of the SMMU_IDR0 register, and reading this bit is
> already present in the Xen driver. However, the current implementation
> performs no cache maintenance on memory shared between the PE and
> non-coherent SMMUs: it contains a dmam_alloc_coherent() function, added
> during the Linux driver port, which is actually a wrapper around
> _xzalloc() and returns normal write-back memory (fine only for coherent
> SMMUs).
> 
> During Xen bring-up on a system with non-coherent SMMUs, the driver did
> not work properly - the SMMU was not functional and halted initialization
> at the very beginning due to a timeout while waiting for CMD_SYNC
> completion:
> 
>  (XEN) SMMUv3: /soc/iommu@fa000000: CMD_SYNC timeout
>  (XEN) SMMUv3: /soc/iommu@fa000000: CMD_SYNC timeout
> 
> To properly handle such scenarios, add the non_coherent flag to the
> arm_smmu_queue struct. It is initialized using features reported by the
> SMMU HW and will be used for triggering cache clean/invalidate operations.
> This flag is not queue-specific (it applies to the whole SMMU), but
> keeping it in arm_smmu_queue avoids changing function signatures and
> simplifies the patch (smmu->features, which contains the required flag,
> is not available in the code paths that perform cache maintenance).
> 
> Signed-off-by: Dmytro Firsov <dmytro_firsov@xxxxxxxx>
> Reviewed-by: Julien Grall <jgrall@xxxxxxxxxx>
> Tested-by: Mykola Kvach <mykola_kvach@xxxxxxxx>

Acked-by: Bertrand Marquis <bertrand.marquis@xxxxxxx>

Cheers
Bertrand

> ---
> v2:
> - changed comment for non_coherent struct member
> - added Julien's RB
> - added Mykola's TB
> ---
> xen/drivers/passthrough/arm/smmu-v3.c | 27 +++++++++++++++++++++++----
> xen/drivers/passthrough/arm/smmu-v3.h |  3 +++
> 2 files changed, 26 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c
> index bca5866b35..c65c47c038 100644
> --- a/xen/drivers/passthrough/arm/smmu-v3.c
> +++ b/xen/drivers/passthrough/arm/smmu-v3.c
> @@ -341,10 +341,14 @@ static void queue_write(__le64 *dst, u64 *src, size_t n_dwords)
> 
>  static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent)
>  {
> +    __le64 *q_addr = Q_ENT(q, q->llq.prod);
> +
>      if (queue_full(&q->llq))
>          return -ENOSPC;
> 
> -    queue_write(Q_ENT(q, q->llq.prod), ent, q->ent_dwords);
> +    queue_write(q_addr, ent, q->ent_dwords);
> +    if (q->non_coherent)
> +        clean_dcache_va_range(q_addr, q->ent_dwords * sizeof(*q_addr));
>      queue_inc_prod(&q->llq);
>      queue_sync_prod_out(q);
>      return 0;
> @@ -360,10 +364,15 @@ static void queue_read(u64 *dst, __le64 *src, size_t n_dwords)
> 
>  static int queue_remove_raw(struct arm_smmu_queue *q, u64 *ent)
>  {
> +    __le64 *q_addr = Q_ENT(q, q->llq.cons);
> +
>      if (queue_empty(&q->llq))
>          return -EAGAIN;
> 
> -    queue_read(ent, Q_ENT(q, q->llq.cons), q->ent_dwords);
> +    if (q->non_coherent)
> +        invalidate_dcache_va_range(q_addr, q->ent_dwords * sizeof(*q_addr));
> +
> +    queue_read(ent, q_addr, q->ent_dwords);
>      queue_inc_cons(&q->llq);
>      queue_sync_cons_out(q);
>      return 0;
> @@ -458,6 +467,7 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
>      struct arm_smmu_queue *q = &smmu->cmdq.q;
>      u32 cons = readl_relaxed(q->cons_reg);
>      u32 idx = FIELD_GET(CMDQ_CONS_ERR, cons);
> +    __le64 *q_addr = Q_ENT(q, cons);
>      struct arm_smmu_cmdq_ent cmd_sync = {
>          .opcode = CMDQ_OP_CMD_SYNC,
>      };
> @@ -484,11 +494,14 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
>          break;
>      }
> 
> +    if (q->non_coherent)
> +        invalidate_dcache_va_range(q_addr, q->ent_dwords * sizeof(*q_addr));
> +
>      /*
>       * We may have concurrent producers, so we need to be careful
>       * not to touch any of the shadow cmdq state.
>       */
> -    queue_read(cmd, Q_ENT(q, cons), q->ent_dwords);
> +    queue_read(cmd, q_addr, q->ent_dwords);
>      dev_err(smmu->dev, "skipping command in error state:\n");
>      for (i = 0; i < ARRAY_SIZE(cmd); ++i)
>          dev_err(smmu->dev, "\t0x%016llx\n", (unsigned long long)cmd[i]);
> @@ -499,7 +512,10 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
>          return;
>      }
> 
> -    queue_write(Q_ENT(q, cons), cmd, q->ent_dwords);
> +    queue_write(q_addr, cmd, q->ent_dwords);
> +
> +    if (q->non_coherent)
> +        clean_dcache_va_range(q_addr, q->ent_dwords * sizeof(*q_addr));
>  }
> 
>  static void arm_smmu_cmdq_insert_cmd(struct arm_smmu_device *smmu, u64 *cmd)
> @@ -1587,6 +1603,9 @@ static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
>      q->q_base |= FIELD_PREP(Q_BASE_LOG2SIZE, q->llq.max_n_shift);
> 
>      q->llq.prod = q->llq.cons = 0;
> +
> +    q->non_coherent = !(smmu->features & ARM_SMMU_FEAT_COHERENCY);
> +
>      return 0;
>  }
> 
> diff --git a/xen/drivers/passthrough/arm/smmu-v3.h b/xen/drivers/passthrough/arm/smmu-v3.h
> index f09048812c..ab07366294 100644
> --- a/xen/drivers/passthrough/arm/smmu-v3.h
> +++ b/xen/drivers/passthrough/arm/smmu-v3.h
> @@ -522,6 +522,9 @@ struct arm_smmu_queue {
> 
>      u32 __iomem *prod_reg;
>      u32 __iomem *cons_reg;
> +
> +    /* Is the memory access non-coherent? */
> +    bool non_coherent;
>  };
> 
>  struct arm_smmu_cmdq {
> -- 
> 2.50.1
