[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/arm: smmuv3: Add cache maintenance for non-coherent SMMU queues



Hi Michal,

On 26/08/2025 12:44, Orzel, Michal wrote:


On 26/08/2025 12:48, Julien Grall wrote:
Hi,

On 26/08/2025 10:47, Dmytro Firsov wrote:
On 22.08.25 11:12, Orzel, Michal wrote:
On 06/08/2025 16:58, Dmytro Firsov wrote:


According to the Arm SMMUv3 spec (ARM IHI 0070), a system may have
SMMU(s) that is/are non-coherent to the PE (processing element). In such
cases, memory accesses from the PE should be either non-cached or be
augmented with manual cache maintenance. SMMU cache coherency is reported
by bit 4 (COHACC) of the SMMU_IDR0 register and is already present in the
Xen driver. However, the current implementation is not aware of cache
maintenance for memory that is shared between the PE and non-coherent
SMMUs. It contains dmam_alloc_coherent() function, that is added during
Linux driver porting. But it is actually a wrapper for _xzalloc(), that
returns normal writeback memory (which is OK for coherent SMMUs).

During Xen bring-up on a system with non-coherent SMMUs, the driver did
not work properly - the SMMU was not functional and halted initialization
at the very beginning due to a timeout while waiting for CMD_SYNC
completion:

    (XEN) SMMUv3: /soc/iommu@fa000000: CMD_SYNC timeout
    (XEN) SMMUv3: /soc/iommu@fa000000: CMD_SYNC timeout

To properly handle such scenarios, add the non_coherent flag to the
arm_smmu_queue struct. It is initialized using features reported by the
SMMU HW and will be used for triggering cache clean/invalidate operations.
This flag is not queue-specific (it is applicable to the whole SMMU), but
adding it to arm_smmu_queue allows us to not change function signatures
and simplify the patch (smmu->features, which contains the required flag,
are not available in code parts that require cache maintenance).


There are already a few places advertising the SMMU coherency:
1) smmu->features
2) d->iommu->features
3) platform_features

All of them are better places than queue struct (that as you pointed out is not
specific to coherency). I'd suggest maybe to use 3) and removing ro_after_init
if you don't have access to 1) and 2). All in all, providing yet another place
for coherency flag seems a bit too much.
  > >
First of all, thank you very much for review! I will consider using
"platform_features" in next patch version, it looks more appropriate and
should be available within the whole driver. Also, I believe "ro_after_init"
is also OK, since I have no need to change it (only check if cache
maintenance should be performed).

I have to disagree with using "platform_features". Flushing the queue is
a per-SMMU decision. But looking at the code, I think passing the SMMU
to the caller would look wrong (what if you mistakenly pass the wrong
SMMU?). So I think a boolean per queue is the right appraoch.
For my own understanding: don't we consider SMMU coherency globally, not per
SMMU (hence why I suggested re-using the global flag)?
> We set the coherency> feature in the flags if all SMMUs support it. Do we want to diverge now and do
the flushing per SMMU?

The two contexts are quite different.

We need the global flag when cleaning the stage-2 page-tables because we don't always know at boot which SMMUs will be used for the devices attached (think PCI hotplug).

In the context of this patch, we know the queue is only used by a given SMMU. So i tis better to take this decision per-SMMU/queue to reduce the number of cache flush. This will be particularly important for the 2-stage SMMU work because there will be a lot of commands sent on behalf of the guest (every TLB flushes will be command).

Cheers,

--
Julien Grall




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.