[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 5/8] pci/arm: Use iommu_add_dt_pci_device()


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>
  • Date: Fri, 12 May 2023 17:03:18 -0400
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=suse.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=9VPlWYf129wYlKBXI6XeX6da7R6uiXrDylbJaC8/Pr8=; b=oK5x7vyZmY/PHJgtkNLjg5htKNymCCYAKamGK0a5HqMgYiGGB/HitRJLXcmrhOYrGQQITpBe1rSAkiBWx0bJy5SY9v5HrCP3Pf8unF6BCP9wlo7RxYv9SlOt9vKpDb7YOjGDzqwyvArIDfIgbNo2tZHhvMYoSDZgym43jz4ynvUQ6xDX3+nJj3H6B+agAIzAPLAR3vomcgbW1jlRamP50/h9yHh/OsHCsRXyBsLOR9QvCO/KJCSibd+49TpO7ZTntBpr3NE1lYav1vqOJj9P9ov2MFO7QYbyBW0vkLqypH/AmM6saKm4kio/RNOk9JrgoWqkqVLA2Qn/khUo4vRuSg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=GIm/dIKuvGLy5M25MK8OtQa8p7bSDT/RAIQnHWKDy1v8C6H+m9yUEIyHBptGGcskVdfbs3lPe6UQ1KpR+ySRco7ZNlXGnufYnsSrvmlaJ81ri+EgbDUa7K5o0W5wIYcYSP9+gTHNDhgww1QD3AHbI/XPeUN6Rc9pzfE+tdH6GvuITr3XJjy2wwgTNm4auz6Qetzv6dRJoq83clCbzlIfC7NgYGlXJWw/mRyDJw/LZR2ULXtIk3ifwRfv9ap3z65Fjt4wk5VCuQaBGgW1LfOCwnBCvlhYIHAtQJQhFMBXdAR9/8/WsKjKVwgwhi5/9z7+0TNCPobmF3YrZ4nlbLG7Gg==
  • Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>, "Stefano Stabellini" <sstabellini@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 12 May 2023 21:03:53 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 5/12/23 03:25, Jan Beulich wrote:
> On 11.05.2023 21:16, Stewart Hildebrand wrote:
>> @@ -762,9 +767,20 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
>>              pdev->domain = NULL;
>>              goto out;
>>          }
>> +#ifdef CONFIG_HAS_DEVICE_TREE
>> +        ret = iommu_add_dt_pci_device(pdev);
>> +        if ( ret < 0 )
>> +        {
>> +            printk(XENLOG_ERR "pci-iommu translation failed: %d\n", ret);
>> +            goto out;
>> +        }
>> +#endif
>>          ret = iommu_add_device(pdev);
> 
> Hmm, am I misremembering that in the earlier patch you had #else to
> invoke the alternative behavior?

You are remembering correctly. v1 had an #else, v2 does not.

> Now you end up calling both functions;
> if that's indeed intended,

Yes, this is intentional.

> this may still want doing differently.
> Looking at the earlier patch introducing the function, I can't infer
> though whether that's intended: iommu_add_dt_pci_device() checks that
> the add_device hook is present, but then I didn't find any use of this
> hook. The revlog there suggests the check might be stale.

Ah, right, the ops->add_device check is stale in the other patch. Good catch, 
I'll remove it there.

> If indeed the function does only preparatory work, I don't see why it
> would need naming "iommu_..."; I'd rather consider pci_add_dt_device()
> then.

The function has now been reduced to reading SMMU configuration data from DT 
and mapping RID/BDF -> AXI stream ID. However, it is still SMMU related, and it 
is still invoking another iommu_ops hook function, dt_xlate (which is yet 
another AXI stream ID translation, separate from what is being discussed here). 
Does this justify keeping "iommu_..." in the name? I'm not convinced 
pci_add_dt_device() is a good name for it either (more on this below).

> Plus in such a case #ifdef-ary here probably wants avoiding by
> introducing a suitable no-op stub for the !HAS_DEVICE_TREE case. Then
> ...
> 
>>          if ( ret )
>>          {
>> +#ifdef CONFIG_HAS_DEVICE_TREE
>> +            iommu_fwspec_free(pci_to_dev(pdev));
>> +#endif
> 
> ... this (which I understand is doing the corresponding cleanup) then
> also wants wrapping in a suitably named tiny helper function.

Sure, I'm on board with eliminating/reducing the #ifdef-ary where possible. 
Will do.

> But yet further I'm then no longer convinced this is the right place
> for the addition. pci_add_device() is backing physdev hypercalls. It
> would seem to me that the function may want invoking yet one layer
> further up, or it may even want invoking from a brand new DT-specific
> physdev-op. This would then leave at least the x86-only paths (invoking
> pci_add_device() from outside of pci_physdev_op()) entirely alone.

Let's establish that pci_add_device()/iommu_add_device() are already inherently 
performing tasks related to setting up a PCI device to work with an IOMMU.

The preparatory work in question needs to happen after:

  pci_add_device()
    -> alloc_pdev()

since we need to know all the possible RIDs (including those for phantom 
functions), but before the add_device iommu hook:

  pci_add_device()
    -> iommu_add_device()
      -> iommu_call(hd->platform_ops, add_device, ...)


The preparatory work (i.e. mapping RID/BDF -> AXI stream ID) is inherently 
associated with setting up a PCI device to work with an ARM SMMU (but not any 
particular variant of the SMMU). The SMMU distinguishes what PCI 
device/function DMA traffic is associated with based on the derived AXI stream 
ID (sideband data), not the RID/BDF directly. See [1].

Moving the preparatory work one layer up would mean duplicating what 
alloc_pdev() is already doing to set up pdev->phantom_stride (which we need to 
figure out all RIDs for that particular device). Moving it down into the 
individual SMMU drivers (smmu_ops/platform_ops) would mean duplicating special 
phantom function handling in each SMMU driver, and further deviating from the 
Linux SMMU driver(s) they are based on.

It still feels to me like pci_add_device() (or iommu_add_device()) is the right 
place to perform the RID/BDF -> AXI stream ID mapping.

Since there's nothing inherently DT specific (or ACPI specific) about deriving 
sideband data from RID/BDF, let me propose a new name for the function (instead 
of iommu_add_dt_pci_device):

  iommu_derive_pci_device_sideband_IDs()


Now, as far as DT and ACPI co-existing goes, I admit I haven't tested with 
CONFIG_ACPI=y yet (there seems to be some issues when both CONFIG_ARM_SMMU_V3=y 
and CONFIG_ACPI=y are enabled, even in staging). But I do recognize that we 
need a way support both CONFIG_HAS_DEVICE_TREE=y and CONFIG_ACPI=y 
simultaneously. Let me think on that for a bit...

[1] 
https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/pci-iommu.txt



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.