|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] xen/arm: smmuv1: remove iommu group when deassign a device
Hi ,
> On 28 Jun 2022, at 6:23 pm, Mykyta Poturai <mykyta.poturai@xxxxxxxxx> wrote:
>
>> Hi Mykyta,
>>
>>> On 21 Jun 2022, at 10:38 am, Mykyta Poturai <mykyta.poturai@xxxxxxxxx>
>>> wrote:
>>>
>>>> Thanks for testing the patch.
>>>>> But not fixed the "Unexpected Global fault" that occasionally happens
>>>>> when destroying
>>>>> the domain with an actively working GPU. Although, I am not sure if this
>>>>> issue
>>>>> is relevant here.
>>>>
>>>> Can you please if possible share the more details and logs so that I can
>>>> look if this issue is relevant here ?
>>>
>>> So in my setup I have a board with IMX8 chip and 2 core Vivante GPU. GPU is
>>> split between domains.
>>> One core goes to Dom0 and one to DomU.
>>>
>>> Steps to trigger this issue:
>>> 1. Start DomU
>>> 2. Start wayland and glmark2-es2-wayland inside DomU
>>> 3. Destroy DomU
>>>
>>> Sometimes it destroys fine but roughly 1 of 8 times I get logs like this:
>>>
>>> root@dom0:~# xl dest DomU
>>> [12725.412940] xenbr0: port 1(vif8.0) entered disabled state
>>> [12725.671033] xenbr0: port 1(vif8.0) entered disabled state
>>> [12725.689923] device vif8.0 left promiscuous mode
>>> [12725.696736] xenbr0: port 1(vif8.0) entered disabled state
>>> [12725.696989] audit: type=1700 audit(1616594240.068:39): dev=vif8.0 prom=0
>>> old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295
>>> (XEN) smmu: /iommu@51400000: Unexpected global fault, this could be serious
>>> (XEN) smmu: /iommu@51400000: GFSR 0x00000001, GFSYNR0 0x00000004,
>>> GFSYNR1 0x00001055, GFSYNR2 0x00000000
>>>
>>> My guess is that this happens because GPU continues to access memory when
>>> the context is already invalidated,
>>> and therefore triggers the "Invalid context fault".
>>
>> Yes you are right in this case GPU trying to do DMA operation after Xen
>> destroyed the guest and configures
>> the S2CR type value to fault. Solution to this issue is the patch that I
>> shared earlier.
>>
>> You can try this patch and confirm.This patch will solve both the issues.
>>
>> diff --git a/xen/drivers/passthrough/arm/smmu.c
>> b/xen/drivers/passthrough/arm/smmu.c
>> index 5cacb2dd99..ff1b73d3d8 100644
>> --- a/xen/drivers/passthrough/arm/smmu.c
>> +++ b/xen/drivers/passthrough/arm/smmu.c
>> @@ -1680,6 +1680,10 @@ static int arm_smmu_attach_dev(struct iommu_domain
>> *domain, struct device *dev)
>> if (!cfg)
>> return -ENODEV;
>>
>> + ret = arm_smmu_master_alloc_smes(dev);
>> + if (ret)
>> + return ret;
>> +
>> return arm_smmu_domain_add_master(smmu_domain, cfg);
>> }
>>
>> @@ -2075,7 +2079,7 @@ static int arm_smmu_add_device(struct device *dev)
>> iommu_group_add_device(group, dev);
>> iommu_group_put(group);
>>
>> - return arm_smmu_master_alloc_smes(dev);
>> + return 0;
>> }
>>
>>
>> Regards,
>> Rahul
>
> Hi Rahul,
>
> With this patch I get the same results, here is the error message:
>
> (XEN) smmu: /iommu@51400000: Unexpected global fault, this could be serious
> (XEN) smmu: /iommu@51400000: GFSR 0x00000001, GFSYNR0 0x00000004, GFSYNR1
> 0x00001055, GFSYNR2 0x00000000
>
As you mentioned earlier, this fault is observed because GPU continues to
access memory when the context is
already invalidated, and therefore triggers the "Invalid context fault". This
is a different issue and is not related to this patch.
@Julien are you okay if I will send the below patch for official review as this
issue pending for a long time?
In this patch we don’t need to call arm_smmu_master_free_smes() in
arm_smmu_detach_dev() but we will implement new
function arm_smmu_domain_remove_master() that will configure the s2cr value to
type fault in detach function.
arm_smmu_domain_remove_master() will revert the changes done by
arm_smmu_domain_add_master() in attach function.
diff --git a/xen/drivers/passthrough/arm/smmu.c
b/xen/drivers/passthrough/arm/smmu.c
index 69511683b4..da3adf8e7f 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -1598,21 +1598,6 @@ out_err:
return ret;
}
-static void arm_smmu_master_free_smes(struct arm_smmu_master_cfg *cfg)
-{
- struct arm_smmu_device *smmu = cfg->smmu;
- int i, idx;
- struct iommu_fwspec *fwspec = arm_smmu_get_fwspec(cfg);
-
- spin_lock(&smmu->stream_map_lock);
- for_each_cfg_sme(cfg, i, idx, fwspec->num_ids) {
- if (arm_smmu_free_sme(smmu, idx))
- arm_smmu_write_sme(smmu, idx);
- cfg->smendx[i] = INVALID_SMENDX;
- }
- spin_unlock(&smmu->stream_map_lock);
-}
-
static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
struct arm_smmu_master_cfg *cfg)
{
@@ -1635,6 +1620,20 @@ static int arm_smmu_domain_add_master(struct
arm_smmu_domain *smmu_domain,
return 0;
}
+static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_master_cfg *cfg)
+{
+ struct arm_smmu_device *smmu = smmu_domain->smmu;
+ struct arm_smmu_s2cr *s2cr = smmu->s2crs;
+ struct iommu_fwspec *fwspec = arm_smmu_get_fwspec(cfg);
+ int i, idx;
+
+ for_each_cfg_sme(cfg, i, idx, fwspec->num_ids) {
+ s2cr[idx] = s2cr_init_val;
+ arm_smmu_write_s2cr(smmu, idx);
+ }
+}
+
static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
{
int ret;
@@ -1684,10 +1683,11 @@ static int arm_smmu_attach_dev(struct iommu_domain
*domain, struct device *dev)
static void arm_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
{
+ struct arm_smmu_domain *smmu_domain = domain->priv;
struct arm_smmu_master_cfg *cfg = find_smmu_master_cfg(dev);
if (cfg)
- arm_smmu_master_free_smes(cfg);
+ return arm_smmu_domain_remove_master(smmu_domain, cfg);
}
Regards,
Rahul
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |