|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3 2/2] x86/iommu: avoid MSI address and data writes if IRT index hasn't changed
On 10.03.2025 16:42, Roger Pau Monné wrote:
> On Mon, Mar 10, 2025 at 11:51:09AM +0100, Jan Beulich wrote:
>> On 10.03.2025 10:55, Roger Pau Monne wrote:
>>> Attempt to reduce the MSI entry writes, and the associated checking whether
>>> memory decoding and MSI-X is enabled for the PCI device, when the MSI data
>>> hasn't changed.
>>>
>>> When using Interrupt Remapping the MSI entry will contain an index into
>>> the remapping table, and it's in such remapping table where the MSI vector
>>> and destination CPU is stored. As such, when using interrupt remapping,
>>> changes to the interrupt affinity shouldn't result in changes to the MSI
>>> entry, and the MSI entry update can be avoided.
>>>
>>> Signal from the IOMMU update_ire_from_msi hook whether the MSI data or
>>> address fields have changed, and thus need writing to the device registers.
>>> Such signaling is done by returning 1 from the function. Otherwise
>>> returning 0 means no update of the MSI fields, and thus no write
>>> required.
>>>
>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>
>> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
>> with two purely cosmetic suggestions and an only loosely related question
>> below.
>>
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>> @@ -415,7 +415,9 @@ static int cf_check vmx_pi_update_irte(const struct
>>> vcpu *v,
>>>
>>> ASSERT_PDEV_LIST_IS_READ_LOCKED(msi_desc->dev->domain);
>>>
>>> - return iommu_update_ire_from_msi(msi_desc, &msi_desc->msg);
>>> + rc = iommu_update_ire_from_msi(msi_desc, &msi_desc->msg);
>>> +
>>> + return rc < 0 ? rc : 0;
>>
>> Only tangential here, but: Why does this function have a return type of
>> non-void, when neither caller cares?
>
> I'm afraid there's more wrong in vmx_pi_update_irte() that I've just
> spotted afterwards.
>
> vmx_pi_update_irte() passes to iommu_update_ire_from_msi() the
> msi_desc->msg field, but that field is supposed to always contain the
> non-translated MSI data, as you correctly pointed out in v1 it's
> consumed by dump_msi(). vmx_pi_update_irte() using msi_desc->msg to
> store the translated MSI effectively breaks dump_msi().
Oh, indeed - it violates what write_msi_msg() specifically checks by
an assertion.
> Also vmx_pi_update_irte() relies on the IRT index never changing, as
> otherwise it's missing any logic to update the MSI registers.
Isn't this a valid assumption here? msi_msg_to_remap_entry() will only
ever allocate a new index if none was previously allocated.
>>> --- a/xen/drivers/passthrough/amd/iommu_intr.c
>>> +++ b/xen/drivers/passthrough/amd/iommu_intr.c
>>> @@ -492,7 +492,7 @@ static int update_intremap_entry_from_msi_msg(
>>> get_ivrs_mappings(iommu->seg)[alias_id].intremap_table);
>>> }
>>>
>>> - return 0;
>>> + return !fresh ? 0 : 1;
>>> }
>>
>> Simply
>>
>> return fresh;
>>
>> ?
>>
>>> @@ -546,7 +546,7 @@ int cf_check amd_iommu_msi_msg_update_ire(
>>> rc = update_intremap_entry_from_msi_msg(iommu, bdf, nr,
>>> &msi_desc->remap_index,
>>> msg, &data);
>>> - if ( !rc )
>>> + if ( rc > 0 )
>>> {
>>> for ( i = 1; i < nr; ++i )
>>> msi_desc[i].remap_index = msi_desc->remap_index + i;
>>> --- a/xen/drivers/passthrough/vtd/intremap.c
>>> +++ b/xen/drivers/passthrough/vtd/intremap.c
>>> @@ -506,6 +506,7 @@ static int msi_msg_to_remap_entry(
>>> unsigned int index, i, nr = 1;
>>> unsigned long flags;
>>> const struct pi_desc *pi_desc = msi_desc->pi_desc;
>>> + bool alloc = false;
>>>
>>> if ( msi_desc->msi_attrib.type == PCI_CAP_ID_MSI )
>>> nr = msi_desc->msi.nvec;
>>> @@ -529,6 +530,7 @@ static int msi_msg_to_remap_entry(
>>> index = alloc_remap_entry(iommu, nr);
>>> for ( i = 0; i < nr; ++i )
>>> msi_desc[i].remap_index = index + i;
>>> + alloc = true;
>>> }
>>> else
>>> index = msi_desc->remap_index;
>>> @@ -601,7 +603,7 @@ static int msi_msg_to_remap_entry(
>>> unmap_vtd_domain_page(iremap_entries);
>>> spin_unlock_irqrestore(&iommu->intremap.lock, flags);
>>>
>>> - return 0;
>>> + return alloc ? 1 : 0;
>>> }
>>
>> Like above, simply
>>
>> return alloc;
>>
>> ?
>
> I wasn't sure whether this was overloading the boolean type and
> possibly breaking some MISRA rule. I can adjust.
What we are to do with Misra's essential type system is entirely unclear at
this point, aiui.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |