[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3 2/2] x86/iommu: avoid MSI address and data writes if IRT index hasn't changed
On 10.03.2025 16:42, Roger Pau Monné wrote: > On Mon, Mar 10, 2025 at 11:51:09AM +0100, Jan Beulich wrote: >> On 10.03.2025 10:55, Roger Pau Monne wrote: >>> Attempt to reduce the MSI entry writes, and the associated checking whether >>> memory decoding and MSI-X is enabled for the PCI device, when the MSI data >>> hasn't changed. >>> >>> When using Interrupt Remapping the MSI entry will contain an index into >>> the remapping table, and it's in such remapping table where the MSI vector >>> and destination CPU is stored. As such, when using interrupt remapping, >>> changes to the interrupt affinity shouldn't result in changes to the MSI >>> entry, and the MSI entry update can be avoided. >>> >>> Signal from the IOMMU update_ire_from_msi hook whether the MSI data or >>> address fields have changed, and thus need writing to the device registers. >>> Such signaling is done by returning 1 from the function. Otherwise >>> returning 0 means no update of the MSI fields, and thus no write >>> required. >>> >>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> >> >> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> >> with two purely cosmetic suggestions and an only loosely related question >> below. >> >>> --- a/xen/arch/x86/hvm/vmx/vmx.c >>> +++ b/xen/arch/x86/hvm/vmx/vmx.c >>> @@ -415,7 +415,9 @@ static int cf_check vmx_pi_update_irte(const struct >>> vcpu *v, >>> >>> ASSERT_PDEV_LIST_IS_READ_LOCKED(msi_desc->dev->domain); >>> >>> - return iommu_update_ire_from_msi(msi_desc, &msi_desc->msg); >>> + rc = iommu_update_ire_from_msi(msi_desc, &msi_desc->msg); >>> + >>> + return rc < 0 ? rc : 0; >> >> Only tangential here, but: Why does this function have a return type of >> non-void, when neither caller cares? > > I'm afraid there's more wrong in vmx_pi_update_irte() that I've just > spotted afterwards. > > vmx_pi_update_irte() passes to iommu_update_ire_from_msi() the > msi_desc->msg field, but that field is supposed to always contain the > non-translated MSI data, as you correctly pointed out in v1 it's > consumed by dump_msi(). vmx_pi_update_irte() using msi_desc->msg to > store the translated MSI effectively breaks dump_msi(). Oh, indeed - it violates what write_msi_msg() specifically checks by an assertion. > Also vmx_pi_update_irte() relies on the IRT index never changing, as > otherwise it's missing any logic to update the MSI registers. Isn't this a valid assumption here? msi_msg_to_remap_entry() will only ever allocate a new index if none was previously allocated. >>> --- a/xen/drivers/passthrough/amd/iommu_intr.c >>> +++ b/xen/drivers/passthrough/amd/iommu_intr.c >>> @@ -492,7 +492,7 @@ static int update_intremap_entry_from_msi_msg( >>> get_ivrs_mappings(iommu->seg)[alias_id].intremap_table); >>> } >>> >>> - return 0; >>> + return !fresh ? 0 : 1; >>> } >> >> Simply >> >> return fresh; >> >> ? >> >>> @@ -546,7 +546,7 @@ int cf_check amd_iommu_msi_msg_update_ire( >>> rc = update_intremap_entry_from_msi_msg(iommu, bdf, nr, >>> &msi_desc->remap_index, >>> msg, &data); >>> - if ( !rc ) >>> + if ( rc > 0 ) >>> { >>> for ( i = 1; i < nr; ++i ) >>> msi_desc[i].remap_index = msi_desc->remap_index + i; >>> --- a/xen/drivers/passthrough/vtd/intremap.c >>> +++ b/xen/drivers/passthrough/vtd/intremap.c >>> @@ -506,6 +506,7 @@ static int msi_msg_to_remap_entry( >>> unsigned int index, i, nr = 1; >>> unsigned long flags; >>> const struct pi_desc *pi_desc = msi_desc->pi_desc; >>> + bool alloc = false; >>> >>> if ( msi_desc->msi_attrib.type == PCI_CAP_ID_MSI ) >>> nr = msi_desc->msi.nvec; >>> @@ -529,6 +530,7 @@ static int msi_msg_to_remap_entry( >>> index = alloc_remap_entry(iommu, nr); >>> for ( i = 0; i < nr; ++i ) >>> msi_desc[i].remap_index = index + i; >>> + alloc = true; >>> } >>> else >>> index = msi_desc->remap_index; >>> @@ -601,7 +603,7 @@ static int msi_msg_to_remap_entry( >>> unmap_vtd_domain_page(iremap_entries); >>> spin_unlock_irqrestore(&iommu->intremap.lock, flags); >>> >>> - return 0; >>> + return alloc ? 1 : 0; >>> } >> >> Like above, simply >> >> return alloc; >> >> ? > > I wasn't sure whether this was overloading the boolean type and > possibly breaking some MISRA rule. I can adjust. What we are to do with Misra's essential type system is entirely unclear at this point, aiui. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |