[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v1] drivers: Change amd_iommu struct to contain pci_sbdf_t, simplify code
On Thu, 13 Mar 2025 at 20:16, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote: > > On 13/03/2025 6:57 pm, Andrii Sultanov wrote: > > Following on from 250d87dc, struct amd_iommu has its seg and bdf fields > > backwards with relation to pci_sbdf_t. Swap them around, and simplify the > > expressions regenerating an sbdf_t from seg+bdf. > > > > Simplify ioapic_sbdf and bpet_sbdf along the way. Adjust functions > > taking seg and bdf fields of these structs to take pci_sbdf_t instead. > > Simplify comparisons similarly. > > > > Bloat-o-meter reports: > > > > ``` > > add/remove: 0/0 grow/shrink: 13/21 up/down: 352/-576 (-224) > > Function old new delta > > _einittext 22028 22220 +192 > > amd_iommu_prepare 853 897 +44 > > _invalidate_all_devices 133 154 +21 > > amd_iommu_domain_destroy 46 63 +17 > > disable_fmt 12336 12352 +16 > > _acpi_module_name 416 432 +16 > > amd_iommu_get_reserved_device_memory 521 536 +15 > > parse_ivrs_table 3955 3966 +11 > > amd_iommu_assign_device 271 282 +11 > > find_iommu_for_device 242 246 +4 > > get_intremap_requestor_id 49 52 +3 > > amd_iommu_free_intremap_table 360 361 +1 > > allocate_domain_resources 82 83 +1 > > reassign_device 843 838 -5 > > amd_iommu_remove_device 352 347 -5 > > amd_iommu_flush_iotlb 359 354 -5 > > iov_supports_xt 270 264 -6 > > amd_iommu_setup_domain_device 1478 1472 -6 > > amd_setup_hpet_msi 232 224 -8 > > amd_iommu_ioapic_update_ire 572 564 -8 > > _hvm_dpci_msi_eoi 157 149 -8 > > amd_iommu_msi_enable 33 20 -13 > > register_range_for_device 297 281 -16 > > amd_iommu_add_device 856 839 -17 > > update_intremap_entry_from_msi_msg 879 861 -18 > > amd_iommu_read_ioapic_from_ire 347 323 -24 > > amd_iommu_msi_msg_update_ire 472 431 -41 > > flush_command_buffer 460 417 -43 > > set_iommu_interrupt_handler 421 377 -44 > > amd_iommu_detect_one_acpi 918 868 -50 > > amd_iommu_get_supported_ivhd_type 86 31 -55 > > iterate_ivrs_mappings 169 113 -56 > > parse_ivmd_block 1339 1271 -68 > > enable_iommu 1745 1665 -80 > > ``` > > > > Resolves: https://gitlab.com/xen-project/xen/-/issues/198 > > > > Reported-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > > Signed-off-by: Andrii Sultanov <sultanovandriy@xxxxxxxxx> > > Well, this is awkward. This is the task I'd put together for Cody to > try. I guess I have to find another one. My apologies! Just noticed it wasn't claimed on Gitlab, probably should have pinged you first. > In commit messages, we always want the subject alongside a hash. I have > this local alias to help: > > > xen.git/xen$ git commit-str 250d87dc > > commit 250d87dc3ff9 ("x86/msi: Change __msi_set_enable() to take > > pci_sbdf_t") > > xen.git/xen$ git help commit-str > > 'commit-str' is aliased to 'log -1 --abbrev=12 --pretty=format:'commit > > %h ("%s")'' > > (The name is not imaginative, and could probably be better.) Thanks, will amend. > > @@ -239,17 +239,17 @@ static int __init register_range_for_device( > > unsigned int bdf, paddr_t base, paddr_t limit, > > bool iw, bool ir, bool exclusion) > > { > > - int seg = 0; /* XXX */ > > - struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(seg); > > + pci_sbdf_t sbdf = { .seg = 0, .bdf = bdf }; > > The /* XXX */ wants to stay. It's highlighting that this code isn't > muti-segment aware (yet). Will do. > > + struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(sbdf.seg); > > struct amd_iommu *iommu; > > u16 req; > > int rc = 0; > > > > - iommu = find_iommu_for_device(seg, bdf); > > + iommu = find_iommu_for_device(sbdf); > > if ( !iommu ) > > { > > AMD_IOMMU_WARN("IVMD: no IOMMU for device %pp - ignoring > > constrain\n", > > - &PCI_SBDF(seg, bdf)); > > + &(sbdf)); > > The brackets should be dropped now. This should be just &sbdf. Will do > > @@ -1543,14 +1540,14 @@ static void invalidate_all_domain_pages(void) > > static int cf_check _invalidate_all_devices( > > u16 seg, struct ivrs_mappings *ivrs_mappings) > > { > > - unsigned int bdf; > > + pci_sbdf_t sbdf = { .seg = seg, .bdf = 0 }; > > u16 req_id; > > struct amd_iommu *iommu; > > > > - for ( bdf = 0; bdf < ivrs_bdf_entries; bdf++ ) > > + for ( /* sbdf.bdf = 0 */ ; sbdf.bdf < ivrs_bdf_entries; sbdf.bdf++ ) > > { > > - iommu = find_iommu_for_device(seg, bdf); > > - req_id = ivrs_mappings[bdf].dte_requestor_id; > > + iommu = find_iommu_for_device(sbdf); > > + req_id = ivrs_mappings[sbdf.bdf].dte_requestor_id; > > See how bloat-o-meter reports this as the 3rd most increased function. > This is an example where merging to a pci_sbdf_t local variable is > making things worse. > > Keep the bdf local variable, and use PCI_SBDF() for the call to > find_iommu_for_device(). > > The reason is that you're now modifying the low uint16_t half of a > uint32_t. This requires emitting 16-bit logic (requires the Operand > Size Override prefix, contributing to your code size), it also suffers > register merge penalty This particular example would be equivalent: add/remove: 0/0 grow/shrink: 2/2 up/down: 24/-24 (0) Function old new delta _invalidate_all_devices 138 154 +16 build_info 744 752 +8 __mon_lengths 2936 2928 -8 iterate_ivrs_mappings 129 113 -16 Will drop the hunk anyhow to reduce the size of diff. Thanks! > > diff --git a/xen/drivers/passthrough/amd/iommu_intr.c > > b/xen/drivers/passthrough/amd/iommu_intr.c > > index 9abdc38053..0c91125ec0 100644 > > --- a/xen/drivers/passthrough/amd/iommu_intr.c > > +++ b/xen/drivers/passthrough/amd/iommu_intr.c > > @@ -123,10 +123,10 @@ static spinlock_t* get_intremap_lock(int seg, int > > req_id) > > &shared_intremap_lock); > > } > > > > -static int get_intremap_requestor_id(int seg, int bdf) > > +static int get_intremap_requestor_id(pci_sbdf_t sbdf) > > { > > - ASSERT( bdf < ivrs_bdf_entries ); > > - return get_ivrs_mappings(seg)[bdf].dte_requestor_id; > > + ASSERT( sbdf.bdf < ivrs_bdf_entries ); > > + return get_ivrs_mappings(sbdf.seg)[sbdf.bdf].dte_requestor_id; > > This is also an example where merging the parameter is not necessarily > wise. The segment and bdf parts are used differently, so this function > now has to split the one parameter in two, and shift segment by 16 just > to use it. Will do: add/remove: 0/0 grow/shrink: 4/6 up/down: 32/-64 (-32) > > @@ -335,20 +336,19 @@ void cf_check amd_iommu_ioapic_update_ire( > > new_rte.raw = rte; > > > > /* get device id of ioapic devices */ > > - bdf = ioapic_sbdf[idx].bdf; > > - seg = ioapic_sbdf[idx].seg; > > - iommu = find_iommu_for_device(seg, bdf); > > + sbdf.sbdf = ioapic_sbdf[idx].sbdf.sbdf; > > sbdf = ioapic_sbdf[idx].sbdf; > > > + iommu = find_iommu_for_device(sbdf); > > if ( !iommu ) > > { > > AMD_IOMMU_WARN("failed to find IOMMU for IO-APIC @ %04x:%04x\n", > > - seg, bdf); > > + sbdf.seg, sbdf.bdf); > > This should be converted to %pp, which has a side effect of correcting > the rendering of bdf. > > > @@ -515,15 +515,15 @@ int cf_check amd_iommu_msi_msg_update_ire( > > struct msi_desc *msi_desc, struct msi_msg *msg) > > { > > struct pci_dev *pdev = msi_desc->dev; > > - int bdf, seg, rc; > > + pci_sbdf_t sbdf; > > + int rc; > > struct amd_iommu *iommu; > > unsigned int i, nr = 1; > > u32 data; > > > > - bdf = pdev ? pdev->sbdf.bdf : hpet_sbdf.bdf; > > - seg = pdev ? pdev->seg : hpet_sbdf.seg; > > + sbdf.sbdf = pdev ? pdev->sbdf.sbdf : hpet_sbdf.sbdf.sbdf; > > This is a better example where > > sbdf = pdev ? pdev->sbdf : hpet_sbdf.sbdf; > > is equivalent. > > > diff --git a/xen/drivers/passthrough/amd/iommu_map.c > > b/xen/drivers/passthrough/amd/iommu_map.c > > index dde393645a..17070904fa 100644 > > --- a/xen/drivers/passthrough/amd/iommu_map.c > > +++ b/xen/drivers/passthrough/amd/iommu_map.c > > @@ -694,17 +694,16 @@ int amd_iommu_reserve_domain_unity_unmap(struct > > domain *d, > > int cf_check amd_iommu_get_reserved_device_memory( > > iommu_grdm_t *func, void *ctxt) > > { > > - unsigned int seg = 0 /* XXX */, bdf; > > - const struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(seg); > > + pci_sbdf_t sbdf = {0}; > > "= {}" please. > > GCC has just introduced a nasty bug (they claim its a feature) where {0} > on unions now zeros less than it used to do. pci_sbdf_t doesn't tickle > this corner case, but we need to be proactive about removing examples of > "= {0}". Will amend with all of these. > > + const struct ivrs_mappings *ivrs_mappings = > > get_ivrs_mappings(sbdf.seg); > > /* At least for global entries, avoid reporting them multiple times. */ > > enum { pending, processing, done } global = pending; > > > > - for ( bdf = 0; bdf < ivrs_bdf_entries; ++bdf ) > > + for ( /* sbdf.bdf = 0 */ ; sbdf.bdf < ivrs_bdf_entries; ++sbdf.bdf ) > > { > > - pci_sbdf_t sbdf = PCI_SBDF(seg, bdf); > > - const struct ivrs_unity_map *um = ivrs_mappings[bdf].unity_map; > > - unsigned int req = ivrs_mappings[bdf].dte_requestor_id; > > - const struct amd_iommu *iommu = ivrs_mappings[bdf].iommu; > > + const struct ivrs_unity_map *um = > > ivrs_mappings[sbdf.bdf].unity_map; > > + unsigned int req = ivrs_mappings[sbdf.bdf].dte_requestor_id; > > + const struct amd_iommu *iommu = ivrs_mappings[sbdf.bdf].iommu; > > Again, this will be better staying as two split variables. Indeed (on top of the previous similar suggestion): add/remove: 0/0 grow/shrink: 4/9 up/down: 32/-128 (-96) > ~Andrew Thank you!
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |