
Re: [Xen-devel] [PATCH v5 10/10] AMD/IOMMU: restrict interrupt remapping table sizes



On 17.09.2019 17:30, Andrew Cooper wrote:
> On 06/08/2019 14:11, Jan Beulich wrote:
>> There's no point setting up tables with more space than a PCI device can
>> use. For both MSI and MSI-X we can determine how many interrupts could
>> be set up at most. Tables allocated during ACPI table parsing, however,
>> will (for now at least) continue to be set up to have maximum size.
>>
>> Note that until we would want to use sub-page allocations here there's
>> no point checking whether MSI is supported by a device - 1 or up to 32
>> (or actually 128, due to the change effectively using a reserved
>> encoding) IRTEs always mean an order-0 allocation anyway.
> 
> Devices which are not MSI-capable don't need an interrupt remapping
> table at all.

Oh, good point - pin-based interrupts use the respective IO-APIC's
IRTE.

> Per my calculations, the Rome SDP has 62 devices with MSI/MSI-X support,
> and 98 devices which are CPU-internals that have no interrupt support at
> all.
> 
> In comparison, for a production Cascade Lake system I have to hand, the
> stats are 92 non-MSI devices and 18 MSI-capable devices (which isn't a
> valid direct comparison due to how VT-d's remapping tables work, but is
> a datapoint on "similar looking systems").
> 
> I'm happy to leave "no IRT's for non-capable devices" for future work,
> but at the very least, I don't think the commit message wants phrasing
> in exactly this way.

I think it would be better to correct this right away, before the
patch goes in. I don't expect it to be much of a change to add an
MSI capability check next to the MSI-X one.
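Roughly (untested, and the exact helpers / fields would want double
checking against the tree, but to illustrate):

    unsigned int nr = 0;

    if ( pdev->msix )
        nr = pdev->msix->nr_entries;
    else if ( pci_find_cap_offset(pdev->seg, pdev->bus,
                                  PCI_SLOT(pdev->devfn),
                                  PCI_FUNC(pdev->devfn),
                                  PCI_CAP_ID_MSI) )
        nr = multi_msi_capable(~0); /* 128, due to the reserved encoding */

    if ( !nr )
        return; /* neither MSI nor MSI-X capable -> no IRT at all */

    /* ... size the IRT for nr entries ... */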

>> --- a/xen/drivers/passthrough/amd/iommu_init.c
>> +++ b/xen/drivers/passthrough/amd/iommu_init.c
>> @@ -1315,11 +1317,8 @@ static int __init amd_iommu_setup_device
>>              }
>>  
>>              amd_iommu_set_intremap_table(
>> -                dte,
>> -                ivrs_mappings[bdf].intremap_table
>> -                ? virt_to_maddr(ivrs_mappings[bdf].intremap_table)
>> -                : 0,
>> -                iommu_intremap);
>> +                dte, ivrs_mappings[bdf].intremap_table,
>> +                ivrs_mappings[bdf].iommu, iommu_intremap);
> 
> Ah - half of this looks like it wants to be in patch 6, rather than here.

Hmm, which half? I don't see anything misplaced here. The signature
of amd_iommu_set_intremap_table() changes only in this patch, not in
patch 6.

>> @@ -80,17 +81,13 @@ unsigned int nr_ioapic_sbdf;
>>  
>>  static void dump_intremap_tables(unsigned char key);
>>  
>> -static unsigned int __init intremap_table_order(const struct amd_iommu *iommu)
>> -{
>> -    return iommu->ctrl.ga_en
>> -           ? get_order_from_bytes(INTREMAP_MAX_ENTRIES * sizeof(union irte128))
>> -           : get_order_from_bytes(INTREMAP_MAX_ENTRIES * sizeof(union irte32));
>> -}
>> +#define intremap_page_order(irt) PFN_ORDER(virt_to_page(irt))
> 
> What makes the frameable order field safe to use?  It reaches into
> (pg)->v.free.order which fairly obviously isn't safe for allocated pages.

The same argument that allows xmalloc_whole_pages() and xfree() to
use this field: It is the owner of a page who controls how the
individual sub-fields of a union get used. As long as v.inuse and
v.sh don't get used, (ab)using v.free for an allocated page is
quite fine.
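
I.e. the same pattern as (roughly, from memory, error handling left
out):

    void *p = alloc_xenheap_pages(order, 0);

    /* The allocation owner records the order in struct page_info ... */
    PFN_ORDER(virt_to_page(p)) = order;

    ...

    /* ... and recovers it again when freeing the allocation. */
    free_xenheap_pages(p, PFN_ORDER(virt_to_page(p)));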

> virt_to_page() is a non-trivial calculation, which is now used in a
> large number of circumstances.  I don't have an easy judgement of
> whether they are hotpaths, but surely it would be easier to just store
> another unsigned int per device.

Except this would be a vendor-specific field in a supposedly
vendor-agnostic structure. I'm not very keen to add such a field.
Also I don't think interrupt setup/teardown paths would normally
be considered "hot".

What I could be talked into (to limit code size in case the
compiler doesn't expand things inline overly aggressively) is to
make this a static function instead of a macro.
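I.e. something like

    static unsigned int intremap_page_order(const void *irt)
    {
        return PFN_ORDER(virt_to_page(irt));
    }

presumably with a trivial counterpart to record the order on the
allocation path.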

> Furthermore, it would work around a preexisting issue where we can
> allocate beyond the number of interrupts for the device, up to the next
> order boundary.

Is this really an "issue", rather than just an irrelevant side
effect (which is never going to be hit as long as other layers
work correctly, in that they bound requests appropriately)?
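
(To put numbers on it: with 128-bit IRTEs, i.e. 16 bytes each, a device
advertising e.g. 40 MSI-X entries would still get an order-0 4k page,
which has room for 256 IRTEs. Entries 40...255 exist, but a correctly
bounding caller will never ask for them.)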

Jan
