[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 4/9] AMD/IOMMU: introduce 128-bit IRTE non-guest-APIC IRTE format
On 13/06/2019 14:23, Jan Beulich wrote: > This is in preparation of actually enabling x2APIC mode, which requires > this wider IRTE format to be used. > > A specific remark regarding the first hunk changing > amd_iommu_ioapic_update_ire(): This bypass was introduced for XSA-36, > i.e. by 94d4a1119d ("AMD,IOMMU: Clean up old entries in remapping > tables when creating new one"). Other code introduced by that change has > meanwhile disappeared or further changed, and I wonder if - rather than > adding an x2apic_enabled check to the conditional - the bypass couldn't > be deleted altogether. For now the goal is to affect the non-x2APIC > paths as little as possible. > > Take the liberty and use the new "fresh" flag to suppress an unneeded > flush in update_intremap_entry_from_ioapic(). What is the meaning of fresh? Wouldn't "needs_update" be a more descriptive name? > > Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> > --- > Note that AMD's doc says Lowest Priority ("Arbitrated" by their naming) > mode is unavailable in x2APIC mode, but they've confirmed this to be a > mistake on their part. > > --- a/xen/drivers/passthrough/amd/iommu_intr.c > +++ b/xen/drivers/passthrough/amd/iommu_intr.c > @@ -35,12 +35,34 @@ struct irte_basic { > unsigned int :8; > }; > > +struct irte_full { > + unsigned int remap_en:1; > + unsigned int sup_io_pf:1; > + unsigned int int_type:3; > + unsigned int rq_eoi:1; > + unsigned int dm:1; > + unsigned int guest_mode:1; /* MBZ */ /* MBZ - not implemented yet. */ Seeing as interrupt posting will be a minor tweak to this data structure, rather than implementing a new one. > + unsigned int dest_lo:24; > + unsigned int :32; > + unsigned int vector:8; > + unsigned int :24; > + unsigned int :24; > + unsigned int dest_hi:8; The manual says that we should prefer aligned 64bit access, so some raw_{lo,hi} fields here will allow... > @@ -136,7 +170,21 @@ static void free_intremap_entry(unsigned > { > union irte_ptr entry = get_intremap_entry(seg, bdf, offset); > > - *entry.basic = (struct irte_basic){}; > + switch ( irte_mode ) > + { > + case irte_basic: > + *entry.basic = (struct irte_basic){}; > + break; > + > + case irte_full: > + entry.full->remap_en = 0; > + wmb(); ... this to become entry._128->raw_lo = 0; smp_wmb(); entry._128->raw_hi = 0; The interrupt mapping table is allocated in WB memory and accessed coherently, so an sfence instruction isn't necessary. All that matters is that remap_en gets cleared first. > + *entry.full = (struct irte_full){}; > + break; > + > + default: > + ASSERT_UNREACHABLE(); > + } > > __clear_bit(offset, get_ivrs_mappings(seg)[bdf].intremap_inuse); > } > @@ -154,8 +202,38 @@ static void update_intremap_entry(union > .dest = dest, > .vector = vector, > }; > + struct irte_full full = { > + .remap_en = 1, > + .sup_io_pf = 0, > + .int_type = int_type, > + .rq_eoi = 0, > + .dm = dest_mode, > + .dest_lo = dest, > + .dest_hi = dest >> 24, > + .vector = vector, > + }; Looking at the resulting code after this patch, I think these structures should move into their respective case blocks, to help the compiler to avoid initialising both. > + > + switch ( irte_mode ) > + { > + __uint128_t ret; > + union { > + __uint128_t raw; > + struct irte_full full; > + } old; > + > + case irte_basic: > + *entry.basic = basic; > + break; > + > + case irte_full: > + old.full = *entry.full; > + ret = cmpxchg16b(entry.full, &old, &full); > + ASSERT(ret == old.raw); Similarly, this can be implemented with entry.full->remap_en = 0; smp_wmb(); entry._128->raw_hi = full.raw_hi; smp_wmb(); entry._128->raw_lo = full.raw_lo; which avoids using a locked operation. > + break; > > - *entry.basic = basic; > + default: > + ASSERT_UNREACHABLE(); > + } > } > > static inline int get_rte_index(const struct IO_APIC_route_entry *rte) > @@ -169,6 +247,11 @@ static inline void set_rte_index(struct > rte->delivery_mode = offset >> 8; > } > > +static inline unsigned int get_full_dest(const struct irte_full *entry) > +{ > + return entry->dest_lo | (entry->dest_hi << 24); Given your observation on my still-not-upstream-yet patch about GCC not honouring the type of bitfields, doesn't dest_hi need explicitly casting to unsigned int before the shift, to avoid it being performed as int? > @@ -280,6 +392,18 @@ int __init amd_iommu_setup_ioapic_remapp > dest_mode = rte.dest_mode; > dest = rte.dest.logical.logical_dest; > > + if ( iommu->ctrl.xt_en ) > + { > + /* > + * In x2APIC mode we have no way of discovering the high 24 > + * bits of the destination of an already enabled interrupt. Yes we do. We read the interrupt remapping table, which is where that information lives. Any firmware driver which is following the spec won't have let an IRTE get cached, then played with the table without appropriate flushing. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |