Re: [PATCH for-4.21 03/10] x86/HPET: use single, global, low-priority vector for broadcast IRQ
On 16.10.2025 18:27, Roger Pau Monné wrote:
> On Thu, Oct 16, 2025 at 09:32:04AM +0200, Jan Beulich wrote:
>> @@ -307,15 +309,13 @@ static void cf_check hpet_msi_set_affini
>>      struct hpet_event_channel *ch = desc->action->dev_id;
>>      struct msi_msg msg = ch->msi.msg;
>>  
>> -    msg.dest32 = set_desc_affinity(desc, mask);
>> -    if ( msg.dest32 == BAD_APICID )
>> -        return;
>> +    /* This really is only for dump_irqs(). */
>> +    cpumask_copy(desc->arch.cpu_mask, mask);
>
> If you no longer call set_desc_affinity(), could you adjust the second
> parameter of hpet_msi_set_affinity() to be unsigned int cpu instead of
> a cpumask?

Looks like I could, yes. But then we'd need to split the function, as
it's also used as the .set_affinity hook.

> And here just clear desc->arch.cpu_mask and set the passed CPU.

Which would still better be a cpumask_copy(), just given cpumask_of(cpu)
as input.

>> -    msg.data &= ~MSI_DATA_VECTOR_MASK;
>> -    msg.data |= MSI_DATA_VECTOR(desc->arch.vector);
>> +    msg.dest32 = cpu_mask_to_apicid(mask);
>
> And here you can just use cpu_physical_id().

Right. All of which (up to here; but see below) is perhaps better done
as a separate, follow-on cleanup change.

>>      msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
>>      msg.address_lo |= MSI_ADDR_DEST_ID(msg.dest32);
>> -    if ( msg.data != ch->msi.msg.data || msg.dest32 != ch->msi.msg.dest32 )
>> +    if ( msg.dest32 != ch->msi.msg.dest32 )
>>          hpet_msi_write(ch, &msg);
>
> A further note here, which ties to my comment on the previous patch
> about losing the interrupt during the masked window. If the vector
> is the same across all CPUs, we no longer need to update the MSI data
> field, just the address one, which can be done atomically. We also
> have signaling from the IOMMU whether the MSI fields need writing.

Hmm, yes, we can leverage that, as long as we're willing to make
assumptions here about what exactly iommu_update_ire_from_msi() does:
We'd then rely on not only the original (untranslated) msg->data not
changing, but also the translated one. That looks to hold for both
Intel and AMD, but it's still something we want to be sure we actually
want to make the code dependent upon. (I'm intending to at least add
an assertion to that effect.)

> We can avoid the masking, and the possible drop of interrupts.

Hmm, right. There's nothing wrong with the caller relying on the write
being atomic now. (Really, continuing to use hpet_msi_write() wouldn't
be a problem, as re-writing the low half of HPET_Tn_ROUTE() with the
same value is going to be benign. Unless of course that write was the
source of the extra IRQs I'm seeing.)

Taken together with what you said further up, having
set_channel_irq_affinity() no longer use hpet_msi_set_affinity() as it
is to ...

>> @@ -328,7 +328,7 @@ static hw_irq_controller hpet_msi_type =
>>      .shutdown = hpet_msi_shutdown,
>>      .enable = hpet_msi_unmask,
>>      .disable = hpet_msi_mask,
>> -    .ack = ack_nonmaskable_msi_irq,
>> +    .ack = irq_actor_none,
>>      .end = end_nonmaskable_irq,
>>      .set_affinity = hpet_msi_set_affinity,

... satisfy the use here would then probably be desirable right away.
The little bit that's left of hpet_msi_set_affinity() would then be
open-coded in set_channel_irq_affinity(). Getting rid of the masking
would (hopefully) also get rid of the stray IRQs that I'm observing,
assuming my guess as to their cause is correct.
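For concreteness, roughly what I'd expect the open-coded remainder in
set_channel_irq_affinity() to look like (just a sketch, relying on the
assumption about iommu_update_ire_from_msi() mentioned above):

    /*
     * Sketch only: re-target the fixed vector to ch->cpu.  With msg.data
     * unchanged (also post-translation), only the destination needs
     * updating, and that write can be carried out atomically.
     */
    struct msi_msg msg = ch->msi.msg;

    /* This really is only for dump_irqs(). */
    cpumask_copy(desc->arch.cpu_mask, cpumask_of(ch->cpu));

    msg.dest32 = cpu_physical_id(ch->cpu);
    msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
    msg.address_lo |= MSI_ADDR_DEST_ID(msg.dest32);
    if ( msg.dest32 != ch->msi.msg.dest32 )
        hpet_msi_write(ch, &msg);
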
>> @@ -497,6 +503,7 @@ static void set_channel_irq_affinity(str
>>      spin_lock(&desc->lock);
>>      hpet_msi_mask(desc);
>>      hpet_msi_set_affinity(desc, cpumask_of(ch->cpu));
>> +    per_cpu(vector_irq, ch->cpu)[HPET_BROADCAST_VECTOR] = ch->msi.irq;
>
> I would set the vector table ahead of setting the affinity, in case we
> can drop the mask calls around this block of code.

Isn't there a problematic window either way round? I can make the
change, but I don't see that addressing anything. The new comparator
value will be written later anyway, and interrupts up to that point
aren't of any interest; i.e. it doesn't matter which of the CPUs gets
to handle them.

> I also wonder, do you really need the bind_irq_vector() if you
> manually set the affinity afterwards, and the vector table plus
> desc->arch.cpu_mask are also set here?

At the very least I'd then also need to open-code the setting of
desc->arch.vector and desc->arch.used. Possibly also the setting of the
bit in desc->arch.used_vectors. And strictly speaking also the
trace_irq_mask() invocation.

>> --- a/xen/arch/x86/include/asm/irq-vectors.h
>> +++ b/xen/arch/x86/include/asm/irq-vectors.h
>> @@ -18,6 +18,15 @@
>>  /* IRQ0 (timer) is statically allocated but must be high priority. */
>>  #define IRQ0_VECTOR 0xf0
>>  
>> +/*
>> + * Low-priority (for now statically allocated) vectors, sharing entry
>> + * points with exceptions in the 0x10 ... 0x1f range, as long as the
>> + * respective exception has an error code.
>> + */
>> +#define FIRST_LOPRIORITY_VECTOR 0x10
>> +#define HPET_BROADCAST_VECTOR X86_EXC_AC
>> +#define LAST_LOPRIORITY_VECTOR 0x1f
>
> I wonder if it won't be clearer to simply reserve a vector if the HPET
> is used, instead of hijacking the AC one. It's one vector less, but
> arguably now that we unconditionally use physical destination mode our
> pool of vectors has expanded considerably.

Well, I'd really like to avoid consuming an otherwise usable vector, if
at all possible (as per Andrew's FRED plans, that won't be possible
there anymore).

>> --- a/xen/arch/x86/irq.c
>> +++ b/xen/arch/x86/irq.c
>> @@ -755,8 +755,9 @@ void setup_vector_irq(unsigned int cpu)
>>          if ( !irq_desc_initialized(desc) )
>>              continue;
>>          vector = irq_to_vector(irq);
>> -        if ( vector >= FIRST_HIPRIORITY_VECTOR &&
>> -             vector <= LAST_HIPRIORITY_VECTOR )
>> +        if ( vector <= (vector >= FIRST_HIPRIORITY_VECTOR
>> +                        ? LAST_HIPRIORITY_VECTOR
>> +                        : LAST_LOPRIORITY_VECTOR) )
>>              cpumask_set_cpu(cpu, desc->arch.cpu_mask);
>
> I think this is wrong. The low priority vector used by the HPET will
> only target a single CPU at a time, and hence adding extra CPUs to
> that mask as part of AP bringup is not correct.

I'm not sure about "wrong". It's not strictly necessary for the HPET
one, I expect, but it's generally what would be necessary. For the HPET
one, hpet_msi_set_affinity() replaces the value anyway. (I can add a
sentence to this effect to the description, if that helps.)

Jan
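P.S.: Purely to illustrate the bind_irq_vector() point above, a rough
(and likely incomplete, hence hypothetical) sketch of what open-coding
it would amount to:

    /*
     * Sketch only - this largely re-does what bind_irq_vector() does for
     * us today, and parts of it (e.g. the tracing) live privately in
     * irq.c.
     */
    per_cpu(vector_irq, ch->cpu)[HPET_BROADCAST_VECTOR] = ch->msi.irq;
    desc->arch.vector = HPET_BROADCAST_VECTOR;
    desc->arch.used = IRQ_USED;
    cpumask_copy(desc->arch.cpu_mask, cpumask_of(ch->cpu));
    if ( desc->arch.used_vectors )
        set_bit(HPET_BROADCAST_VECTOR, desc->arch.used_vectors);
    trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, ch->msi.irq,
                   HPET_BROADCAST_VECTOR, cpumask_of(ch->cpu));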