
[PATCH] x86/i8259: do not assume interrupts always target CPU0


  • To: xen-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • Date: Tue, 19 Sep 2023 13:06:37 +0200
  • Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 19 Sep 2023 11:07:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

We have sporadically seen the following during AP bringup, and only on AMD
platforms:

microcode: CPU59 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
microcode: CPU60 updated from revision 0x830104d to 0x830107a, date = 2023-05-17
CPU60: No irq handler for vector 27 (IRQ -2147483648)
microcode: CPU61 updated from revision 0x830107a to 0x830107a, date = 2023-05-17

This is similar to the issue described in Linux commit 36e9e1eab777e, where
(active) i8259 vectors were also observed being delivered to CPUs other than
CPU0.

Adjust the target CPU mask of i8259 interrupt descriptors to contain all
possible CPUs, so that APs will reserve the vector at startup if any legacy IRQ
is still delivered through the i8259.  Note that if the IO-APIC takes over
those interrupt descriptors the CPU mask will be reset.

The spurious i8259 interrupt vectors (IRQ7 and IRQ15), however, can be injected
even when all i8259 pins are masked, and hence need to be handled on all CPUs.
Reserve those vectors so that dynamic interrupt sources cannot be assigned to
them.

Finally, handle spurious i8259 interrupts on all CPUs and adjust the printed
message to display the CPU where the spurious interrupt has been received, so
it looks like:

microcode: CPU1 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
cpu1: spurious 8259A interrupt: IRQ7
microcode: CPU2 updated from revision 0x830104d to 0x830107a, date = 2023-05-17

Fixes: 3fba06ba9f8b ('x86/IRQ: re-use legacy vector ranges on APs')
Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
---
One theory I have is that the APs at some point (before jumping into Xen code)
have the local APIC hardware-disabled, and hence are considered valid targets
by the i8259, but by the time the vector is fetched from the i8259 the
interrupt has either been masked, or already consumed by a different CPU.
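
For illustration only, the vector reservation and the relaxed do_IRQ() filter
can be modelled in a few lines of standalone C.  This is a sketch, not Xen
code: the bitmap, mark_used(), is_used(), alloc_vector() and
is_spurious_legacy() are hypothetical stand-ins, and the base of 0x20 for
FIRST_LEGACY_VECTOR is an assumption (it would be consistent with the
"vector 27" message above, given that the message prints the vector with
%02x).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed values for illustration; check the tree for the real definitions. */
#define NR_VECTORS           256
#define FIRST_LEGACY_VECTOR  0x20
#define LAST_LEGACY_VECTOR   (FIRST_LEGACY_VECTOR + 0xf)
#define LEGACY_VECTOR(irq)   ((irq) + FIRST_LEGACY_VECTOR)

#define BITS_PER_WORD        (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS             ((NR_VECTORS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Toy stand-in for the used_vectors bitmap (hypothetical layout). */
static unsigned long used_vectors[NR_WORDS];

static void mark_used(unsigned int vector)
{
    assert(vector < NR_VECTORS);
    used_vectors[vector / BITS_PER_WORD] |= 1UL << (vector % BITS_PER_WORD);
}

static bool is_used(unsigned int vector)
{
    assert(vector < NR_VECTORS);
    return !!(used_vectors[vector / BITS_PER_WORD] &
              (1UL << (vector % BITS_PER_WORD)));
}

/* Reserve the two vectors on which the i8259 reports spurious interrupts. */
static void reserve_spurious_vectors(void)
{
    mark_used(LEGACY_VECTOR(7));
    mark_used(LEGACY_VECTOR(15));
}

/* Hypothetical allocator: first free vector in [first, last], or -1. */
static int alloc_vector(unsigned int first, unsigned int last)
{
    for ( unsigned int v = first; v <= last; v++ )
        if ( !is_used(v) )
        {
            mark_used(v);
            return (int)v;
        }

    return -1;
}

/*
 * Model of the filter after the patch: any CPU may treat a legacy vector
 * as spurious when the PIC reports it as bogus (no CPU0-only condition).
 */
static bool is_spurious_legacy(unsigned int vector, bool pic_says_bogus)
{
    return vector >= FIRST_LEGACY_VECTOR && vector <= LAST_LEGACY_VECTOR &&
           pic_says_bogus;
}
```

With the two vectors reserved up front, an allocator of this shape skips them,
so a spurious interrupt arriving on one of them cannot be confused with a
device interrupt bound to the same vector.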
---
 xen/arch/x86/i8259.c | 18 ++++++++++++++++--
 xen/arch/x86/irq.c   |  9 ++++++++-
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/i8259.c b/xen/arch/x86/i8259.c
index ed9f55abe51e..ad3bca9895d0 100644
--- a/xen/arch/x86/i8259.c
+++ b/xen/arch/x86/i8259.c
@@ -222,7 +222,8 @@ static bool _mask_and_ack_8259A_irq(unsigned int irq)
         is_real_irq = false;
         /* Report spurious IRQ, once per IRQ line. */
         if (!(spurious_irq_mask & irqmask)) {
-            printk("spurious 8259A interrupt: IRQ%d.\n", irq);
+            printk("cpu%u: spurious 8259A interrupt: IRQ%u\n",
+                   smp_processor_id(), irq);
             spurious_irq_mask |= irqmask;
         }
         /*
@@ -349,7 +350,20 @@ void __init init_IRQ(void)
             continue;
         desc->handler = &i8259A_irq_type;
         per_cpu(vector_irq, cpu)[LEGACY_VECTOR(irq)] = irq;
-        cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
+
+        /*
+         * The interrupt affinity logic never targets interrupts to offline
+         * CPUs, hence it's safe to use cpumask_all here.
+         *
+         * Legacy PIC interrupts are only targeted to CPU0, but depending on
+         * the platform they can be distributed to any online CPU in hardware.
+         * Xen has no influence on that. So all active legacy vectors
+         * must be installed on all CPUs.
+         *
+         * IO-APIC will change the destination mask if/when taking ownership of
+         * the interrupt.
+         */
+        cpumask_copy(desc->arch.cpu_mask, &cpumask_all);
         desc->arch.vector = LEGACY_VECTOR(irq);
     }
     
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 6abfd8162120..2379fdda3a7e 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -466,6 +466,14 @@ int __init init_irq_data(void)
           vector++ )
         __set_bit(vector, used_vectors);
 
+    /*
+     * Mark i8259 spurious vectors as used to avoid (re)using them.  Otherwise
+     * it won't be possible to distinguish between device triggered interrupts
+     * or spurious i8259 ones.
+     */
+    __set_bit(LEGACY_VECTOR(7), used_vectors);
+    __set_bit(LEGACY_VECTOR(15), used_vectors);
+
     return 0;
 }
 
@@ -1920,7 +1928,6 @@ void do_IRQ(struct cpu_user_regs *regs)
                 kind = "";
             if ( !(vector >= FIRST_LEGACY_VECTOR &&
                    vector <= LAST_LEGACY_VECTOR &&
-                   !smp_processor_id() &&
                    bogus_8259A_irq(vector - FIRST_LEGACY_VECTOR)) )
             {
                 printk("CPU%u: No irq handler for vector %02x (IRQ %d%s)\n",
-- 
2.42.0




 

