Xen project Mailing List

Re: x86/HVM: Linux'es apic_pending_intr_clear() warns about stale IRR

To: Jan Beulich <jbeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>

Date: Mon, 31 Oct 2022 18:37:01 +0000

Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>

Delivery-date: Mon, 31 Oct 2022 18:37:26 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thread-topic: x86/HVM: Linux'es apic_pending_intr_clear() warns about stale IRR

On 31/10/2022 15:55, Jan Beulich wrote:
> Hello,
>
> quite likely this isn't new, but I've ended up noticing it only recently:
> On an oldish system where I hand a HVM guest an SR-IOV NIC (not sure yet
> whether that actually matters) all APs have that warning issued, with all
> reported values zero except for the very first IRR one - that's 00080000.
> Which is suspicious by itself, for naming vector 0x13, i.e. below 0x20
> and hence within CPU exception range.

To be clear, these are the VM's APs?

> For one I wonder about their logic: The function is called after setting
> TPR to 0x10, which prevents the handling of vectors below 0x20 (and in
> particular their propagation from ISR to IRR, if my understanding of the
> process is right and the convoluted and imo partly incomplete SDM
> description hasn't confused me). Plus the function runs when IRQs are
> still off, which is another reason why nothing would ever propagate from
> IRR to ISR while the function performs it work. Yet a comment there says
>
>     /*
>      * If the ISR map is not empty. ACK the APIC and run another round
>      * to verify whether a pending IRR has been unblocked and turned
>      * into a ISR.
>      */
>
> suggesting IRR bits could "promote" to ISR ones. And this, to me, is the
> only justification for warning about leftover IRR bits (whereas I
> certainly agree that the logic should result in all clear ISR bits, and
> hence warning when one is still set is appropriate).

Both the SDM and APM are fairly clear that IRR only moves to ISR when the
core accepts the interrupt. So I agree that nothing in IRR will actually
move to ISR as described by the comment.

> And then I got puzzled by our logic: vlapic_get_ppr() is called only by
> vlapic_set_ppr(), vlapic_lowest_prio(), and vlapic_read_aligned(). Yet
> in particular not by vlapic_has_pending_irq(). While it looks like we
> don't really ignore TPR during delivery, this appears to be a strange
> split approach: hvm_interrupt_blocked() checks TPR, whereas
> vlapic_has_pending_irq() checks ISR. I wonder if subtle issues can't
> result from that ...

This is precisely why I want the fine grain settings for APIC acceleration.
I know for certain there's at least one bug here, because it still causes
Windows to explode on migrate.

> Of course I'm yet to figure out how IRR bit 0x13 ends up being set in
> the first place.

0x13 is a legal vector for incoming interrupts (for reasons of Windows
using 0x1f for self-IPI.) Exception wise, it's #XF, which isn't very common.

Xen could in principle have had an event delivery type error and tried to
deliver an exception as an IRQ, and I don't think any of the safety
assertions in hvm_inject_event() would have triggered in this case.

I don't expect Linux will have deliberately IPI'd itself with that vector,
but I suppose it's not impossible if it constructed an IPI from a badly
initialised variable.

Alternatively, that vector is in the PIC's default range, so we could have
an emulation issue there.

If I were you, I'd ensure the VM has 4 or fewer vCPUs, and set up %dr
pointing at ISR[31:0] in the regs page. That will catch whatever happens to
be writing into ISR, and the backtrace will probably be interesting. Just
make sure you've disabled interrupt posting first, because that is the one
source that will bypass the debugregs.

~Andrew
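
For reference, the vector arithmetic discussed in the thread (an IRR[31:0] value of 00080000 naming vector 0x13, and a TPR of 0x10 masking everything below 0x20) can be sketched in C. This is an illustrative sketch only, not code from Linux or Xen: the helper names lowest_vector() and blocked_by_tpr() are hypothetical, and the priority check deliberately ignores the ISR contribution to PPR.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the vector number of the lowest set bit in
 * one 32-bit IRR/ISR word. word_index 0 covers vectors 0x00-0x1f,
 * word_index 1 covers 0x20-0x3f, and so on. Returns -1 if no bit is set.
 */
static int lowest_vector(uint32_t word, unsigned int word_index)
{
    for (unsigned int bit = 0; bit < 32; bit++)
        if (word & (UINT32_C(1) << bit))
            return (int)(word_index * 32 + bit);

    return -1;
}

/*
 * Hypothetical helper: a vector is held back when its priority class
 * (vector[7:4]) is not strictly greater than the class in TPR[7:4].
 * Simplification: in-service (ISR) vectors, which also feed into PPR,
 * are not considered here.
 */
static int blocked_by_tpr(unsigned int vector, unsigned int tpr)
{
    return (vector >> 4) <= (tpr >> 4);
}
```

Under these assumptions, lowest_vector(0x00080000, 0) yields 0x13, and blocked_by_tpr() reports 0x13 and 0x1f as blocked with TPR set to 0x10, while 0x20 is not.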
