
Re: [linux-linus test] 183794: regressions - FAIL


  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • From: Juergen Gross <jgross@xxxxxxxx>
  • Date: Thu, 23 Nov 2023 06:57:21 +0100
  • Cc: osstest service owner <osstest-admin@xxxxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Julien Grall <julien@xxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, oleksandr_tyshchenko@xxxxxxxx
  • Delivery-date: Thu, 23 Nov 2023 05:57:41 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 23.11.23 00:07, Stefano Stabellini wrote:
On Wed, 22 Nov 2023, Juergen Gross wrote:
On 22.11.23 04:07, Stefano Stabellini wrote:
On Mon, 20 Nov 2023, Stefano Stabellini wrote:
On Mon, 20 Nov 2023, Juergen Gross wrote:
On 20.11.23 03:21, osstest service owner wrote:
flight 183794 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183794/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
    test-arm64-arm64-examine      8 reboot                   fail REGR. vs. 183766

I'm seeing the following in the serial log:

Nov 20 00:25:41.586712 [    0.567318] kernel BUG at arch/arm64/xen/../../arm/xen/enlighten.c:164!
Nov 20 00:25:41.598711 [    0.574002] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP

The related source code lines in the kernel are:

        err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, xen_vcpu_nr(cpu),
                                 &info);
        BUG_ON(err);

I suspect commit 20f3b8eafe0ba to be the culprit.

Stefano, could you please have a look?

The good news and bad news is that I cannot repro this either with or
without CONFIG_UNMAP_KERNEL_AT_EL0. I looked at commit 20f3b8eafe0ba but
I cannot see anything wrong with it. Looking at the register dump, from:

x0 : fffffffffffffffa

I am guessing the error was -ENXIO which is returned from map_guest_area
in Xen.
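
The guess above can be checked mechanically: a minimal sketch, assuming the value in x0 is the hypercall's return code and that the host's ENXIO matches the value Xen uses (6 on Linux). The helper names are hypothetical and exist only for illustration.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Reinterpret a raw 64-bit register value as a signed return code.
 * The unsigned-to-signed conversion is implementation-defined in C,
 * but yields the two's complement value on all mainstream compilers.
 */
static inline int64_t reg_to_retval(uint64_t reg)
{
	return (int64_t)reg;
}

/* Return nonzero if the register value encodes -ENXIO. */
static inline int reg_is_enxio(uint64_t reg)
{
	return reg_to_retval(reg) == -(int64_t)ENXIO;
}
```

Calling reg_is_enxio(0xfffffffffffffffaULL) tells whether the dumped x0 value corresponds to -ENXIO on the build host.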

Could it be that the struct is crossing a page boundary? Or that it is
not 64-bit aligned? Do we need to do something like the following?

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 9afdc4c4a5dc..5326070c5dc0 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -484,7 +485,7 @@ static int __init xen_guest_init(void)
         * for secondary CPUs as they are brought up.
         * For uniformity we use VCPUOP_register_vcpu_info even on cpu0.
         */
-       xen_vcpu_info = alloc_percpu(struct vcpu_info);
+       xen_vcpu_info = __alloc_percpu(sizeof(struct vcpu_info), PAGE_SIZE);
        if (xen_vcpu_info == NULL)
                return -ENOMEM;

May I suggest using a smaller alignment? What about:

1 << fls(sizeof(struct vcpu_info) - 1)
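
To illustrate why this alignment should be enough, here is a userspace sketch, not kernel code. fls_sketch() is a hypothetical stand-in for the kernel's fls(), and the sizes and page size passed in are assumptions chosen by the caller. The expression rounds the size up to the next power of two; an object aligned to a power of two that is at least its own size, and no larger than a page, cannot straddle a page boundary.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for the kernel's fls(): 1-based index of the most
 * significant set bit, or 0 when x is 0.
 */
static int fls_sketch(unsigned int x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* Smallest power of two that is >= size; size must be at least 1. */
static size_t natural_align(size_t size)
{
	assert(size >= 1);
	return (size_t)1 << fls_sketch((unsigned int)(size - 1));
}

/* Nonzero if the byte range [addr, addr + size) spans two pages. */
static int crosses_page(unsigned long addr, size_t size,
			unsigned long page_size)
{
	return (addr / page_size) != ((addr + size - 1) / page_size);
}
```

For example, with a hypothetical 64-byte structure and 4096-byte pages, an object placed at offset 4064 crosses into the next page, while any offset that is a multiple of natural_align(64) does not.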

See below

---
[PATCH] arm/xen: fix xen_vcpu_info allocation alignment

xen_vcpu_info is a percpu area that needs to be mapped by Xen.
Currently, it could cross a page boundary resulting in Xen being unable
to map it:

[    0.567318] kernel BUG at arch/arm64/xen/../../arm/xen/enlighten.c:164!
[    0.574002] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP

Fix the issue by using __alloc_percpu and requesting alignment for the
memory allocation.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxx>

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 9afdc4c4a5dc..09eb74a07dfc 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -484,7 +484,8 @@ static int __init xen_guest_init(void)
         * for secondary CPUs as they are brought up.
         * For uniformity we use VCPUOP_register_vcpu_info even on cpu0.
         */
-       xen_vcpu_info = alloc_percpu(struct vcpu_info);
+       xen_vcpu_info = __alloc_percpu(sizeof(struct vcpu_info),
+                                              1 << fls(sizeof(struct vcpu_info) - 1));

Nit: one tab less, please (can be fixed while committing).

        if (xen_vcpu_info == NULL)
                return -ENOMEM;

Reviewed-by: Juergen Gross <jgross@xxxxxxxx>


Juergen
