Xen project Mailing List

Re: [Xen-devel] [xen-unstable test] 145393: regressions - FAIL

To: "Tian, Kevin" <kevin.tian@xxxxxxxxx>

From: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Mon, 20 Jan 2020 10:10:11 +0000

Authentication-results: esa3.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none; spf=None smtp.pra=roger.pau@xxxxxxxxxx; spf=Pass smtp.mailfrom=roger.pau@xxxxxxxxxx; spf=None smtp.helo=postmaster@xxxxxxxxxxxxxxx

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, osstest service owner <osstest-admin@xxxxxxxxxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>

Delivery-date: Mon, 20 Jan 2020 10:10:30 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Sun, Jan 19, 2020 at 02:36:32AM +0000, Tian, Kevin wrote:
> > From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > Sent: Tuesday, December 31, 2019 11:30 PM
> >
> > On Mon, Dec 30, 2019 at 08:19:23PM +0000, osstest service owner wrote:
> > > flight 145393 xen-unstable real [real]
> > > http://logs.test-lab.xenproject.org/osstest/logs/145393/
> > >
> > > Regressions :-(
> > >
> > > Tests which did not succeed and are blocking,
> > > including tests which could not be run:
> > > test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 145025
> >
> > While da9290639eb5d6ac did fix the vmlaunch error, now the L1 guest
> > seems to loose interrupts:
> >
> > [ 412.127078] NETDEV WATCHDOG: eth0 (e1000): transmit queue 0 timed out
> > [ 412.151837] ------------[ cut here ]------------
> > [ 412.164281] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:320 dev_watchdog+0x252/0x260
> > [ 412.185821] Modules linked in: xen_gntalloc ext4 mbcache jbd2 e1000 sym53c8xx
> > [ 412.204399] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.150+ #1
> > [ 412.223988] Hardware name: Xen HVM domU, BIOS 4.14-unstable 12/30/2019
> > [ 412.241657] task: ffffffff82213480 task.stack: ffffffff82200000
> > [ 412.256979] RIP: e030:dev_watchdog+0x252/0x260
> > [ 412.268444] RSP: e02b:ffff88801fc03e90 EFLAGS: 00010286
> > [ 412.281727] RAX: 0000000000000039 RBX: 0000000000000000 RCX: 0000000000000000
> > [ 412.300097] RDX: ffff88801fc1de70 RSI: ffff88801fc16298 RDI: ffff88801fc16298
> > [ 412.318283] RBP: ffff888006c6e41c R08: 000000000001f066 R09: 000000000000023b
> > [ 412.336540] R10: ffff88801fc1a3f0 R11: ffffffff8287d96d R12: ffff888006c6e000
> > [ 412.354643] R13: 0000000000000000 R14: ffff888006e3ac80 R15: 0000000000000001
> > [ 412.373034] FS: 00007fa05293ecc0(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> > [ 412.393367] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 412.408112] CR2: 00007fd80ff16000 CR3: 000000000ce78000 CR4: 0000000000040660
> > [ 412.426338] Call Trace:
> > [ 412.432747] <IRQ>
> > [ 412.438102] ? dev_deactivate_queue.constprop.33+0x50/0x50
> > [ 412.451896] call_timer_fn+0x2b/0x130
> > [ 412.464208] run_timer_softirq+0x3d8/0x4b0
> > [ 412.474598] ? handle_irq_event_percpu+0x3c/0x50
> > [ 412.486426] __do_softirq+0x116/0x2ce
> > [ 412.495883] irq_exit+0xcd/0xe0
> > [ 412.503999] xen_evtchn_do_upcall+0x27/0x40
> > [ 412.514626] xen_do_hypervisor_callback+0x29/0x40
> > [ 412.526684] </IRQ>
> > [ 412.532252] RIP: e030:xen_hypercall_sched_op+0xa/0x20
> > [ 412.545034] RSP: e02b:ffffffff82203ea0 EFLAGS: 00000246
> > [ 412.558347] RAX: 0000000000000000 RBX: ffffffff82213480 RCX: ffffffff810013aa
> > [ 412.576390] RDX: ffffffff822483e8 RSI: deadbeefdeadf00d RDI: deadbeefdeadf00d
> > [ 412.594580] RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000000
> > [ 412.612831] R10: ffffffff82203e30 R11: 0000000000000246 R12: ffffffff82213480
> > [ 412.630980] R13: 0000000000000000 R14: ffffffff82213480 R15: ffffffff82238e80
> > [ 412.649138] ? xen_hypercall_sched_op+0xa/0x20
> > [ 412.660671] ? xen_safe_halt+0xc/0x20
> > [ 412.670177] ? default_idle+0x23/0x110
> > [ 412.679862] ? do_idle+0x168/0x1f0
> > [ 412.688666] ? cpu_startup_entry+0x14/0x20
> > [ 412.699059] ? start_kernel+0x4c3/0x4cb
> > [ 412.708807] ? xen_start_kernel+0x527/0x530
> > [ 412.720776] Code: cb e9 a0 fe ff ff 0f 0b 4c 89 e7 c6 05 00 d6 c6 00 01 e8 82 89 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 30 fb 01 82 e8 44 e9 a6 ff <0f> 0b e9 58 fe ff ff 0f 1f 80 00 00 00 00 41 57 41 56 41 55 41
> > [ 412.767900] ---[ end trace d9e35c3f725f4b57 ]---
> > [ 412.780193] e1000 0000:00:05.0 eth0: Reset adapter
> >
> > This only happens when L1 is using x2APIC and a guest has been
> > launched (by L1). Prior to launching any guest L1 seems to be fully
> > functional. I'm currently trying to figure out how/when that interrupt
> > is lost, which I bet it's related to the merging of vmcs between L1
> > and L2 done in L0.
> >
> > As a workaround I could disable exposing x2APIC in CPUID when nested
> > virtualization is enabled on Intel.
> >
>
> any progress on this problem? Please let me know if I overlooked a more
> recent mail. possibly it's useful to fully compare the APICv related setting
> in vmcs02 and vmcs12. Alternatively, you may disable all APICv features
> to see whether APICv is the main reason.

Hello,

Yes, found out what was causing the issue, patches are at:

https://lists.xenproject.org/archives/html/xen-devel/2020-01/msg00437.html

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
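
Editorial note: the suggestion in the thread to compare the APICv related settings in vmcs02 and vmcs12 can be illustrated with a minimal sketch. It assumes the pin-based, primary and secondary execution controls of both VMCSs have already been read into a plain struct. The struct, the macro names and apicv_diff() are hypothetical and are not Xen code; only the bit positions are taken from the Intel SDM. It is not the fix referenced in the reply above.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit positions as documented in the Intel SDM; macro names are illustrative. */
#define PIN_POSTED_INTERRUPTS   (UINT32_C(1) << 7)
#define CPU_TPR_SHADOW          (UINT32_C(1) << 21)
#define SEC_VIRT_APIC_ACCESSES  (UINT32_C(1) << 0)
#define SEC_VIRT_X2APIC_MODE    (UINT32_C(1) << 4)
#define SEC_APIC_REG_VIRT       (UINT32_C(1) << 8)
#define SEC_VIRT_INTR_DELIVERY  (UINT32_C(1) << 9)

/* Hypothetical snapshot of three VM-execution control fields of one VMCS. */
struct vmcs_ctls {
    uint32_t pin_based;
    uint32_t cpu_based;
    uint32_t secondary;
};

enum ctl_field { CTL_PIN, CTL_CPU, CTL_SEC };

static uint32_t ctl_get(const struct vmcs_ctls *c, enum ctl_field f)
{
    return f == CTL_PIN ? c->pin_based
         : f == CTL_CPU ? c->cpu_based
         : c->secondary;
}

/* Print every APIC-virtualization bit that differs; return how many differ. */
unsigned int apicv_diff(const struct vmcs_ctls *vmcs12,
                        const struct vmcs_ctls *vmcs02)
{
    static const struct {
        const char *name;
        enum ctl_field field;
        uint32_t mask;
    } bits[] = {
        { "process posted interrupts",    CTL_PIN, PIN_POSTED_INTERRUPTS },
        { "use TPR shadow",               CTL_CPU, CPU_TPR_SHADOW },
        { "virtualize APIC accesses",     CTL_SEC, SEC_VIRT_APIC_ACCESSES },
        { "virtualize x2APIC mode",       CTL_SEC, SEC_VIRT_X2APIC_MODE },
        { "APIC-register virtualization", CTL_SEC, SEC_APIC_REG_VIRT },
        { "virtual-interrupt delivery",   CTL_SEC, SEC_VIRT_INTR_DELIVERY },
    };
    unsigned int i, ndiff = 0;

    for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        int in12 = !!(ctl_get(vmcs12, bits[i].field) & bits[i].mask);
        int in02 = !!(ctl_get(vmcs02, bits[i].field) & bits[i].mask);

        if (in12 != in02) {
            printf("%-28s vmcs12=%d vmcs02=%d\n", bits[i].name, in12, in02);
            ndiff++;
        }
    }
    return ndiff;
}
```

A difference is not necessarily a bug, since L0 may legitimately run L2 with controls other than the ones L1 requested; the output only narrows down which APICv feature to disable first when bisecting.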
