
Re: [Xen-devel] [xen-unstable test] 145393: regressions - FAIL



> From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Sent: Tuesday, December 31, 2019 11:30 PM
> 
> On Mon, Dec 30, 2019 at 08:19:23PM +0000, osstest service owner wrote:
> > flight 145393 xen-unstable real [real]
> > http://logs.test-lab.xenproject.org/osstest/logs/145393/
> >
> > Regressions :-(
> >
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 145025
> 
> While da9290639eb5d6ac did fix the vmlaunch error, now the L1 guest
> seems to lose interrupts:
> 
> [  412.127078] NETDEV WATCHDOG: eth0 (e1000): transmit queue 0 timed out
> [  412.151837] ------------[ cut here ]------------
> [  412.164281] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:320 dev_watchdog+0x252/0x260
> [  412.185821] Modules linked in: xen_gntalloc ext4 mbcache jbd2 e1000 sym53c8xx
> [  412.204399] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.150+ #1
> [  412.223988] Hardware name: Xen HVM domU, BIOS 4.14-unstable 12/30/2019
> [  412.241657] task: ffffffff82213480 task.stack: ffffffff82200000
> [  412.256979] RIP: e030:dev_watchdog+0x252/0x260
> [  412.268444] RSP: e02b:ffff88801fc03e90 EFLAGS: 00010286
> [  412.281727] RAX: 0000000000000039 RBX: 0000000000000000 RCX: 0000000000000000
> [  412.300097] RDX: ffff88801fc1de70 RSI: ffff88801fc16298 RDI: ffff88801fc16298
> [  412.318283] RBP: ffff888006c6e41c R08: 000000000001f066 R09: 000000000000023b
> [  412.336540] R10: ffff88801fc1a3f0 R11: ffffffff8287d96d R12: ffff888006c6e000
> [  412.354643] R13: 0000000000000000 R14: ffff888006e3ac80 R15: 0000000000000001
> [  412.373034] FS:  00007fa05293ecc0(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> [  412.393367] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  412.408112] CR2: 00007fd80ff16000 CR3: 000000000ce78000 CR4: 0000000000040660
> [  412.426338] Call Trace:
> [  412.432747]  <IRQ>
> [  412.438102]  ? dev_deactivate_queue.constprop.33+0x50/0x50
> [  412.451896]  call_timer_fn+0x2b/0x130
> [  412.464208]  run_timer_softirq+0x3d8/0x4b0
> [  412.474598]  ? handle_irq_event_percpu+0x3c/0x50
> [  412.486426]  __do_softirq+0x116/0x2ce
> [  412.495883]  irq_exit+0xcd/0xe0
> [  412.503999]  xen_evtchn_do_upcall+0x27/0x40
> [  412.514626]  xen_do_hypervisor_callback+0x29/0x40
> [  412.526684]  </IRQ>
> [  412.532252] RIP: e030:xen_hypercall_sched_op+0xa/0x20
> [  412.545034] RSP: e02b:ffffffff82203ea0 EFLAGS: 00000246
> [  412.558347] RAX: 0000000000000000 RBX: ffffffff82213480 RCX: ffffffff810013aa
> [  412.576390] RDX: ffffffff822483e8 RSI: deadbeefdeadf00d RDI: deadbeefdeadf00d
> [  412.594580] RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000000
> [  412.612831] R10: ffffffff82203e30 R11: 0000000000000246 R12: ffffffff82213480
> [  412.630980] R13: 0000000000000000 R14: ffffffff82213480 R15: ffffffff82238e80
> [  412.649138]  ? xen_hypercall_sched_op+0xa/0x20
> [  412.660671]  ? xen_safe_halt+0xc/0x20
> [  412.670177]  ? default_idle+0x23/0x110
> [  412.679862]  ? do_idle+0x168/0x1f0
> [  412.688666]  ? cpu_startup_entry+0x14/0x20
> [  412.699059]  ? start_kernel+0x4c3/0x4cb
> [  412.708807]  ? xen_start_kernel+0x527/0x530
> [  412.720776] Code: cb e9 a0 fe ff ff 0f 0b 4c 89 e7 c6 05 00 d6 c6 00 01 e8 82 89 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 30 fb 01 82 e8 44 e9 a6 ff <0f> 0b e9 58 fe ff ff 0f 1f 80 00 00 00 00 41 57 41 56 41 55 41
> [  412.767900] ---[ end trace d9e35c3f725f4b57 ]---
> [  412.780193] e1000 0000:00:05.0 eth0: Reset adapter
> 
> This only happens when L1 is using x2APIC and a guest has been
> launched (by L1). Prior to launching any guest L1 seems to be fully
> functional. I'm currently trying to figure out how/when that interrupt
> is lost, which I bet is related to the merging of the VMCS between L1
> and L2 done in L0.
> 
> As a workaround I could disable exposing x2APIC in CPUID when nested
> virtualization is enabled on Intel.
> 

Any progress on this problem? Please let me know if I overlooked a more
recent mail. It may be useful to fully compare the APICv-related settings
in vmcs02 and vmcs12. Alternatively, you could disable all APICv features
to see whether APICv is the main cause.
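
To make the comparison concrete, here is a rough sketch of what I mean.
It is only an illustration: the helper name diff_apicv, the dict layout
and the sample control values are all made up, and it assumes you already
have the raw pin-based, primary and secondary execution control words for
both VMCSs from whatever dump you use. Only the bit positions are taken
from the Intel SDM.

```python
# Hypothetical helper: report APICv-related VM-execution control bits that
# differ between vmcs12 and vmcs02. Bit positions follow the Intel SDM;
# everything else (names, dict layout, sample values) is an assumption.
APICV_BITS = {
    "pin_based": {7: "process posted interrupts"},
    "primary": {21: "use TPR shadow"},
    "secondary": {
        0: "virtualize APIC accesses",
        4: "virtualize x2APIC mode",
        8: "APIC-register virtualization",
        9: "virtual-interrupt delivery",
    },
}


def diff_apicv(vmcs12, vmcs02):
    """Return (field, feature, set_in_vmcs12, set_in_vmcs02) per differing bit."""
    diffs = []
    for field, bits in APICV_BITS.items():
        for bit, name in sorted(bits.items()):
            in_12 = bool((vmcs12[field] >> bit) & 1)
            in_02 = bool((vmcs02[field] >> bit) & 1)
            if in_12 != in_02:
                diffs.append((field, name, in_12, in_02))
    return diffs


# Made-up control words, for illustration only (not taken from the failing run).
vmcs12 = {"pin_based": 0, "primary": 1 << 21, "secondary": 1 << 4}
vmcs02 = {"pin_based": 0, "primary": 1 << 21,
          "secondary": (1 << 4) | (1 << 8) | (1 << 9)}

for field, name, in_12, in_02 in diff_apicv(vmcs12, vmcs02):
    print(f"{field}: {name}: vmcs12={in_12} vmcs02={in_02}")
```

Any bit reported here is a place where L0 runs L2 with a different APICv
setting than L1 asked for, which would be a reasonable first thing to
look at.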

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

