[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH V4 3/3] x86/altp2m: fix display frozen when switching to a new view early
On 11/8/18 8:14 PM, George Dunlap wrote: > On 11/01/2018 02:45 PM, Razvan Cojocaru wrote: > ...here and... > >> + >> int p2m_set_ioreq_server(struct domain *d, >> unsigned int flags, >> struct hvm_ioreq_server *s) >> @@ -994,12 +1033,12 @@ int p2m_change_type_one(struct domain *d, unsigned >> long gfn_l, >> } >> >> /* Modify the p2m type of a range of gfns from ot to nt. */ >> -void p2m_change_type_range(struct domain *d, >> - unsigned long start, unsigned long end, >> - p2m_type_t ot, p2m_type_t nt) >> +static void change_type_range(struct p2m_domain *p2m, >> + unsigned long start, unsigned long end, >> + p2m_type_t ot, p2m_type_t nt) >> { >> unsigned long gfn = start; >> - struct p2m_domain *p2m = p2m_get_hostp2m(d); >> + struct domain *d = p2m->domain; >> int rc = 0; >> >> ASSERT(ot != nt); >> @@ -1052,6 +1091,24 @@ void p2m_change_type_range(struct domain *d, >> p2m_unlock(p2m); >> } >> >> +void p2m_change_type_range(struct domain *d, >> + unsigned long start, unsigned long end, >> + p2m_type_t ot, p2m_type_t nt) >> +{ >> + change_type_range(p2m_get_hostp2m(d), start, end, ot, nt); >> + >> +#ifdef CONFIG_HVM >> + if ( unlikely(altp2m_active(d)) ) >> + { >> + unsigned int i; >> + >> + for ( i = 0; i < MAX_ALTP2M; i++ ) >> + if ( d->arch.altp2m_eptp[i] != mfn_x(INVALID_MFN) ) >> + change_type_range(d->arch.altp2m_p2m[i], start, end, ot, >> nt); >> + } >> +#endif >> +} >> + > > ...here you grab & release each lock separately, inside the update > function. memory_type_changed is probably more or less idempotent, so > won't matter if two different calls race; but it seems likely that if > two p2m_change_type_range() calls happen concurrently, the various > altp2ms will get different results. Is it worth refactoring both of > these so that, like change_entry_type_global, you hold the host p2m lock > while you change the individual altp2m locks? I have changed the code to do all the modifications under hostp2m lock, and on changing the resolution on a host with three altp2ms I get a "Watchdog timer detects that CPU3 is stuck!" hypervisor crash: (XEN) stdvga.c:178:d1v0 leaving stdvga mode (XEN) 1 p2m_lock(hostp2m) (XEN) 1 change_type_range(hostp2m) (XEN) 1 p2m_lock(altp2m) (XEN) 1 change_type_range(altp2m) (XEN) 1 p2m_unlock(altp2m) (XEN) 1 p2m_lock(altp2m) (XEN) 1 change_type_range(altp2m) (XEN) 1 p2m_unlock(altp2m) (XEN) 1 p2m_lock(altp2m) (XEN) 1 change_type_range(altp2m) (XEN) 1 p2m_unlock(altp2m) (XEN) 1 p2m_unlock(hostp2m) (XEN) Watchdog timer detects that CPU3 is stuck! (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]---- (XEN) CPU: 3 (XEN) RIP: e008:[<ffff82d080239107>] vcpu_sleep_sync+0x40/0x71 (XEN) RFLAGS: 0000000000000202 CONTEXT: hypervisor (d0v0) (XEN) rax: 0000000000000001 rbx: ffff8308dc1c3000 rcx: ffff8308dc1c3130 (XEN) rdx: 0000000000000000 rsi: 0000000000000292 rdi: ffff830c529f1010 (XEN) rbp: ffff830c52987c88 rsp: ffff830c52987c78 r8: ffff830ae7a1dfd0 (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000001c80 (XEN) r12: ffff82d0802390c7 r13: ffff8308d5222000 r14: ffff82c000225000 (XEN) r15: 00000000000f0000 cr0: 0000000080050033 cr4: 0000000000372660 (XEN) cr3: 0000000856c56000 cr2: 00007f8d36195000 (XEN) fsb: 00007f8d3fffe8c0 gsb: ffff880276c00000 gss: 0000000000000000 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008 (XEN) Xen code around <ffff82d080239107> (vcpu_sleep_sync+0x40/0x71): (XEN) 01 00 00 00 74 24 f3 90 <8b> 11 48 8b 43 10 8b 80 cc 01 00 00 09 d0 48 98 (XEN) Xen stack trace from rsp=ffff830c52987c78: (XEN) 0000000000000240 ffff8308dc1c3000 ffff830c52987cb8 ffff82d0802070f1 (XEN) ffff830c52987cc8 ffff8308d5222000 0000000000000240 0000000000000048 (XEN) ffff830c52987cc8 ffff82d0802084bd ffff830c52987d38 ffff82d08036861e (XEN) 8c00000000000001 ffff8308d5222648 ffff830c52987d28 00000000000f0000 (XEN) 00007f8d400a7010 0000000000000048 0000000000856c23 0000000000000000 (XEN) ffff830c52987e00 ffffffffffffffea deadbeefdeadf00d ffff82d0802eebef (XEN) ffff830c52987de8 ffff82d0802ee217 01ff830c00000001 ffff82e011b4e6e0 (XEN) ffff830c52987d98 0000000000000000 ffff830c52971000 ffff830c52959000 (XEN) 0000000000000001 ffff830c52971000 ffff830c52987d98 0000000000000007 (XEN) 0000000000000240 00000000000f0000 ffff830c52987fff ffff8308d5222000 (XEN) ffff830c52987dc8 0000000000000002 0000000000000001 00007f8d400af010 (XEN) deadbeefdeadf00d ffff82d0802eebef ffff830c52987e48 ffff82d0802eec73 (XEN) ffff82d08037a3c4 0000000280370001 00007f8d400ae010 0000000000000020 (XEN) 00007f8d400a7010 0000000000000048 ffff82d08037a3c4 ffff830c52987ef8 (XEN) ffff830c52959000 0000000000000029 ffff830c52987ee8 ffff82d080373722 (XEN) 03ff82d08037a3c4 0000000000000001 0000000000000002 00007f8d400af010 (XEN) deadbeefdeadf00d deadbeefdeadf00d ffff82d08037a3c4 ffff82d08037a3b8 (XEN) ffff82d08037a3c4 ffff82d08037a3b8 ffff82d08037a3c4 ffff82d08037a3b8 (XEN) ffff82d08037a3c4 ffff830c52959000 0000000000000000 0000000000000000 (XEN) 0000000000000000 0000000000000000 00007cf3ad6780e7 ffff82d08037a422 (XEN) Xen call trace: (XEN) [<ffff82d080239107>] vcpu_sleep_sync+0x40/0x71 (XEN) [<ffff82d0802070f1>] domain.c#do_domain_pause+0x33/0x4f (XEN) [<ffff82d0802084bd>] domain_pause+0x25/0x27 (XEN) [<ffff82d08036861e>] hap_track_dirty_vram+0x2b3/0x491 (XEN) [<ffff82d0802ee217>] dm.c#dm_op+0x472/0xd46 (XEN) [<ffff82d0802eec73>] do_dm_op+0x84/0xba (XEN) [<ffff82d080373722>] pv_hypercall+0x1af/0x4cd (XEN) [<ffff82d08037a422>] lstar_enter+0x112/0x120 (XEN) (XEN) CPU1 @ e008:ffff82d0802a853f (time.c#time_calibration_std_rendezvous+0x62/0x77) (XEN) CPU2 @ e008:ffff82d0802a853f (time.c#time_calibration_std_rendezvous+0x62/0x77) (XEN) CPU0 @ e008:ffff82d0802a8514 (time.c#time_calibration_std_rendezvous+0x37/0x77) (XEN) (XEN) **************************************** (XEN) Panic on CPU 3: (XEN) FATAL TRAP: vector = 2 (nmi) (XEN) [error_code=0000] (XEN) **************************************** AFAICT, it just takes longer than the timeout to do the sync. The code changes have been simple (patch attached), I don't think this is an actual deadlock but of course I could be wrong (at least printing out all of the lock / unlock calls in the new code looks like everything is being properly unlocked at the end). Any suggestion, as usual, appreciated. :) Thanks, Razvan Attachment:
change_altp2ms_under_hostp2m_lock.patch _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |