[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] nvmx deadlock with MSR bitmaps
On 12.03.2020 11:56, Roger Pau Monné wrote: > On Thu, Mar 12, 2020 at 09:59:48AM +0100, Jan Beulich wrote: >> On 11.03.2020 19:04, Andrew Cooper wrote: >>> Specifically, this is a switch from an HVM vcpu, to a PV vcpu, where the >>> mapcache code tries to access the per-domain mappings on the HVM monitor >>> table. It ends up trying to recursively acquire the mapcache lock while >>> trying to walk %cr2 to identify the source of the fault. >>> >>> For nvmx->msr_merged, this needs to either be a xenheap page, or a >>> globally mapped domheap page. I'll draft a patch in a moment. >>> >>> For map_domain_page(), is there anything we can rationally do to assert >>> that it isn't called in the middle of a context switch? This is the >>> kind of thing which needs to blow up reliably in a debug build. >> >> Well, it's not inherently unsafe to do, it's just that >> mapcache_current_vcpu() would need to avoid using current from >> context_switch()'s call to set_current() through to >> __context_switch()'s call to write_ptbase(). A possible >> detection (if we don't want to make the case work) would >> seem to be ASSERT(current == this_cpu(curr_vcpu)). But of course >> there's also this extra logic in mapcache_current_vcpu() to deal >> with a PV vCPU having a null v->arch.guest_table, which I'm once >> again struggling to see under what conditions it might happen. >> The Dom0 building case can't be meant with there being >> mapcache_override_current() on that path. I'm wondering if the >> comment there is misleading and it's really to cover the case >> where, coming from a PV vCPU, current was already set to the >> idle vCPU by context_switch() (which would have a null >> v->arch.guest_table) - I wouldn't call this "we are running a >> paravirtualised guest". But in such a case the logic here would >> simply be a (too) special case of what you're describing as the >> issue with nVMX. > > Looking at the code in context_switch and __context_switch would it be > possible to set current to the next vCPU after all the from hooks have > been called? > > Ie: set_current could be moved into __context_switch after the call to > pd->arch.ctxt_switch->from(p). > > I'm also not sure I understand the difference between context_switch > and __context_switch, and how are callers supposed to use them. The latter does the full switch, while the former only records that a switch is pending, and does the parts that can't be deferred. For this reason set_current() can't be moved to __context_switch() without applying extreme care, I'm afraid. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |