Re: [Xen-devel] [PATCH v4 0/3] x86: modify_ldt improvement, test, and config option
On Tue, Jul 28, 2015 at 8:01 PM, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx> wrote:
> On 07/28/2015 08:47 PM, Andrew Cooper wrote:
>>
>> On 29/07/2015 01:21, Andy Lutomirski wrote:
>>>
>>> On Tue, Jul 28, 2015 at 10:10 AM, Boris Ostrovsky
>>> <boris.ostrovsky@xxxxxxxxxx> wrote:
>>>>
>>>> On 07/28/2015 01:07 PM, Andy Lutomirski wrote:
>>>>>
>>>>> On Tue, Jul 28, 2015 at 9:30 AM, Andrew Cooper
>>>>> <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> I suspect that the set_ldt(NULL, 0) call hasn't reached Xen before
>>>>>> xen_free_ldt() is attempting to nab back the pages which Xen still
>>>>>> has mapped as an LDT.
>>>>>>
>>>>> I just instrumented it with yet more LSL instructions. I'm pretty
>>>>> sure that set_ldt really is clearing at least LDT entry zero.
>>>>> Nonetheless the free_ldt call still oopses.
>>>>>
>>>> Yes, I added some instrumentation to the hypervisor and we definitely
>>>> set LDT to NULL before failing.
>>>>
>>>> -boris
>>>
>>> Looking at map_ldt_shadow_page: what keeps shadow_ldt_mapcnt from
>>> getting incremented once on each CPU at the same time if both CPUs
>>> fault in the same shadow LDT page at the same time?
>>
>> Nothing, but that is fine. If a page is in use in two vcpus' LDTs, it
>> is expected to have a type refcount of 2.
>>
>>> Similarly, what
>>> keeps both CPUs from calling get_page_type at the same time and
>>> therefore losing track of the page type reference count?
>>
>> A cmpxchg() loop in the depths of __get_page_type().
>>
>>> I don't see why vmalloc or vm_unmap_aliases would have anything to do
>>> with this, though.
>
> So just for kicks I made lazy_max_pages() return 0 to free vmaps
> immediately, and the problem went away.

As far as I can tell, that affects TLB flushes but not unmaps. That
means my patch is totally bogus -- vm_unmap_aliases() *flushes*
aliases but isn't involved in removing them from the page tables.
That must be why xen_alloc_ldt and xen_set_ldt work today.

So what does flushing the TLB have to do with anything? The only
thing I can think of is that it might force some deferred hypercalls
out. I can reproduce this easily on UP, so IPIs aren't involved.

The other odd thing is that the oops seems to happen when clearing the
LDT and freeing the old one, but not when setting a new LDT and
freeing the old one. This is plausibly related to the lazy mode in
effect at the time, but I have no evidence for that.

Two more data points: putting xen_flush_mc before and after the
SET_LDT multicall has no effect. Putting flush_tlb_all() in
xen_free_ldt() doesn't help either, while vm_unmap_aliases() in the
exact same place does help.

--Andy
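
For readers unfamiliar with the pattern Andrew refers to, here is a
minimal standalone C11 sketch of how a cmpxchg()-style retry loop keeps
a type refcount consistent when two vCPUs race to take a reference.
This is illustrative only: Xen's real __get_page_type() also validates
type bits and checks for overflow, and the struct and field names below
are stand-ins, not Xen's definitions.

    #include <stdatomic.h>
    #include <stdio.h>

    struct page_info {
        _Atomic unsigned long type_info;   /* just a refcount in this toy */
    };

    static int get_page_type_sketch(struct page_info *page)
    {
        unsigned long x = atomic_load(&page->type_info);

        for (;;) {
            unsigned long nx = x + 1;   /* the real code also checks type
                                           bits and overflow here */
            /* If another vCPU changed type_info meanwhile, the cmpxchg
               fails, x is reloaded with the current value, and we retry,
               so neither CPU's increment can be lost. */
            if (atomic_compare_exchange_weak(&page->type_info, &x, nx))
                return 1;
        }
    }

    int main(void)
    {
        struct page_info pg = { .type_info = 0 };

        get_page_type_sketch(&pg);   /* e.g. vCPU 0 faults in the page */
        get_page_type_sketch(&pg);   /* e.g. vCPU 1 faults in the same page */
        printf("type refcount: %lu\n", atomic_load(&pg.type_info));  /* 2 */
        return 0;
    }

This is why the concurrent faults Andy asks about are harmless: both
increments land, and the page simply ends up with a type refcount of 2.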
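The lazy_max_pages() hack Boris describes amounts to roughly the
following (a sketch against mm/vmalloc.c; the real function returns a
threshold of roughly 32MB worth of pages scaled by the number of online
CPUs). Lazily unmapped vmap areas normally accumulate up to that
threshold and are then purged and TLB-flushed in one batch:

    /* mm/vmalloc.c (sketch, debugging hack only) */
    static unsigned long lazy_max_pages(void)
    {
        return 0;   /* purge, and TLB-flush, every vunmap immediately */
    }

That making the problem disappear points at the deferred batch flush,
not the unmap itself, which matches the observation above that
vm_unmap_aliases() only flushes aliases.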
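And the "exact same place" in the last paragraph is roughly this, a
sketch of xen_free_ldt() with the experimental call added, not a
verbatim patch; set_aliased_prot() is the existing helper that flips
the LDT's pages back to normal kernel mappings:

    /* arch/x86/xen/enlighten.c (sketch of the experiment) */
    static void xen_free_ldt(struct desc_struct *ldt, unsigned entries)
    {
        const unsigned entries_per_page = PAGE_SIZE / LDT_ENTRY_SIZE;
        unsigned i;

        /* The experiment: this call makes the oops go away... */
        vm_unmap_aliases();

        /* ...while flush_tlb_all() in this same spot does not, and
         * neither does flushing the multicall batch around SET_LDT. */

        /* Existing body: return each page of the vmalloc'd LDT to
         * PAGE_KERNEL so the kernel can free it. */
        for (i = 0; i < entries; i += entries_per_page)
            set_aliased_prot(ldt + i, PAGE_KERNEL);
    }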