Re: [Xen-devel] [PATCH v4 0/3] x86: modify_ldt improvement, test, and config option
On 29/07/15 06:28, Andy Lutomirski wrote:
> On Tue, Jul 28, 2015 at 8:01 PM, Boris Ostrovsky
> <boris.ostrovsky@xxxxxxxxxx> wrote:
>> On 07/28/2015 08:47 PM, Andrew Cooper wrote:
>>> On 29/07/2015 01:21, Andy Lutomirski wrote:
>>>> On Tue, Jul 28, 2015 at 10:10 AM, Boris Ostrovsky
>>>> <boris.ostrovsky@xxxxxxxxxx> wrote:
>>>>> On 07/28/2015 01:07 PM, Andy Lutomirski wrote:
>>>>>> On Tue, Jul 28, 2015 at 9:30 AM, Andrew Cooper
>>>>>> <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>>>> I suspect that the set_ldt(NULL, 0) call hasn't reached Xen before
>>>>>>> xen_free_ldt() is attempting to nab back the pages which Xen still has
>>>>>>> mapped as an LDT.
>>>>>>>
>>>>>> I just instrumented it with yet more LSL instructions.  I'm pretty
>>>>>> sure that set_ldt really is clearing at least LDT entry zero.
>>>>>> Nonetheless the free_ldt call still oopses.
>>>>>>
>>>>> Yes, I added some instrumentation to the hypervisor and we definitely set
>>>>> LDT to NULL before failing.
>>>>>
>>>>> -boris
>>>> Looking at map_ldt_shadow_page: what keeps shadow_ldt_mapcnt from
>>>> getting incremented once on each CPU at the same time if both CPUs
>>>> fault in the same shadow LDT page at the same time?
>>> Nothing, but that is fine.  If a page is in use in two vcpus LDTs, it is
>>> expected to have a type refcount of 2.
>>>
>>>> Similarly, what
>>>> keeps both CPUs from calling get_page_type at the same time and
>>>> therefore losing track of the page type reference count?
>>> a cmpxchg() loop in the depths of __get_page_type().
>>>
>>>> I don't see why vmalloc or vm_unmap_aliases would have anything to do
>>>> with this, though.
>>
>> So just for kicks I made lazy_max_pages() return 0 to free vmaps immediately
>> and the problem went away.
> As far as I can tell, this affects TLB flushes but not unmaps.  That
> means that my patch is totally bogus -- vm_unmap_aliases() *flushed*
> aliases but isn't involved in removing them from the page tables.
> That must be why xen_alloc_ldt and xen_set_ldt work today.
>
> So what does flushing the TLB have to do with anything?  The only
> thing I can think of is that it might force some deferred hypercalls
> out.  I can reproduce this easily on UP, so IPIs aren't involved.
>
> The other odd thing is that it seems like this happens when clearing
> the LDT and freeing the old one but not when setting the LDT and
> freeing the old one.  This is plausibly related to the lazy mode in
> effect at the time, but I have no evidence for that.
>
> Two more data points: Putting xen_flush_mc before and after the
> SET_LDT multicall has no effect.  Putting flush_tlb_all() in
> xen_free_ldt doesn't help either, while vm_unmap_aliases() in the
> exact same place does help.

FYI, I have got a repro now and am investigating.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
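
[Editorial illustration, not part of the original message.] The thread
mentions that a cmpxchg() loop inside __get_page_type() is what stops two
CPUs from losing track of the page type reference count. The sketch below
shows that general compare-and-swap retry pattern using C11 atomics. It is
not Xen's code: struct demo_page, demo_get_type_ref() and
demo_put_type_ref() are hypothetical names invented for this example, and
the real __get_page_type() also handles type validation and flag bits that
are omitted here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a page's type reference count (not Xen's page_info). */
struct demo_page {
    _Atomic uint32_t type_count;
};

/* Take one type reference. Returns false rather than wrapping on overflow. */
static bool demo_get_type_ref(struct demo_page *pg)
{
    uint32_t old = atomic_load(&pg->type_count);

    for (;;) {
        if (old == UINT32_MAX)
            return false;
        /*
         * The swap only succeeds if the count still equals `old`. If another
         * CPU changed it first, `old` is refreshed with the current value and
         * the loop retries, so no increment is lost.
         */
        if (atomic_compare_exchange_weak(&pg->type_count, &old, old + 1))
            return true;
    }
}

/* Drop one type reference using the same retry pattern. */
static void demo_put_type_ref(struct demo_page *pg)
{
    uint32_t old = atomic_load(&pg->type_count);

    for (;;) {
        assert(old > 0); /* dropping a reference that was never taken is a bug */
        if (atomic_compare_exchange_weak(&pg->type_count, &old, old - 1))
            return;
    }
}
```

With this pattern, two vCPUs that each take a reference on the same page
leave the count at 2, which matches the expectation stated in the thread
for a page in use in two vCPUs' LDTs.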