Re: [Xen-devel] [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
On 08/29/2017 04:07 AM, Jan Beulich wrote:
>>>> On 28.08.17 at 17:36, <boris.ostrovsky@xxxxxxxxxx> wrote:
>> On 08/28/2017 10:52 AM, Jan Beulich wrote:
>>>>>> On 28.08.17 at 16:24, <boris.ostrovsky@xxxxxxxxxx> wrote:
>>>>>> As for periodically testing process_pending_softirqs() we may still want
>>>>>> to do this in alloc_heap_pages(), even without CONFIG_SCRUB_DEBUG.
>>>>> For my taste, alloc_heap_pages() is the wrong place for such
>>>>> calls.
>>>> But the loop is in alloc_heap_pages() --- where else would you be testing?
>>> It can only reasonably be the callers of alloc_heap_pages() imo.
>>> A single call to it should never trigger the watchdog,
>> check_one_page() is rather slow so for a large order allocation even
>> with clean heap the 'for' loop may take quite some time. Whether it
>> could trip the watchdog -- I don't know.
> If that was a problem, we'd have to think about shortening the
> loop. I stand by my assertion that nowhere down from
> alloc_heap_pages() should be any invocation of
> process_pending_softirqs() - it is simply too risky, as we don't
> know what state we're in. One thing I could imagine to do is not
> check the entire page, but (randomly?) pick a couple of locations
> to check. But first of all we really need to be clear about whether
> it's really a single alloc_heap_pages() invocation that trips the
> watchdog, or whether something can be done about it in the
> caller(s).

At least one of the crashes was from alloc_chunk()->free_heap_pages(),
i.e. not from inside alloc_heap_pages()'s loop. My proposal was not
necessarily based on the specific crashes in this flight (this issue
will be addressed by the patches I sent yesterday) but rather as a
general suggestion. But I understand that calling
process_pending_softirqs() from alloc_heap_pages() may not be a great
idea.

I am somewhat puzzled though by the fact that I haven't seen this in my
testing --- I was creating/destroying very large guests (> 1TB) in
parallel so there must have been loops over high orders and I never had
a watchdog go off. And my dom0s were quite large too while the one in
this flight is only 512M.

-boris

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel