Re: [Xen-devel] [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass

To: Jan Beulich <JBeulich@xxxxxxxx>
From: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
Date: Tue, 29 Aug 2017 08:45:53 -0400
Cc: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>, osstest-admin@xxxxxxxxxxxxxx, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>
Delivery-date: Tue, 29 Aug 2017 12:46:18 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 08/29/2017 04:07 AM, Jan Beulich wrote:
>>>> On 28.08.17 at 17:36, <boris.ostrovsky@xxxxxxxxxx> wrote:
>> On 08/28/2017 10:52 AM, Jan Beulich wrote:
>>>>>> On 28.08.17 at 16:24, <boris.ostrovsky@xxxxxxxxxx> wrote:
>>>>>> As for periodically testing process_pending_softirqs() we may still want
>>>>>> to do this in alloc_heap_pages(), even without CONFIG_SCRUB_DEBUG.
>>>>> For my taste, alloc_heap_pages() is the wrong place for such
>>>>> calls.
>>>> But the loop is in alloc_heap_pages() --- where else would you be testing?
>>> It can only reasonably be the callers of alloc_heap_pages() imo.
>>> A single call to it should never trigger the watchdog, 
>> check_one_page() is rather slow so for a large order allocation even
>> with clean heap the 'for' loop may take quite some time. Whether it
>> could trip the watchdog -- I don't know.
> If that was a problem, we'd have to think about shortening the
> loop. I stand by my assertion that nowhere down from
> alloc_heap_pages() should be any invocation of
> process_pending_softirqs() - it is simply too risky, as we don't
> know what state we're in. One thing I could imagine to do is not
> check the entire page, but (randomly?) pick a couple of locations
> to check. But first of all we really need to be clear about whether
> it's really a single alloc_heap_pages() invocation that trips the
> watchdog, or whether something can be done about it in the
> caller(s).
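
For illustration, the spot-check idea could look roughly like the sketch
below. It is plain C that builds outside Xen; check_page_sampled(),
next_random() and the SCRUB_PATTERN value used here are assumptions made
for the example, not the existing check_one_page() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE      4096u
#define WORDS_PER_PAGE (PAGE_SIZE / sizeof(uint64_t))
/* Stand-in value: the real pattern is whatever the scrubber writes. */
#define SCRUB_PATTERN  0xc2c2c2c2c2c2c2c2ULL

/* Hypothetical xorshift64 generator; the state must never be zero. */
static uint64_t next_random(uint64_t *state)
{
    uint64_t x = *state;

    assert(x != 0);
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;

    return x;
}

/*
 * Check nr_samples randomly chosen 64-bit words of a page instead of
 * all of them. Returns true if every sampled word holds the pattern.
 */
static bool check_page_sampled(const uint64_t *page,
                               unsigned int nr_samples,
                               uint64_t *rng_state)
{
    for ( unsigned int i = 0; i < nr_samples; i++ )
    {
        size_t idx = next_random(rng_state) % WORDS_PER_PAGE;

        if ( page[idx] != SCRUB_PATTERN )
            return false;
    }

    return true;
}
```

The cost per page drops from 512 comparisons to nr_samples, but a page
with only a few corrupted words can pass unnoticed, so this trades
coverage for time.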

At least one of the crashes was from alloc_chunk()->free_heap_pages(),
i.e. not from inside alloc_heap_pages()'s loop. My proposal was not
necessarily based on the specific crashes in this flight (that issue
will be addressed by the patches I sent yesterday) but was meant as a
general suggestion. But I understand that calling
process_pending_softirqs() from alloc_heap_pages() may not be a great
idea.
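
As a rough sketch of the caller-side alternative: the caller splits a
large request into bounded chunks and yields between them, so nothing
below the allocator entry point has to process softirqs. Everything
here is a stand-in written for the example: populate_in_chunks(),
alloc_pages_stub(), yield_point_stub() (standing in for
process_pending_softirqs()) and MAX_CHUNK_ORDER are hypothetical names,
not existing Xen code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical bound: at most 2^9 pages per allocator call. */
#define MAX_CHUNK_ORDER 9u

static unsigned long pages_allocated;
static unsigned long yields_taken;

/* Stand-in for one bounded call into the allocator. */
static int alloc_pages_stub(unsigned int order)
{
    pages_allocated += 1UL << order;
    return 0;
}

/* Stand-in for process_pending_softirqs(), called only by the caller. */
static void yield_point_stub(void)
{
    yields_taken++;
}

/* Allocate 2^order pages as a series of chunks, yielding after each. */
static int populate_in_chunks(unsigned int order)
{
    unsigned int chunk_order =
        order < MAX_CHUNK_ORDER ? order : MAX_CHUNK_ORDER;
    unsigned long nr_chunks;

    assert(order < sizeof(unsigned long) * CHAR_BIT);
    nr_chunks = 1UL << (order - chunk_order);

    for ( unsigned long i = 0; i < nr_chunks; i++ )
    {
        int rc = alloc_pages_stub(chunk_order);

        if ( rc )
            return rc;

        /* Safe here: the caller knows what state it is in. */
        yield_point_stub();
    }

    return 0;
}
```

This only applies where the caller does not need one physically
contiguous range, since the chunks come from separate allocator calls.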

I am somewhat puzzled though by the fact that I haven't seen this in my
testing --- I was creating/destroying very large guests (> 1TB) in
parallel so there must have been loops over high orders and I never had
a watchdog go off. And my dom0s were quite large too while the one in
this flight is only 512M.

-boris

