Re: [Xen-devel] [xen-unstable test] 113562: regressions - FAIL

On 18/09/17 13:05, George Dunlap wrote:
> On 09/18/2017 11:46 AM, Roger Pau Monné wrote:
>> On Mon, Sep 18, 2017 at 11:15:03AM +0100, George Dunlap wrote:
>>> On 09/18/2017 10:45 AM, Roger Pau Monné wrote:
>>>> On Mon, Sep 18, 2017 at 10:37:58AM +0100, Wei Liu wrote:
>>>>> On Mon, Sep 18, 2017 at 08:36:03AM +0000, osstest service owner wrote:
>>>>>> flight 113562 xen-unstable real [real]
>>>>>> http://logs.test-lab.xenproject.org/osstest/logs/113562/
>>>>>> Regressions :-(
>>>>>> Tests which did not succeed and are blocking,
>>>>>> including tests which could not be run:
>>>>>>  test-amd64-amd64-xl-credit2  15 guest-saverestore  fail REGR. vs. 113387
>>>>> There appears to be a bug:
>>>>> http://logs.test-lab.xenproject.org/osstest/logs/113562/test-amd64-amd64-xl-credit2/serial-godello0.log
>>>>> Sep 18 01:14:28.803062 (XEN) Xen BUG at spinlock.c:47
>>>> Seems to be caused by budget_lock sometimes being locked with irqsave
>>>> and sometimes not.
>>> Just wondering where you're getting the budget lock from?  The call
>>> stack in that link makes it look like it's the RCU clean-up triggering a
>>> domain destroy.  (Haven't looked deeper into the specific line numbers.)
>> Just skimmed over the commit and jumped to conclusions too fast. As
>> you mention later the issue is calling xfree with interrupts disabled
>> in csched2_free_domdata.
>> I would prefer budget_lock to be always locked with the
>> irqsave/restore variant to make what you mention above more obvious,
>> but that's just a question of taste.
> I *think* at some point in the past we had a discussion about this and
> someone (perhaps Jan?) said if we always know the irqs are disabled we
> shouldn't call the _irqsave() version, to save cpu cycles.
> Personally I think the ASSERT()s are clear enough to people familiar
> with the scheduling code.

Why don't we add _irqoff variants of the locks containing the ASSERTion
that interrupts are really off? This would save the additional
instructions of the irqsave/restore variants and make it very clear that
no violation of the lock interface is happening.
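
As a rough userspace sketch of the idea (not Xen code: every mock_* name
below is a hypothetical stand-in, the interrupt flag is simulated with a
plain bool, and assert() plays the role of ASSERT()), an _irqoff variant
would check the interrupt state and then take the lock without saving or
restoring flags:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simulated interrupt-enable flag; a hypervisor would query the CPU. */
static bool mock_irqs_enabled = true;

static void mock_irq_disable(void) { mock_irqs_enabled = false; }
static void mock_irq_enable(void)  { mock_irqs_enabled = true; }

/* Minimal test-and-set spinlock, used only to make the sketch runnable. */
typedef struct { atomic_flag held; } mock_spinlock_t;
#define MOCK_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

static void mock_spin_lock(mock_spinlock_t *l)
{
    /* Spin until the flag was previously clear, i.e. we acquired it. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

static void mock_spin_unlock(mock_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Hypothetical _irqoff variants: the caller promises interrupts are
 * already disabled, so nothing is saved or restored. The assertion
 * documents and checks that promise.
 */
static void mock_spin_lock_irqoff(mock_spinlock_t *l)
{
    assert(!mock_irqs_enabled);
    mock_spin_lock(l);
}

static void mock_spin_unlock_irqoff(mock_spinlock_t *l)
{
    mock_spin_unlock(l);
    assert(!mock_irqs_enabled); /* interrupt state must be unchanged */
}
```

A caller that already runs with interrupts off would then use
mock_spin_lock_irqoff()/mock_spin_unlock_irqoff() directly, and a caller
that gets the context wrong trips the assertion in a debug build instead
of silently mixing lock variants.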


