
Re: [Xen-devel] Xen crashing when killing a domain with no VCPUs allocated



Hi George,

On 07/21/2014 11:33 AM, George Dunlap wrote:
> On 07/18/2014 09:26 PM, Julien Grall wrote:
>>
>> On 18/07/14 17:39, Ian Campbell wrote:
>>> On Fri, 2014-07-18 at 14:27 +0100, Julien Grall wrote:
>>>> Hi all,
>>>>
>>>> I've been playing with the function alloc_vcpu on ARM, and I hit one
>>>> case where this function can fail.
>>>>
>>>> During domain creation, the toolstack will call DOMCTL_max_vcpus, which
>>>> may fail, for instance because alloc_vcpu didn't succeed. In this case,
>>>> the toolstack will call DOMCTL_domaindestroy, and I got the stack trace
>>>> below.
>>>>
>>>> It can be reproduced on Xen 4.5 (and I also suspect Xen 4.4) by
>>>> returning an error in vcpu_initialize.
>>>>
>>>> I'm not sure how to correctly fix it.
>>> I think a simple check at the head of the function would be ok.
>>>
>>> Alternatively perhaps in sched_move_domain, which could either detect
>>> this or could detect a domain in pool0 being moved to pool0 and short
>>> circuit.
>> I was thinking about the small fix below. If it's fine for everyone, I
>> can send a patch next week.
>>
>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>> index e9eb0bc..c44d047 100644
>> --- a/xen/common/schedule.c
>> +++ b/xen/common/schedule.c
>> @@ -311,7 +311,7 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
>>       }
>>         /* Do we have vcpus already? If not, no need to update node-affinity */
>> -    if ( d->vcpu )
>> +    if ( d->vcpu && d->vcpu[0] != NULL )
>>           domain_update_node_affinity(d);
> 
> So is the problem that we're allocating the vcpu array area, but not
> putting any vcpus in it?

Yes.


> Overall it seems like those checks for the existence of vcpus should be
> moved into domain_update_node_affinity().  The ASSERT() there I think is
> just a sanity check to make sure we're not getting a ridiculous result
> out of our calculation; but of course if there actually are no vcpus,
> it's not ridiculous at all.
> 
> One solution might be to change the ASSERT to
> ASSERT(!cpumask_empty(dom_cpumask) || !d->vcpu || !d->vcpu[0]).  Then we
> could probably even remove the d->vcpu conditional when calling it.

This solution also works for me. Which change do you prefer?
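
For comparison, here is a minimal standalone sketch of the two options. It is
not Xen code: struct vcpu and struct domain are simplified stand-ins, the
helper names domain_has_vcpus and check_node_affinity are hypothetical, and
the mask_empty flag stands in for cpumask_empty(dom_cpumask).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for Xen's structures (assumption, not the real layout). */
struct vcpu {
    int vcpu_id;
};

struct domain {
    struct vcpu **vcpu;      /* array may be allocated yet hold only NULL slots */
    unsigned int max_vcpus;
};

/*
 * Option 1: the guard proposed for sched_move_domain().
 * The array pointer alone is not enough; slot 0 must be populated too.
 */
static bool domain_has_vcpus(const struct domain *d)
{
    return d->vcpu != NULL && d->vcpu[0] != NULL;
}

/*
 * Option 2: the relaxed sanity check proposed for
 * domain_update_node_affinity(). An empty mask is only an error when the
 * domain actually has at least one vcpu.
 */
static void check_node_affinity(const struct domain *d, bool mask_empty)
{
    assert(!mask_empty || !d->vcpu || !d->vcpu[0]);
}
```

With the first option the caller skips the affinity update entirely; with the
second the update runs but tolerates an empty result for a domain that has no
vcpus.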

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

