Re: [Xen-devel] Strange kernel BUG() on PV DomU boot
>>> On 22.06.12 at 14:26, Joanna Rutkowska <joanna@xxxxxxxxxxxxxxxxxxxxxx>
>>> wrote:
> On 06/22/12 14:21, Joanna Rutkowska wrote:
>> Hello,
>>
>> From time to time (every several weeks or even less) I run into a
>> strange Dom0 kernel BUG() that manifests itself with the following
>> message (see the end of the message). The Dom0 and VM kernels are 3.2.7
>> pvops, and the Xen hypervisor is 4.1.2 both with only some minor,
>> irrelevant (I think) modifications for Qubes.
>>
>> The bug is very hard to reproduce, but once this BUG() starts being
>> signaled, it consistently prevents me from starting any new VMs in the
>> system (e.g. I have tried over a dozen times now, and the VM boot fails
>> every time).
>>
>> The following lines in the VM kernel are responsible for signaling the
>> BUG():
>>
>>     if (HYPERVISOR_vcpu_op(VCPUOP_initialise, cpu, ctxt))
>>         BUG();
>>
>> ...yet there is nothing in xl dmesg that would provide more info on
>> why this hypercall fails. Ah, that's because there are no printk's in
>> the hypercall code:
>>
>>     case VCPUOP_initialise:
>>         if ( v->vcpu_info == &dummy_vcpu_info )
>>             return -EINVAL;
>>
>>         if ( (ctxt = xmalloc(struct vcpu_guest_context)) == NULL )
>>             return -ENOMEM;
>>
>>         if ( copy_from_guest(ctxt, arg, 1) )
>>         {
>>             xfree(ctxt);
>>             return -EFAULT;
>>         }
>>
>>         domain_lock(d);
>>         rc = -EEXIST;
>>         if ( !v->is_initialised )
>>             rc = boot_vcpu(d, vcpuid, ctxt);
>>         domain_unlock(d);
>>
>>         xfree(ctxt);
>>         break;
>>
>> So, looking at the above, it seems like it might be failing because
>> xmalloc() fails; however, Xen seems to have enough memory as reported by
>> xl info:
>>
>> total_memory : 8074
>> free_memory : 66
>> free_cpus : 0
>>
>> Any ideas what might be the cause?
>>
>> FWIW, below the actual oops message.
>>
>
> Ok, it seems like this was an out-of-memory condition indeed, because
> once I did:
>
> xl mem-set 0 1800m
>
> and then quickly started a VM, it booted fine...

Had you looked at the error value in %rax, you would also
have seen that it's -ENOMEM. I suppose the problem here is
that a multi-page allocation was needed, yet only single
pages were available.

> Is there any proposal of how to handle out of memory conditions in Xen
> (like this one, as well as e.g. SWIOTLB problem) in a more user friendly
> way?

In 4.2, I hope we managed to remove all runtime allocations
larger than a page, so the particular situation here should not
arise anymore.

As to being more user-friendly: what do you have in mind? An error
is an error (and converting this to a meaningful, user visible
message is the responsibility of the entity receiving the error).
In the case at hand, printing an error message wouldn't
meaningfully increase user-friendliness imo.

> Any recommendations regarding the preferred minimum Xen free memory, as
> reported by xl info, that should be preserved in order to assure Xen
> runs smoothly?

In pre-4.2 Xen, there's not much you can do when memory gets
fragmented (otherwise you'd have to keep more than half the
memory in the box in the hypervisor). With multi-page runtime
allocations gone, you should be fine leaving just a minimal amount
to the hypervisor.

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel