
Re: [Xen-devel] HVMs terminating as (null)



On Sat, Nov 23, 2013 at 8:26 PM, Andrew Cooper
<andrew.cooper3@xxxxxxxxxx> wrote:
> On 23/11/13 20:09, Steven Haigh wrote:
>> On 24/11/13 07:03, Andrew Cooper wrote:
>>> On 23/11/13 19:56, Steven Haigh wrote:
>>>> On 24/11/13 06:38, Steven Haigh wrote:
>>>>> On 24/11/13 06:27, Olaf Hering wrote:
>>>>>> On Sun, Nov 24, Steven Haigh wrote:
>>>>>>
>>>>>>> Running Xen 4.2.3 with all the current XSA fixes.
>>>>>>
>>>>>> How exactly did you start the guests?
>>>>>
>>>>> The DomUs were started with: xl create /etc/xen/<configfile>
>>>>>
>>>>>> Does 'ps faxu' show qemu processes for the listed domain_ids?
>>>>>> What is the 'xenstore-ls -f | sort' output?
>>>>>
>>>>> I'll have to check this when I manage to reproduce it. So far, I have
>>>>> been unable to get a reliable way to reproduce it. I managed to get a
>>>>> system to do it every time a HVM DomU was shutdown OR restarted - but
>>>>> after a reboot of the Dom0 I can't get it into that state again.
>>>>>
>>>>> As soon as I can get a system in this state again, I'll leave it to see
>>>>> what information I can extract.
>>>>
>>>> Ha! As always, as soon as I send this, I notice it's happened on a Dom0.
>>>>
>>>> # xl list
>>>> Name                                        ID   Mem VCPUs      State   Time(s)
>>>> Domain-0                                     0  1579     2     r-----   2731.3
>>>> planner.vm                                   1  1013     1     -b----    189.3
>>>> (null)                                       2     0     1     --psrd    301.1
>>>> tracker.vm                                   3  1013     2     -b----    834.4
>>>>
>>>> Attached is the output of:
>>>> # xl debug-keys q
>>>> # xl dmesg  > xen-dmesg.log
>>>> # gzip xen-dmesg.log
>>>
>>> Ok - from dmesg.
>>>
>>> (XEN) General information for domain 2:
>>> (XEN)     refcnt=1 dying=2 pause_count=2
>>> (XEN)     nr_pages=2 xenheap_pages=0 shared_pages=0 paged_pages=0
>>> dirty_cpus={} max_pages=262400
>>> (XEN)     handle=ef58ef1a-784d-4e59-8079-42bdee87f219 vm_assist=00000000
>>> (XEN)     paging assistance: hap refcounts translate external
>>> ...
>>> (XEN) Memory pages belonging to domain 2:
>>> (XEN)     DomPage 00000000000866e0: caf=00000001, taf=0000000000000000
>>> (XEN)     DomPage 00000000000866e1: caf=00000001, taf=0000000000000000
>>> (XEN)     PoD entries=0 cachesize=0
>>>
>>>
>>> So there are indeed two outstanding pages causing this domain to become
>>> a zombie.  They are normal pages, with 1 outstanding ref.
>>>
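As a side note on reading that dump: the caf/taf pair printed for each
DomPage appears to correspond to Xen's page->count_info and
page->u.inuse.type_info, so caf=00000001 is one outstanding general
reference and taf=0 means no typed (writable/page-table) reference is
held. A minimal sketch of that reading follows; the PGC_COUNT_MASK width
below is an assumption, since the real mask differs between Xen versions.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: treat the low 24 bits of count_info as the plain reference
 * count; the actual PGC_count_mask differs between Xen versions. */
#define PGC_COUNT_MASK  ((UINT64_C(1) << 24) - 1)

static void decode_dompage(uint64_t caf, uint64_t taf)
{
    /* caf ~ page->count_info: general references plus PGC_* flag bits.   */
    printf("general refs: %" PRIu64 "\n", caf & PGC_COUNT_MASK);
    /* taf ~ page->u.inuse.type_info: page type and typed-reference count. */
    printf("type_info:    %#" PRIx64 "\n", taf);
}

int main(void)
{
    decode_dompage(UINT64_C(0x1), UINT64_C(0x0));  /* the two pages above */
    return 0;
}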
>>> Can you collect "xl debug-keys g" as well?
>>
>> Sure - attached.
>
> (XEN)       -------- active --------       -------- shared --------
> (XEN) [ref] localdom mfn      pin          localdom gmfn     flags
> (XEN) grant-table for remote domain:    2 (v1)
> (XEN) [16302]        0 0x0866e1 0x00000001          0 0x0064e1 0x19
> (XEN) [16320]        0 0x0866e0 0x00000001          0 0x0064e0 0x19
>
> Ok - so domain 2 has two outstanding grants.  This explains why it is a
> zombie.
>
> Both these grants are GTF_writing | GTF_reading | GTF_permit_access, but
> seemingly unmapped.
>
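To unpack the flags value in that dump: 0x19 is the shared grant entry's
flag word, and with the bit layout from xen/include/public/grant_table.h
(v1 entries) it decodes to GTF_permit_access plus the GTF_reading and
GTF_writing status bits that the hypervisor sets while a grant is mapped.
A small decode sketch, with the constants taken from my reading of the
public header, so double-check them against your tree:

#include <stdint.h>
#include <stdio.h>

/* Bit layout of v1 shared grant entry flags, per my reading of
 * xen/include/public/grant_table.h -- verify against your tree. */
#define GTF_permit_access  1U          /* type: peer may map this frame        */
#define GTF_type_mask      3U          /* low two bits encode the grant type   */
#define GTF_readonly       (1U << 2)   /* subflag: mappings must be read-only  */
#define GTF_reading        (1U << 3)   /* status: currently mapped for reading */
#define GTF_writing        (1U << 4)   /* status: currently mapped for writing */

static void decode_grant_flags(uint16_t flags)
{
    printf("flags=%#x:", flags);
    if ((flags & GTF_type_mask) == GTF_permit_access)
        printf(" permit_access");
    if (flags & GTF_readonly)
        printf(" readonly");
    if (flags & GTF_reading)
        printf(" reading");
    if (flags & GTF_writing)
        printf(" writing");
    printf("\n");
}

int main(void)
{
    decode_grant_flags(0x19);   /* the value shown for both grants above */
    return 0;
}

Since the reading/writing status bits are cleared again on unmap, seeing
them still set lines up with the two outstanding page references in the
earlier 'q' dump.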

I didn't go through the whole thread, but is there any chance you upgraded
your Dom0 kernel?

It is possible that you are missing some upstream patches.

Check out <527B8465.6050901@xxxxxxxxxx>

Wei.

> I will have to defer to someone who knows the grant code better.  Is it
> possible for a domain to be a zombie just because it has two grants it
> hasn't manually invalidated?
>
> ~Andrew
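On that question: the usual way grants keep a dying domain around is via
the mapping side rather than the granting side. When another domain
(dom0's backend here) maps one of the granted frames, the hypervisor takes
a page reference that is only dropped on unmap, so until the backend lets
go the page cannot be relinquished, nr_pages never reaches zero, and the
domain stays a zombie. A deliberately simplified model of that lifecycle,
not actual Xen code:

#include <stdio.h>

/* Deliberately simplified model of the grant-map lifecycle, not Xen code. */
struct page {
    unsigned long refs;    /* stand-in for the count_info reference count */
};

/* Mapping a grant takes a reference on the granting domain's page
 * (get_page() in Xen); it is only released when the mapper unmaps. */
static void map_grant(struct page *pg)   { pg->refs++; }
static void unmap_grant(struct page *pg) { pg->refs--; }

int main(void)
{
    struct page granted = { .refs = 0 };

    map_grant(&granted);    /* backend in dom0 maps the granted frame */
    /* The granting domain is destroyed; its own references are dropped
     * during teardown, but the mapping's reference remains until unmap. */
    printf("outstanding refs: %lu -> page stays allocated, domain is a zombie\n",
           granted.refs);

    unmap_grant(&granted);  /* backend finally unmaps */
    printf("outstanding refs: %lu -> page can be freed, teardown completes\n",
           granted.refs);
    return 0;
}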
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel