
Re: [Xen-devel] Xen 4.5 random freeze question



On 11/14/2014 04:22 PM, Andrii Tseglytskyi wrote:
> On Fri, Nov 14, 2014 at 6:15 PM, Stefano Stabellini
> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>> On Fri, 14 Nov 2014, Andrii Tseglytskyi wrote:
>>> On Fri, Nov 14, 2014 at 5:22 PM, Stefano Stabellini
>>> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>>>> On Fri, 14 Nov 2014, Andrii Tseglytskyi wrote:
>>>>> On Fri, Nov 14, 2014 at 4:35 PM, Stefano Stabellini
>>>>> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>>>>>> On Fri, 14 Nov 2014, Andrii Tseglytskyi wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I observe a system freeze on the latest xen/master branch.
>>>>>>>
>>>>>>> My setup is:
>>>>>>>
>>>>>>> - Jacinto 6 evm board (OMAP5)
>>>>>>> - Latest Xen 4.5.0-rc2 as hypervisor
>>>>>>> - Linux 3.8 as dom0, running on 2 vcpus
>>>>>>> - Android 4.3 as domU (running on Linux kernel 3.8, 2 vcpus)
>>>>>>> - XSM feature is disabled
>>>>>>> - gcc version 4.7.3 20130328 (prerelease) (crosstool-NG
>>>>>>> linaro-1.13.1-4.7-2013.04-20130415 - Linaro GCC 2013.04) as cross
>>>>>>> compiler
>>>>>>>
>>>>>>> The freeze occurs at a random moment during creation of the domU.
>>>>>>> Even the Xen console may be unavailable after the freeze.
>>>>>>> Can someone suggest what it might be? Maybe some weak spots in the new
>>>>>>> code? Maybe the new gic, which was reworked a lot, or something else?
>>>>>>>
>>>>>>> Thank you in advance for any suggestions.
>>>>>>
>>>>>> Is this really 3.8 or 3.18?
>>>>>
>>>>> We have 3.8 in both dom0 and domU
>>>>>
>>>>>> 3.8 is pretty old and doesn't have any of
>>>>>> the fixes to be able to safely do dma involving guest pages to
>>>>>> non-coherent devices.
>>>>>
>>>>> This is a good point. Now we are migrating to 3.12 kernel in dom0. But
>>>>> Android will remain on 3.8. Will it help ?
>>>>> Maybe you can point me to any tree with proper DMA fixes? Note: if you
>>>>> are talking about SWIOTLB - we have your latest one, retrieved from
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git
>>>>> branch:swiotlb-xen-9.1
>>>>
>>>> The last and most stable series is:
>>>>
>>>> http://marc.info/?l=linux-kernel&m=141579241729749&w=2
>>>>
>>>
>>> Thanks  - I'll try this series anyway.
>>>
>>>> But thinking more about this, I doubt that it is a dma problem, because
>>>> you would most probably see various kinds of error messages, not a
>>>> freeze.
>>>>
>>>
>>> Agree.
>>>
>>>>
>>>>>> Where are you storing the guest disk images?
>>>>>
>>>>> SATA drive, dedicated to dom0, its controller has its own DMA
>>>>
>>>> Are they on file or on lvm volumes?
>>>
>>> Images are on file.
>>>
>>> Also note - the freeze depends on system load. It reproduces more
>>> frequently if I start Android + QNX + all frontend/backend drivers.
>>> Starting Android only, without any additional drivers, works more or less
>>> stably. It looks like the issue is reproduced when domU starts in parallel
>>> with the backend drivers in dom0.
>>> But the same works fine with the old Xen 4.4.
>>
>> In my experience freezes like the one you are describing are due to
>> interrupt related bugs or deadlocks. Both of them are hard to track
>> down. If you can reproduce it reliably maybe you could bisect it.
> 
> Agree. I suspect that the new gic series has an impact on this. In the
> few moments when the xen console is available after a freeze, I see that
> dom0 code is stuck around the kernel lock_release() or handle_IPI() functions

I would be surprised if the new GIC series impacted this code, as the
new driver is only compiled for arm64 (GICv3 doesn't exist on arm32).
There was some refactoring, though.

The interrupt management has also been reworked for Xen 4.5 to avoid
maintenance interrupts. I would take a look at that part.

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

