
Re: [Xen-devel] xenstored crashes with SIGSEGV

Hello Ian,

On 12.12.2014 17:32, Ian Campbell wrote:
> On Fri, 2014-12-12 at 17:14 +0100, Philipp Hahn wrote:
>> We did enable tracing and now have the xenstored-trace.log of one crash:
>> It contains 1.6 billion lines and is 83 GiB.
>> It just shows xenstored crashing on TRANSACTION_START.
>> Is there some tool to feed that trace back into a newly launched xenstored?
> Not that I know of I'm afraid.

Okay, then I have to continue with my own tool.
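
As a starting point for such a tool, here is a minimal sketch that streams
the trace and keeps only per-operation counts plus the last few records
before the crash. The line layout it expects ("IN|OUT <conn> <YYYYMMDD>
<HH:MM:SS> <OP> (<payload>)") is an assumption and has to be checked against
the real xenstored-trace.log; parse_line and summarize are hypothetical
helper names, not part of xenstored.

```python
import re
from collections import Counter, deque

# ASSUMPTION: one record per line, shaped like
#   IN 0x1f3a 20141212 17:14:00 TRANSACTION_START ( )
# Adjust the pattern if the real trace file differs.
TRACE_LINE = re.compile(
    r"^(IN|OUT)\s+(\S+)\s+(\d{8})\s+(\d{2}:\d{2}:\d{2})\s+(\w+)\s+\((.*)\)\s*$"
)


def parse_line(line):
    """Split one trace line into its fields, or return None if it does not match."""
    m = TRACE_LINE.match(line)
    if m is None:
        return None
    direction, conn, date, time, op, payload = m.groups()
    return {
        "dir": direction,
        "conn": conn,
        "date": date,
        "time": time,
        "op": op,
        "payload": payload.strip(),
    }


def summarize(lines, keep=20):
    """Count incoming operations and remember the last `keep` parsed records.

    `lines` may be any iterable of strings, so an open file can be passed
    directly and is read line by line without loading it into memory.
    """
    ops = Counter()
    tail = deque(maxlen=keep)
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # skip lines that do not look like trace records
        if rec["dir"] == "IN":
            ops[rec["op"]] += 1
        tail.append(rec)
    return ops, list(tail)
```

Passing the open file object (opened with errors="replace") to summarize()
keeps memory use constant even for an 83 GiB trace; the returned tail shows
which connection issued the final TRANSACTION_START.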

> Do you get a core dump when this happens? You might need to fiddle with
> ulimits (some distros disable by default). IIRC there is also some /proc
> knob which controls where core dumps go on the filesystem.

Not for that specific trace: We first enabled core file generation, but
only then discovered that this alone is not enough. Then we enabled
--trace-file, but on that host something reset the core file settings.
We have hopefully fixed all hosts now, so on the next crash we should get
both a core file and the trace.
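
To check that the core file settings are still in place on a host, something
like the following sketch can be used. It reads the core size limit via
RLIMIT_CORE, the per-process view in /proc/<pid>/limits, and the
/proc/sys/kernel/core_pattern knob (Linux only). The helper names are made up
for illustration.

```python
import resource
from pathlib import Path


def describe(limit):
    """Render an rlimit value (in bytes) as text."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def own_core_limit():
    """Soft and hard core-size limits of the *current* process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    return describe(soft), describe(hard)


def parse_core_limit(limits_text):
    """Extract (soft, hard) core-size limits from the text of /proc/<pid>/limits."""
    prefix = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            fields = line[len(prefix):].split()
            return fields[0], fields[1]
    return None


def core_pattern(path="/proc/sys/kernel/core_pattern"):
    """Return the kernel's core file name pattern, or None if it is not readable here."""
    p = Path(path)
    return p.read_text().strip() if p.exists() else None


if __name__ == "__main__":
    print("this process:", own_core_limit())
    print("core_pattern:", core_pattern())
```

The limits that matter are those of the running xenstored, not of the shell
doing the check, so parse_core_limit() should be fed the contents of
/proc/<pid>/limits for the daemon's pid. A soft limit of 0 there means no
core file will be written.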

>> My hope would be that xenstored crashes again, because then we could use
>> all those other tools like valgrind more easily.
> That would be handy. My fear would be that this bug is likely to be a
> race condition of some sort, and the granularity/accuracy of the
> playback would possibly need to be quite high to trigger the issue.

cxenstored looks single-threaded to me, or am I wrong?

>>> Do you rm the xenstore db on boot? It might have a persistent
>>> corruption, aiui most folks using C xenstored are doing so or even
>>> placing it on a tmpfs for performance reasons.
>> We're using a tmpfs for /var/lib/xenstored/, as we had a severe
>> performance problem with something updating
>> /local/domain/0/backend/console/*/0/uuid too often, which put xenstored
>> in permanent D state.
> But this is just a process crashing and not the whole host so you still
> have the db file at the point of the crash?

Yes: Running xs_tdb_dump or tdb_dump on it didn't show anything
obviously wrong.

> It might be interesting to see what happens if you preserve the db and
> reboot arranging for the new xenstored to start with the old file. If
> the corruption is part of the file then maybe it can be induced to crash
> again more quickly.

Thanks for the pointer, will try.

Thank you again for your fast reply.
Philipp Hahn
