
Re: [Xen-devel] General protection fault in netback

2012/2/21 Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>:

AS>> With custom-built kernel, I didn't yet see any GPF, but screen garbling
AS>> happens almost every time when a DomU is starting or stopping:
AS>> the whole graphical is painted with either black or not-so-random garbage.

KRW> So... I am curious, what graphic card do you have and do you get any of
KRW> these Red Hat BZs?  RH BZ# 742032, 787403, and 745574?
KRW> There is a bug in 3.2 when using radeon or nouveau for a lengthy time
KRW> that ends up "corrupting" memory. The workaround is 'nopat' kernel arg.

My point is that my custom-built kernel can't be considered a trustworthy
proving ground, as it is of very low quality. The video issue is just the most
obvious example. Another indisputable example is how the Dom0 reboots: instead
of a simple CPU restart, the whole system goes into soft-off for several
seconds, then wakes up again. When I boot this kernel in bare-metal mode
(without the Xen VMM), none of this happens: the GUI is accelerated (at least
in 2D; I don't use an OpenGL desktop), the screen is not garbled at the login
and logout dialogs, and the system reboots quickly.

Anyway, I tried your workaround with "nopat". It didn't work: with 4 DomUs
running for a minute and then shut down in reverse order (4th, 3rd, 2nd, 1st),
the screen went black right after the 3rd VM had shut down completely and
before the 2nd VM was asked to shut down. Dom0 had not been running for any
"lengthy time", my video adapter is neither nVidia nor ATi but an integrated
Intel HD Graphics 2000 using the i915 driver, and I see no similarities to the
Red Hat bugs you mentioned.
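
For anyone who wants to repeat the test, the sequence above can be sketched as
a small script. This is only a sketch under assumptions: the DomU names are
hypothetical, it assumes the xl toolstack (with xm the command differs), and it
assumes that "xl shutdown" accepts -w to wait until the domain is really gone.

```python
import subprocess
import time


def reverse_shutdown_commands(domus):
    """Build one 'xl shutdown -w' command per DomU, last-started first."""
    return [["xl", "shutdown", "-w", name] for name in reversed(domus)]


def run_sequence(domus, settle_seconds=60,
                 runner=subprocess.check_call, sleep=time.sleep):
    """Let the DomUs run for a while, then shut them down in reverse order.

    'runner' and 'sleep' are injectable so the sequence can be dry-run
    without a Xen host.
    """
    sleep(settle_seconds)
    for argv in reverse_shutdown_commands(domus):
        runner(argv)  # with the default runner, a failing command raises


# Hypothetical usage on Dom0, with the DomUs already started in this order:
# run_sequence(["domu1", "domu2", "domu3", "domu4"])
```

Waiting for each shutdown to finish before requesting the next one matters
here, because the screen went black between two consecutive shutdowns.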

KRW> Can you include more details on your machine?

My guess is that it is not just my hardware that causes the GPF, but either
a bug in the netback module, or a compiler issue specific to the combination
of Xen (netback in particular) with the openSUSE build technology.

As an example of the latter, look again at Novell BZ #727081 mentioned
in the original post. Comment #30 says: "The compiler apparently makes use
of the 128-byte area called 'red zone' in the ABI, and this is incompatible
with xc_cpuid_x86.c:cpuid() using pushes and pops around the cpuid instruction".
The consequence is that, on some machines, libxenguest segfaults when you
try to start a DomU. With a Core i7-920 there is no problem, but with a
Core i5-2300 I hit that issue, and I wonder whether the same incompatibility
can occur in the netback module. I thought the traceback would give some hints
on where to debug.
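
To make the red-zone conflict concrete, here is a toy model of it. It is not
the libxenguest code and not a CPU emulator; the class and function names are
made up for illustration. It only shows why a hand-written push can overwrite
a local that the compiler parked below the stack pointer, and why reserving a
frame first (as a compiler does when the red zone is disabled) avoids that.

```python
class ToyStack:
    """Tiny model of a downward-growing stack with 8-byte slots."""

    def __init__(self, size=256):
        self.mem = bytearray(size)
        self.rsp = size  # stack pointer starts at the top and moves down

    def write(self, addr, value):
        self.mem[addr:addr + 8] = value.to_bytes(8, "little")

    def read(self, addr):
        return int.from_bytes(self.mem[addr:addr + 8], "little")

    def push(self, value):
        self.rsp -= 8
        self.write(self.rsp, value)

    def pop(self):
        value = self.read(self.rsp)
        self.rsp += 8
        return value


def local_survives(use_red_zone):
    """Return True if a compiler-placed local survives a push/pop pair."""
    stack = ToyStack()
    local_value = 0x1122334455667788
    if use_red_zone:
        # Leaf function: the local sits just below rsp, rsp is not moved.
        local_addr = stack.rsp - 8
    else:
        # No red zone: a frame is reserved first, the local sits above rsp.
        stack.rsp -= 16
        local_addr = stack.rsp + 8
    stack.write(local_addr, local_value)

    # Stand-in for hand-written asm: push a register, run cpuid, pop it back.
    stack.push(0xDEADBEEF)
    stack.pop()

    return stack.read(local_addr) == local_value
```

In this model local_survives(True) reports that the local was clobbered, while
local_survives(False) reports that it is intact, which matches the failure
mode described in that Bugzilla comment.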

My specs are:
MB: Asus P8H67-M (Intel H67 chipset)
CPU: Intel Core i5 model 2300 (Turbo mode disabled)
RAM: 12GB DDR3-1333 non-ECC (recently checked by MemTest86+ 4.20)
Video: Intel HD Graphics 2000 (integrated into CPU)
Network: dedicated soft-bridge for most DomUs,
 + bridged Realtek RTL8111E for gateway DomU (not with CARP)
