
Re: [Xen-devel] Dom0 crash with old style AMD NUMA detection



On 09/18/2012 03:44 PM, Konrad Rzeszutek Wilk wrote:
On Tue, Sep 18, 2012 at 11:57:33AM +0200, Andre Przywara wrote:
On 09/17/2012 09:14 PM, Konrad Rzeszutek Wilk wrote:
On Mon, Sep 17, 2012 at 09:29:22AM +0200, Andre Przywara wrote:
On 09/14/2012 08:58 PM, Konrad Rzeszutek Wilk wrote:
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
(XEN) Domain 0 crashed: 'noreboot' set - not rebooting.



The obvious solution would be to explicitly deny northbridge scanning
when running as Dom0, though I am not sure how to implement this without
upsetting the other kernel folks about "that crappy Xen thing" again ;-)

Heh.
Is there a numa=0 option that could be used to override it to turn it
off?

Not compile tested.. but was thinking something like this:

ping?

That looks good to me - at least for the time being.

OK, can I have your Tested-by/Acked-by on it pls?

I just want to check how this interacts with upcoming Dom0 NUMA
support. It wouldn't be too clever if we deliberately disable NUMA

We can always revert this patch in future versions of Linux.

I don't like this idea. Then we have Linux kernel up to 3.5 working
and say from 3.8 on again, but 3.6 and 3.7 cannot use NUMA. That
would be pretty unfortunate.

Huh? v3.5 working? But it never worked? I would say turn off the NUMA
detection (keep in mind it will still set up the dummy NUMA stuff)
until there is some PV NUMA capability and then we can revert it.

I was under the impression that somehow the Dom0 NUMA would be made compatible, using some of the existing discovery mechanisms. So we would enable the hypervisor, and Dom0 would just magically start working. I am probably rooted too much in the HVM world ;-)


I haven't checked back with Dario, but I'd suspect that we use ACPI
for injecting NUMA topology into Dom0. Even if not, a general
"numa=off" for Dom0 is too much of a sledgehammer for me.

How would you inject it in Dom0? It's a PV guest so the hypervisor would
have to tweak the SRAT/SLIT tables. That is not going to happen
in the very short term.. And I don't recall seeing any patches, so
the dom0 NUMA support is right now non-existent?

Right, I just didn't want to slam the door deliberately. Thinking more about this, we probably need some kind of PV enablement in Dom0, even if we could somehow use the ACPI tables (and thus the ACPI parsing code). If this is the case, we could at the same time remove this "force numa off" patch.

I am almost convinced by now.
Just waiting for Dario's opinion for a few more hours and will send my final opinion later today. If you cannot wait, tell me.


Andre.
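
The position the thread converges on can be sketched as a small decision function. This is only an illustrative model in plain C, not the patch under discussion and not kernel code: `struct boot_env`, `enum numa_source`, `choose_numa_source()` and the `pv_numa_capable` flag are all hypothetical names made up for this sketch. It assumes that a PV Dom0 must not trust firmware or northbridge topology data, and that a PV NUMA interface, if one is ever added, would replace the forced-off behaviour.

```c
#include <stdbool.h>

/* Hypothetical model of the boot-time inputs discussed in the thread.
 * None of these names come from the Linux or Xen sources. */
struct boot_env {
    bool xen_pv_dom0;      /* booted as a Xen PV Dom0? */
    bool pv_numa_capable;  /* assumed future PV NUMA interface present? */
    bool numa_off_cmdline; /* user passed numa=off on the command line */
};

enum numa_source {
    NUMA_DUMMY,    /* single dummy node covering all memory */
    NUMA_FIRMWARE, /* ACPI SRAT/SLIT or old-style AMD northbridge scan */
    NUMA_PV        /* assumed PV interface supplied by the hypervisor */
};

/* Pick where NUMA topology comes from. For a PV Dom0 the firmware and
 * northbridge data describe the host rather than the guest, so detection
 * is skipped unless a PV-aware mechanism exists. */
enum numa_source choose_numa_source(const struct boot_env *env)
{
    if (env->numa_off_cmdline)
        return NUMA_DUMMY;
    if (env->xen_pv_dom0)
        return env->pv_numa_capable ? NUMA_PV : NUMA_DUMMY;
    return NUMA_FIRMWARE;
}
```

With only `xen_pv_dom0` set, the function returns `NUMA_DUMMY`, which corresponds to the "force numa off" behaviour; setting `pv_numa_capable` as well models the point at which that workaround could be removed.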




 

