Re: [PATCH 19/30] panic: Add the panic hypervisor notifier list
- To: Petr Mladek <pmladek@xxxxxxxx>
- From: "Guilherme G. Piccoli" <gpiccoli@xxxxxxxxxx>
- Date: Wed, 18 May 2022 10:24:39 -0300
- Cc: Evan Green <evgreen@xxxxxxxxxxxx>, David Gow <davidgow@xxxxxxxxxx>, Julius Werner <jwerner@xxxxxxxxxxxx>, Scott Branden <scott.branden@xxxxxxxxxxxx>, bcm-kernel-feedback-list@xxxxxxxxxxxx, Sebastian Reichel <sre@xxxxxxxxxx>, Linux PM <linux-pm@xxxxxxxxxxxxxxx>, Florian Fainelli <f.fainelli@xxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, bhe@xxxxxxxxxx, kexec@xxxxxxxxxxxxxxxxxxx, LKML <linux-kernel@xxxxxxxxxxxxxxx>, linuxppc-dev@xxxxxxxxxxxxxxxx, linux-alpha@xxxxxxxxxxxxxxx, linux-arm Mailing List <linux-arm-kernel@xxxxxxxxxxxxxxxxxxx>, linux-edac@xxxxxxxxxxxxxxx, linux-hyperv@xxxxxxxxxxxxxxx, linux-leds@xxxxxxxxxxxxxxx, linux-mips@xxxxxxxxxxxxxxx, linux-parisc@xxxxxxxxxxxxxxx, linux-remoteproc@xxxxxxxxxxxxxxx, linux-s390@xxxxxxxxxxxxxxx, linux-tegra@xxxxxxxxxxxxxxx, linux-um@xxxxxxxxxxxxxxxxxxx, linux-xtensa@xxxxxxxxxxxxxxxx, netdev@xxxxxxxxxxxxxxx, openipmi-developer@xxxxxxxxxxxxxxxxxxxxx, rcu@xxxxxxxxxxxxxxx, sparclinux@xxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, x86@xxxxxxxxxx, kernel-dev@xxxxxxxxxx, kernel@xxxxxxxxxxxx, halves@xxxxxxxxxxxxx, fabiomirmar@xxxxxxxxx, alejandro.j.jimenez@xxxxxxxxxx, Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>, Arnd Bergmann <arnd@xxxxxxxx>, Borislav Petkov <bp@xxxxxxxxx>, Jonathan Corbet <corbet@xxxxxxx>, d.hatayama@xxxxxxxxxxxxxx, dave.hansen@xxxxxxxxxxxxxxx, dyoung@xxxxxxxxxx, feng.tang@xxxxxxxxx, Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>, mikelley@xxxxxxxxxxxxx, hidehiro.kawai.ez@xxxxxxxxxxx, jgross@xxxxxxxx, john.ogness@xxxxxxxxxxxxx, Kees Cook <keescook@xxxxxxxxxxxx>, luto@xxxxxxxxxx, mhiramat@xxxxxxxxxx, mingo@xxxxxxxxxx, paulmck@xxxxxxxxxx, peterz@xxxxxxxxxxxxx, rostedt@xxxxxxxxxxx, senozhatsky@xxxxxxxxxxxx, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, vgoyal@xxxxxxxxxx, vkuznets@xxxxxxxxxx, Will Deacon <will@xxxxxxxxxx>, Alexander Gordeev <agordeev@xxxxxxxxxxxxx>, Andrea Parri <parri.andrea@xxxxxxxxx>, Ard Biesheuvel <ardb@xxxxxxxxxx>, Benjamin Herrenschmidt 
<benh@xxxxxxxxxxxxxxxxxxx>, Brian Norris <computersforpeace@xxxxxxxxx>, Christian Borntraeger <borntraeger@xxxxxxxxxxxxx>, Christophe JAILLET <christophe.jaillet@xxxxxxxxxx>, "David S. Miller" <davem@xxxxxxxxxxxxx>, Dexuan Cui <decui@xxxxxxxxxxxxx>, Doug Berger <opendmb@xxxxxxxxx>, Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>, Hari Bathini <hbathini@xxxxxxxxxxxxx>, Heiko Carstens <hca@xxxxxxxxxxxxx>, Justin Chen <justinpopo6@xxxxxxxxx>, "K. Y. Srinivasan" <kys@xxxxxxxxxxxxx>, Lee Jones <lee.jones@xxxxxxxxxx>, Markus Mayer <mmayer@xxxxxxxxxxxx>, Michael Ellerman <mpe@xxxxxxxxxxxxxx>, Mihai Carabas <mihai.carabas@xxxxxxxxxx>, Nicholas Piggin <npiggin@xxxxxxxxx>, Paul Mackerras <paulus@xxxxxxxxx>, Pavel Machek <pavel@xxxxxx>, Shile Zhang <shile.zhang@xxxxxxxxxxxxxxxxx>, Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>, Sven Schnelle <svens@xxxxxxxxxxxxx>, Thomas Bogendoerfer <tsbogend@xxxxxxxxxxxxxxxx>, Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx>, Vasily Gorbik <gor@xxxxxxxxxxxxx>, Wang ShaoBo <bobo.shaobowang@xxxxxxxxxx>, Wei Liu <wei.liu@xxxxxxxxxx>, zhenwei pi <pizhenwei@xxxxxxxxxxxxx>, Stephen Boyd <swboyd@xxxxxxxxxxxx>
- Delivery-date: Wed, 18 May 2022 13:26:04 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
On 18/05/2022 04:33, Petr Mladek wrote:
> [...]
> Anyway, I would distinguish it the following way.
>
> + If the notifier is preserving kernel log then it should be ideally
> treated as kmsg_dump().
>
> + If the notifier is saving other debugging data then it better
> fits into the "hypervisor" notifier list.
>
>
Definitely, I agree - it's logical, since we want more info in the logs,
and it happens that some notifiers running in the informational list do
that, like ftrace_on_oops for example.
> Regarding the reliability. From my POV, any panic notifier enabled
> in a generic kernel should be reliable with more than 99,9%.
> Otherwise, they should not be in the notifier list at all.
>
> An exception would be a platform-specific notifier that is
> called only on some specific platform and developers maintaining
> this platform agree on this.
>
> The value "99,9%" is arbitrary. I am not sure if it is realistic
> even in the other code, for example, console_flush_on_panic()
> or emergency_restart(). I just want to point out that the border
> should be rather high. Otherwise we would be back in the situation
> where people would want to disable particular notifiers.
>
Totally agree, these percentages are just an example; 50% is a ridiculously
low reliability in my example heheh
But some notifiers dive deep into abstraction layers (like regmap or GPIO
stuff) and it's hard to determine the probability of a lock issue (for
example, trying to take a spinlock that is already held inside regmap code
and live-locking forever). It is better to run these, if possible, later
than kdump or even the info list.
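To make the lock concern concrete, here is a minimal userspace sketch (not
kernel code, and not taken from any patch in this series). All names in it
(layer_lock, layer_trylock, risky_notifier) are hypothetical. It only shows
the general idea: a callback that runs at panic time cannot wait for a lock
whose owner may never run again, so the most it can safely do is try once
and skip its work on failure.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lock owned by a lower layer (regmap, GPIO...). */
static atomic_flag layer_lock = ATOMIC_FLAG_INIT;

/* Try exactly once to take the lock; never spin. Returns true on success. */
static bool layer_trylock(void)
{
	return !atomic_flag_test_and_set_explicit(&layer_lock,
						  memory_order_acquire);
}

static void layer_unlock(void)
{
	atomic_flag_clear_explicit(&layer_lock, memory_order_release);
}

enum notifier_result { NOTIFIER_RAN, NOTIFIER_SKIPPED };

/*
 * Hypothetical panic-time callback. If the lock was already held when the
 * panic hit, its owner will never release it, so waiting would hang the
 * panic path. Bail out instead of spinning.
 */
static enum notifier_result risky_notifier(void)
{
	if (!layer_trylock())
		return NOTIFIER_SKIPPED;

	/* ... device access through the abstraction layer would go here ... */

	layer_unlock();
	return NOTIFIER_RAN;
}
```

Calling risky_notifier() with the lock free does the work; calling it after
layer_trylock() has already succeeded elsewhere (simulating a lock held at
panic time) returns NOTIFIER_SKIPPED instead of hanging. A notifier that
cannot be restructured this way is the kind that would be better placed
late in the ordering.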
Thanks again for the good analysis Petr!
Cheers,
Guilherme