[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 24/35] xen/console: introduce hwdom_crashconsole=
On Thursday, December 12th, 2024 at 4:29 AM, Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote: > > > On Thu, Dec 05, 2024 at 08:41:54PM -0800, Denis Mukhin via B4 Relay wrote: > > > From: Denis Mukhin dmukhin@xxxxxxxx > > > > The new command line switch `hwdom_crashconsole=BOOL` allows to switch > > serial > > console input focus to xen for machine state inspection using keyhandler > > mechanism after the hardware domain crashes. > > > > The new command line switch is aliased via `dom0=...,crashconsole` knob. > > > > Such functionality can be useful while debugging dom0 bringup. > > > > Signed-off-by: Denis Mukhin dmukhin@xxxxxxxx > > --- > > docs/misc/xen-command-line.pandoc | 5 +++++ > > xen/arch/x86/dom0_build.c | 2 ++ > > xen/common/domain.c | 14 +++++++++++++- > > xen/include/xen/domain.h | 1 + > > 4 files changed, 21 insertions(+), 1 deletion(-) > > > > diff --git a/docs/misc/xen-command-line.pandoc > > b/docs/misc/xen-command-line.pandoc > > index > > 293dbc1a957ba6e668fd4d55d58e84f643822126..fb77d7dca1ea517f79d6713aa6909422f31e7724 > > 100644 > > --- a/docs/misc/xen-command-line.pandoc > > +++ b/docs/misc/xen-command-line.pandoc > > @@ -806,6 +806,7 @@ Specify the bit width of the DMA heap. > > > > ### dom0 > > = List of [ pv | pvh, shadow=<bool>, verbose=<bool>, > > + crashconsole=<bool>, > > cpuid-faulting=<bool>, msr-relaxed=<bool> ] (x86) > > > > = List of [ sve=<integer> ] (Arm64) > > @@ -839,6 +840,10 @@ Controls for how dom0 is constructed on x86 systems. > > information during the dom0 build. It defaults to the compile time choice > > of `CONFIG_VERBOSE_DEBUG`. > > > > +* The `crashconsole` boolean instructs Xen to drop into emergency console > > + in case of dom0 crash. May be useful for dom0 bringup on a custom > > > I think the 'a' is unneeded -> "on custom hardware." Yep. Thank you. > > > I think however this would be clearer as: > > "The `crashconsole` boolean instructs Xen to switch input console > focus to the hypervisor when dom0 shuts down and avoid performing > dom0 domain destruction. Should only be used for debugging > purposes." Fixed. > > It's IMO not clear what "instructs Xen to drop into emergency console" > implies for Xen. > > > + hardware. > > + > > * The `cpuid-faulting` boolean is an interim option, is only applicable to > > PV dom0, and defaults to true. > > > > diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c > > index > > e8f5bf5447bc47a6daa3d95787106a4c11e80d31..706aeec0ecbb565a415edbfb33ca2fd72967c560 > > 100644 > > --- a/xen/arch/x86/dom0_build.c > > +++ b/xen/arch/x86/dom0_build.c > > @@ -286,6 +286,8 @@ int __init parse_arch_dom0_param(const char *s, const > > char *e) > > opt_dom0_cpuid_faulting = val; > > else if ( (val = parse_boolean("msr-relaxed", s, e)) >= 0 ) > > opt_dom0_msr_relaxed = val; > > + else if ( (val = parse_boolean("crashconsole", s, e)) >= 0 ) > > + opt_hwdom_crashconsole = !!val; > > else > > return -EINVAL; > > > > diff --git a/xen/common/domain.c b/xen/common/domain.c > > index > > aab546c0a8535e4f007cbbc9c5c552bcf66b5807..4fe69f294158dda7b2e0b9d98d49c34e04131cb8 > > 100644 > > --- a/xen/common/domain.c > > +++ b/xen/common/domain.c > > @@ -56,6 +56,13 @@ unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX; > > bool opt_dom0_vcpus_pin; > > boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin); > > > > +/* > > + * Hardware domain crash handler: if true, do not halt machine, but switch > > to > > + * Xen console for debugging. > > + */ > > +bool opt_hwdom_crashconsole; > > > __ro_after_init. Done. > > > +boolean_param("hwdom_crashconsole", opt_hwdom_crashconsole); > > > This option doesn't seem to be documented at all in > xen-command-line.pandoc? Thanks; fixed. > > > + > > /* Protect updates/reads (resp.) of domain_list and domain_hash. */ > > DEFINE_SPINLOCK(domlist_update_lock); > > DEFINE_RCU_READ_LOCK(domlist_read_lock); > > @@ -1138,7 +1145,12 @@ int domain_shutdown(struct domain *d, u8 reason) > > reason = d->shutdown_code; > > > > if ( is_hardware_domain(d) ) > > - hwdom_shutdown(reason); > > + { > > + if ( opt_hwdom_crashconsole ) > > + console_set_owner(DOMID_XEN); > > > Don't you need to pause all domain vCPUs and return early here to > avoid executing the rest of the function, that will likely destroy the > domain? > > Maybe there's something I'm missing that prevents the hardware domain > destruction. The point of this change is to drop the user to Xen console on crash instead of unconditional system restart. TBH, I did not plan for "freezing" the domain state. I also tested w/ non-hyperlaunch PV dom0 only (I used kexec to start PV dom0 which was crashing). My understanding, if I followed the code correctly, all is fine. domain_destroy() is not called because asm_domain_crash_synchronous(), which handles the crash, just spins around do_softirq(): void asm_domain_crash_synchronous(unsigned long addr) { ... __domain_crash(current->domain); // that will call domain_crash() for ( ; ; ) do_softirq(); } > > Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |