[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 24/35] xen/console: introduce hwdom_crashconsole=



On Thu, Dec 05, 2024 at 08:41:54PM -0800, Denis Mukhin via B4 Relay wrote:
> From: Denis Mukhin <dmukhin@xxxxxxxx>
> 
> The new command line switch `hwdom_crashconsole=BOOL` allows to switch serial
> console input focus to xen for machine state inspection using keyhandler
> mechanism after the hardware domain crashes.
> 
> The new command line switch is aliased via `dom0=...,crashconsole` knob.
> 
> Such functionality can be useful while debugging dom0 bringup.
> 
> Signed-off-by: Denis Mukhin <dmukhin@xxxxxxxx>
> ---
>  docs/misc/xen-command-line.pandoc |  5 +++++
>  xen/arch/x86/dom0_build.c         |  2 ++
>  xen/common/domain.c               | 14 +++++++++++++-
>  xen/include/xen/domain.h          |  1 +
>  4 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/misc/xen-command-line.pandoc 
> b/docs/misc/xen-command-line.pandoc
> index 
> 293dbc1a957ba6e668fd4d55d58e84f643822126..fb77d7dca1ea517f79d6713aa6909422f31e7724
>  100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -806,6 +806,7 @@ Specify the bit width of the DMA heap.
>  
>  ### dom0
>      = List of [ pv | pvh, shadow=<bool>, verbose=<bool>,
> +                crashconsole=<bool>,
>                  cpuid-faulting=<bool>, msr-relaxed=<bool> ] (x86)
>  
>      = List of [ sve=<integer> ] (Arm64)
> @@ -839,6 +840,10 @@ Controls for how dom0 is constructed on x86 systems.
>      information during the dom0 build.  It defaults to the compile time 
> choice
>      of `CONFIG_VERBOSE_DEBUG`.
>  
> +*   The `crashconsole` boolean instructs Xen to drop into emergency console
> +    in case of dom0 crash. May be useful for dom0 bringup on a custom

I think the 'a' is unneeded -> "on custom hardware."

I think however this would be clearer as:

"The `crashconsole` boolean instructs Xen to switch input console
focus to the hypervisor when dom0 shuts down and avoid performing
dom0 domain destruction.  Should only be used for debugging
purposes."

It's IMO not clear what "instructs Xen to drop into emergency console"
implies for Xen.

> +    hardware.
> +
>  *   The `cpuid-faulting` boolean is an interim option, is only applicable to
>      PV dom0, and defaults to true.
>  
> diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
> index 
> e8f5bf5447bc47a6daa3d95787106a4c11e80d31..706aeec0ecbb565a415edbfb33ca2fd72967c560
>  100644
> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -286,6 +286,8 @@ int __init parse_arch_dom0_param(const char *s, const 
> char *e)
>          opt_dom0_cpuid_faulting = val;
>      else if ( (val = parse_boolean("msr-relaxed", s, e)) >= 0 )
>          opt_dom0_msr_relaxed = val;
> +    else if ( (val = parse_boolean("crashconsole", s, e)) >= 0 )
> +        opt_hwdom_crashconsole = !!val;
>      else
>          return -EINVAL;
>  
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 
> aab546c0a8535e4f007cbbc9c5c552bcf66b5807..4fe69f294158dda7b2e0b9d98d49c34e04131cb8
>  100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -56,6 +56,13 @@ unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX;
>  bool opt_dom0_vcpus_pin;
>  boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin);
>  
> +/*
> + * Hardware domain crash handler: if true, do not halt machine, but switch to
> + * Xen console for debugging.
> + */
> +bool opt_hwdom_crashconsole;

__ro_after_init.

> +boolean_param("hwdom_crashconsole", opt_hwdom_crashconsole);

This option doesn't seem to be documented at all in
xen-command-line.pandoc?

> +
>  /* Protect updates/reads (resp.) of domain_list and domain_hash. */
>  DEFINE_SPINLOCK(domlist_update_lock);
>  DEFINE_RCU_READ_LOCK(domlist_read_lock);
> @@ -1138,7 +1145,12 @@ int domain_shutdown(struct domain *d, u8 reason)
>      reason = d->shutdown_code;
>  
>      if ( is_hardware_domain(d) )
> -        hwdom_shutdown(reason);
> +    {
> +        if ( opt_hwdom_crashconsole )
> +            console_set_owner(DOMID_XEN);

Don't you need to pause all domain vCPUs and return early here to
avoid executing the rest of the function, that will likely destroy the
domain?

Maybe there's something I'm missing that prevents the hardware domain
destruction.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.