
Re: [MINI-OS PATCH 01/12] kexec: add kexec framework


  • To: Jason Andryuk <jandryuk@xxxxxxxxx>
  • From: Jürgen Groß <jgross@xxxxxxxx>
  • Date: Mon, 16 Jun 2025 07:40:31 +0200
  • Cc: minios-devel@xxxxxxxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, samuel.thibault@xxxxxxxxxxxx
  • Delivery-date: Mon, 16 Jun 2025 05:40:43 +0000
  • List-id: Mini-os development list <minios-devel.lists.xenproject.org>

Jason,

thanks for having a look at the series! I very much appreciate that!

On 14.06.25 18:40, Jason Andryuk wrote:
On Fri, Mar 21, 2025 at 5:25 AM Juergen Gross <jgross@xxxxxxxx> wrote:

Add a new config option CONFIG_KEXEC for support of kexec-ing into a
new mini-os kernel. Add a related kexec.c source and a kexec.h header.

For now allow CONFIG_KEXEC to be set only for PVH variant of mini-os.

Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
---


diff --git a/arch/x86/testbuild/all-yes b/arch/x86/testbuild/all-yes
index 8ae489a4..99ba75dd 100644
--- a/arch/x86/testbuild/all-yes
+++ b/arch/x86/testbuild/all-yes
@@ -19,3 +19,5 @@ CONFIG_BALLOON = y
  CONFIG_USE_XEN_CONSOLE = y
  # The following are special: they need support from outside
  CONFIG_LWIP = n
+# KEXEC only without PARAVIRT

Maybe: "KEXEC not implemented for PARAVIRT"?

Fine with me.


+CONFIG_KEXEC = n

diff --git a/kexec.c b/kexec.c
new file mode 100644
index 00000000..53528169
--- /dev/null
+++ b/kexec.c
@@ -0,0 +1,62 @@

+
+#include <errno.h>
+#include <mini-os/os.h>
+#include <mini-os/lib.h>
+#include <mini-os/kexec.h>
+
+/*
+ * General approach for kexec support (PVH only) is as follows:
+ *
+ * - New kernel needs to be in memory in form of a ELF file in a virtual

"in the form of an ELF binary"

+ *   memory region.

Maybe just "The new kernel needs to be an ELF binary loaded into the
Mini-OS address space"?

The "virtual memory region" is quite important, as it allows handling
conflicts with the target memory layout on a per-page basis.


+ * - A new start_info structure is constructed in memory with the final
+ *   memory locations included.
+ * - All memory areas needed for kexec execution are being finalized.
+ * - From here on a graceful failure is no longer possible.
+ * - Grants and event channels are torn down.
+ * - A temporary set of page tables is constructed at a location where it
+ *   doesn't conflict with old and new kernel or start_info.
+ * - The final kexec execution stage is copied to a memory area below 4G which
+ *   doesn't conflict with the target areas of kernel etc.
+ * - Cr3 is switched to the new set of page tables.
+ * - Execution continues in the final execution stage.
+ * - All data is copied to its final addresses.
+ * - Processing is switched to 32-bit mode without address translation.

Maybe "CPU is switched to 32-bit mode with paging disabled."?

Okay.


Is the following memory layout correct?

[ 0 ... 8MB ] ... [ X ... X + Y ] ... [ Z ...      ]
  Old stubdom        New stubdom         kexec code

With:
O: old stubdom kernel
P: active page tables
N: new stubdom kernel
Z: kexec code.

The guest physical memory layout is more like:
OPOOONP.NN.N.NNN..ZNN..PP..

The target layout of this example (before the final kexec stage) will be:
O.OOO....N.N.NNNP.ZNNP.PPNN

Note that all conflicting N and P entries have been moved to a position
behind the target position of the new kernel. This includes the page
tables in the old kernel which were pre-populated at boot time.

And before passing control to the new kernel it will be:
NNNNNNNNN.........Z........

kexec code copies New stubdom to 0 and later jumps to New stubdom @ 0

Kind of. The "0" is not hard-wired in the kexec code.

The temporary page tables are to allow old stubdom and kexec code to
be called while overwriting the "Old stubdom" range which would
include the page tables originally used?  Or it can only run the kexec
code once old stubdom is overwritten, right?

Yes.

I just realized that some of the comments are stale now. The current
implementation doesn't set up a new set of page tables, but tweaks the
existing ones to avoid conflicts.

I think some comments tweaks would be helpful, but code-wise
everything is okay, so:

Reviewed-by: Jason Andryuk <jason.andryuk@xxxxxxx>

Thanks,


Juergen
