
Re: [Xen-devel] [RFC 3/6] linux-stubdomain: Build a disk image.



On Wed, 2013-04-17 at 20:09 +0100, Anthony PERARD wrote:
> This patch build a disk image intend to be mounted as rootfs by the
> stub-domain. It is build using the 'debugfs' tool and make a ext2 fs.

You seem to have some vestigial code for building a cpio style initramfs
-- what was wrong with that approach? On the face of it that would seem
simpler and less "hacky" than the tricks you have to play with debugfs.

[...]
> diff --git a/stubdom-linux/extra/initscript b/stubdom-linux/extra/initscript
> new file mode 100644
> index 0000000..122892f
> --- /dev/null
> +++ b/stubdom-linux/extra/initscript
> @@ -0,0 +1,40 @@
> +#!/bin/busybox sh

This can become the more normal /bin/sh if you put the appropriate
symlink in the initrd?
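
A rough sketch of what I mean, assuming busybox ends up as bin/busybox in
the staging tree (the rootfs variable and the placeholder file below are
only there for illustration, they are not in your patch):

```shell
# Hypothetical staging directory standing in for $MNTIMAGE.
rootfs=$(mktemp -d)
mkdir -p "$rootfs/bin"
: > "$rootfs/bin/busybox"        # placeholder for the real busybox binary

# A relative link keeps working wherever the image ends up mounted.
ln -s busybox "$rootfs/bin/sh"
```

With that in place the scripts could start with a plain #!/bin/sh.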

> +
> +_initscript_panic() {
> +  sleep 10
> +}

Erm...

> +trap _initscript_panic 0
> +
> +set -e
> +set -x
> +mount -t sysfs /sys /sys
> +mount -t proc /proc /proc
> +mount -t xenfs -o nodev /proc/xen /proc/xen
> +
> +# TODO: Check if there is network for the vm before doing this
> +if test -e /sys/class/net/eth0; then
> +  ip link set eth0 address fe:ff:ff:ff:ff:fe
> +  ip addr flush eth0
> +  ip link set eth0 up
> +  brctl addbr br0
> +  brctl addif br0 eth0
> +  ip link set br0 up
> +else
> +  echo "No network interface named eth0."
> +  ls -l /sys/class/net/
> +fi
> +
> +# TODO Could probably to xenstore-read `xenstore-read vm`/image/dmargs
> +# because /local/domain/$domid is probably the root for relative path
> +domid=$(xenstore-read target)
> +dom_path="/local/domain/$domid"
> +vm_path=$(xenstore-read "$dom_path/vm")
> +dm_args=$(xenstore-read "$vm_path/image/dmargs")
> +
> +( sleep 30; free ) &
> +( sleep 60; free ) &
> +#( sleep 120; ip addr ) &
> +( sleep 120; free ) &

Erm....

> +free
> +/bin/qemu $dm_args
> diff --git a/stubdom-linux/extra/qemu-ifup b/stubdom-linux/extra/qemu-ifup
> new file mode 100644
> index 0000000..d71672b
> --- /dev/null
> +++ b/stubdom-linux/extra/qemu-ifup
> @@ -0,0 +1,7 @@
> +#! /bin/busybox sh
> +
> +ip link set "$1" down
> +ip link set "$1" address fe:ff:ff:ff:ff:fd
> +ip addr flush "$1"
> +brctl addif br0 "$1"
> +ip link set "$1" up

I don't think this will work for domains with multiple network devices.
e.g. if you want vifX.0 on xenbr0 and vifX.1 on xenbr1 in the dom0
backend, this will put both on the same bridge inside the stubdom, so
they will surface as a single device in dom0.
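
One way round it, as a sketch only: pick the bridge from the device index
instead of hard-coding br0. This assumes the interface name QEMU passes as
$1 ends in that index (tap0, tap1, ...), which I have not checked, and
bridge_for and qemu_ifup are names I made up. The initscript would also
need to create a matching brN for each ethN.

```shell
# bridge_for: hypothetical helper mapping an interface name to a bridge
# by its trailing digits, e.g. tap1 -> br1. Falls back to br0.
bridge_for() {
    idx=${1##*[!0-9]}
    echo "br${idx:-0}"
}

# Body of a per-device qemu-ifup, wrapped in a function for illustration.
qemu_ifup() {
    br=$(bridge_for "$1")
    ip link set "$1" down
    ip link set "$1" address fe:ff:ff:ff:ff:fd
    ip addr flush "$1"
    brctl addif "$br" "$1"
    ip link set "$1" up
}
```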

> diff --git a/stubdom-linux/mk-ramdisk-common b/stubdom-linux/mk-ramdisk-common
> new file mode 100755
> index 0000000..9a4a810
> --- /dev/null
> +++ b/stubdom-linux/mk-ramdisk-common
> @@ -0,0 +1,178 @@
> +#!/bin/bash
> +#
> +# This a simple implementaton of mkinitrd

                   implementation

> +
> +
> +# Set the umask. For iscsi, the initrd can contain platintext
> +# password (chap secret), so only allow read by owner.
> +umask 022
> +
> +TMPDIR="/tmp"
> +PROBE="yes"
> +MNTIMAGE="`pwd`/initramfs/"
> +IMAGE="./initramfs.cpio"
> +verbose=""
> +: ${debug:=false}
> +case $debug in
> +  true|false) ;;
> +  *)
> +    echo '$debug need to be true or false.'
> +    exit 1
> +    ;;
> +esac
> +$debug && verbose='-v'
> +
> +DSO_DEPS=""
> +LDSO=""
> +get_dso_deps() {
> +    bin="$1" ; shift
> +    DSO_DEPS=""
> +
> +    declare -a FILES
> +    declare -a NAMES
> +
> +    # this is a hack, but the only better way requires binutils or elfutils
> +    # be installed.  i.e., we need readelf to find the interpretter.

binutils will surely be installed while building Xen, won't it?

"interpreter"
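
Something like the following would do, as a sketch: readelf -l prints a
"[Requesting program interpreter: ...]" line for dynamically linked
binaries, and sed can pick the path out of it. parse_interp and get_interp
are hypothetical names, and LC_ALL=C is there on the assumption that the
message text may otherwise be translated.

```shell
# parse_interp: read `readelf -l` output on stdin and print the path of
# the requested program interpreter, if any.
parse_interp() {
    sed -n 's/^.*\[Requesting program interpreter: \(.*\)\]$/\1/p'
}

# get_interp: print the dynamic interpreter of the binary given as $1.
get_interp() {
    LC_ALL=C readelf -l "$1" | parse_interp
}
```

get_interp prints nothing for a statically linked binary, so the caller
can test for an empty result.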
> [...]

> +        case "$FILE" in
> +            /lib*)
> +                TLIBDIR=`echo "$FILE" | sed 's,\(/lib[^/]*\)/.*$,\1,'`
> +                BASE=`basename "$FILE"`
> +                # Prefer nosegneg libs over direct segment accesses on i686.
> +                if [ -f "$TLIBDIR/i686/nosegneg/$BASE" ]; then
> +                    FILE="$TLIBDIR/i686/nosegneg/$BASE"
> +                # Otherwise, prefer base libraries rather than their optimized
> +                # variants.

Do we not want the optimised variants where they are available, e.g. for
SDL or for the libraries behind VNC? Those are the sort of
instruction-hungry code I'd expect to benefit from the additional clever
instructions...
[...]

> +# this cp keep the link to ld-2.11.x.so
> +if test "`uname -m`" = x86_64; then
> +  cp --no-dereference "/lib/ld-linux-x86-64.so.2" "$MNTIMAGE/lib/ld-linux-x86-64.so.2"
> +else
> +  cp --no-dereference "/lib/ld-linux.so.2" "$MNTIMAGE/lib/ld-linux.so.2"
> +fi

Didn't you jump through some hoops earlier to find the actual dynamic
interpreter?

> +try_make_disk=true
> +if $try_make_disk; then
> +  stubdom_disk=stubdom-disk.img
> +  rm -f $stubdom_disk
> +  dd if=/dev/null of=$stubdom_disk bs=1M seek=40
> +  mkfs.ext2 -q -F -m0 $stubdom_disk
> +
> +  cd "$MNTIMAGE"
> +  stubdom_disk="../$stubdom_disk"
> +    new_link(){
> +      image=$1
> +      link=$2
> +      target=`readlink $link`
> +      dir=`dirname $link`
> +      dir=${dir#./}
> +      name_link=$(basename $link)
> +      dir_inode=$(debugfs -R "stat /$dir" $image 2>/dev/null |
> +        sed -nr 's/^Inode: ([[:digit:]]+)[[:space:]].*/\1/p')
> +      test "$dir_inode" || echo 'no dir inode found'
> +      test "$dir_inode"
> +      while true; do
> +        free_inode=$(debugfs -R "find_free_inode $dir_inode 0777" -w $image 2>/dev/null |
> +        sed -nr 's/^Free inode found: ([[:digit:]]+)$/\1/p')
> +        undel_output="$(debugfs -R "undel <$free_inode> /$dir/$name_link" -w $image 2>&1)"
> +        if grep -q "make_link: No free space in the directory" <<<"$undel_output"; then
> +          debugfs -R "expand_dir /$dir" -w $image 2>/dev/null
> +        else
> +          break
> +        fi
> +      done
> +      debugfs -f <(
> +        echo "cd /$dir"
> +        echo "set_inode_field $name_link mode 0120777"
> +        echo "set_inode_field $name_link size ${#target}"
> +        # TODO still need to write the link into blocks
> +        if test ${#target} -lt $((12*4)); then
> +          # write into direct block
> +          blockn=0
> +          while test "$target"; do
> +            t="${target:0:4}"
> +            target="${target:4}"
> +            #convert a four charactere string into hexa
> +            val=$(printf '0x%02x%02x%02x%02x\n' \'${t:3:1} \'${t:2:1} \'${t:1:1} \'${t:0:1})
> +            echo "set_inode_field $name_link block[$blockn] $val"
> +            blockn=$((blockn+1))
> +          done
> +        else
> +          # write into a block
> +          echo >&2 ".... write into block not implemented"
> +        fi
> +      ) -w $image >/dev/null 2>/dev/null
> +    }
> +  # TODO Should check for "copy_file: Could not allocate block in ext2 filesystem"
> +  debugfs -f <(find . \
> +    \( -type d \! -name . -printf 'cd /\nmkdir %h/%f\n' \) \
> +    -o \( -type f -printf 'cd /%h\nwrite %h/%f %f\n' \) \
> +    | sed -re 's%^((mkdir|cd) )./%\1/%' ) -w $stubdom_disk >/dev/null

This stuff all seems pretty exciting, but isn't it rather fragile
against differences in e2fsprogs versions etc?
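
For anyone else following along: the block[] trick in new_link() packs the
link target four bytes at a time, least significant byte first, into the
inode's block words. A minimal sketch of just that arithmetic (pack_word is
a hypothetical helper, not part of the patch, and it pads a short final
chunk with zero bytes explicitly):

```shell
# pack_word: turn up to four characters into the 32-bit hex word whose
# little-endian byte order spells those characters.
pack_word() {
    # od prints the bytes of the argument as two-digit hex values; word
    # splitting turns them into $1..$4 (fewer for a short chunk).
    set -- $(printf '%s' "$1" | od -An -tx1)
    printf '0x%s%s%s%s\n' "${4:-00}" "${3:-00}" "${2:-00}" "${1:-00}"
}
```

So "abcd" becomes 0x64636261, and a trailing "ab" becomes 0x00006261.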

I suppose you must have /usr/sbin and /sbin in your $PATH because none
of debugfs, mkfs.ext2 and fsck.ext2 appear in my $PATH...
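
If the script keeps using them, it could look in the sbin directories
itself instead of relying on the caller's $PATH. A sketch, with find_tool
being a made-up name:

```shell
# find_tool: print the full path of the tool named in $1, also searching
# /usr/sbin and /sbin; returns non-zero if it is not found anywhere.
find_tool() {
    ( PATH="$PATH:/usr/sbin:/sbin"; command -v "$1" )
}
```

e.g. debugfs=$(find_tool debugfs) || exit 1 near the top, so a missing
tool is reported before any image is half built.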

> +  find . -type l | while read line; do
> +    new_link $stubdom_disk "$line"
> +  done
> +  fsck.ext2 -fy $stubdom_disk || true
> +  cd - >/dev/null
> +else
> +  (cd "$MNTIMAGE"; findall . | cpio -H newc --quiet -o) >| "$IMAGE" || exit 1
> +  gzip -f "$IMAGE"

This seems like it should be a much simpler option...
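
Roughly what I have in mind, as a sketch under a few assumptions: I have
used plain find because I can't see what findall does in the quoted part,
make_initramfs is a hypothetical name, and the pipeline's exit status is
only that of gzip, so a real version would want to check cpio too.

```shell
# make_initramfs: pack the directory tree $1 into a gzip-compressed
# newc-format cpio archive written to $2.
make_initramfs() {
    src=$1
    out=$2
    ( cd "$src" && find . | cpio -H newc --quiet -o ) | gzip -9 > "$out"
}
```

Usage would be along the lines of
make_initramfs "$MNTIMAGE" initramfs.cpio.gz, with the result handed to
the stubdom as its ramdisk.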

> +fi
> --
> Anthony PERARD
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel


