
Re: [Xen-devel] [RFC 3/6] linux-stubdomain: Build a disk image.



On 19/04/13 10:26, Ian Campbell wrote:
> On Wed, 2013-04-17 at 20:09 +0100, Anthony PERARD wrote:
>> This patch build a disk image intend to be mounted as rootfs by the
>> stub-domain. It is build using the 'debugfs' tool and make a ext2 fs.
>
> You seem to have some vestigial code for building a cpio style initramfs
> -- what was wrong with that approach? On the face of it that would seem
> simpler and less "hacky" than the tricks you have to play with debugfs.

The initramfs was taking more memory for the stub domain: I could not
go under 40MB. Now, with the disk, 34MB is the minimum that the domain
is willing to run with.

> [...]
>> diff --git a/stubdom-linux/extra/initscript b/stubdom-linux/extra/initscript
>> new file mode 100644
>> index 0000000..122892f
>> --- /dev/null
>> +++ b/stubdom-linux/extra/initscript
>> @@ -0,0 +1,40 @@
>> +#!/bin/busybox sh
>
> This can become the more normal /bin/sh if you put the appropriate
> symlink in the initrd?

If I recall correctly, I tried the symlink, and it was not starting.
Maybe a hardlink would work.

>> +
>> +_initscript_panic() {
>> +  sleep 10
>> +}
>
> Erm...

Yes, that is debug stuff which can be removed...

>> +trap _initscript_panic 0
>> +
>> +set -e
>> +set -x
>> +mount -t sysfs /sys /sys
>> +mount -t proc /proc /proc
>> +mount -t xenfs -o nodev /proc/xen /proc/xen
>> +
>> +# TODO: Check if there is network for the vm before doing this
>> +if test -e /sys/class/net/eth0; then
>> +  ip link set eth0 address fe:ff:ff:ff:ff:fe
>> +  ip addr flush eth0
>> +  ip link set eth0 up
>> +  brctl addbr br0
>> +  brctl addif br0 eth0
>> +  ip link set br0 up
>> +else
>> +  echo "No network interface named eth0."
>> +  ls -l /sys/class/net/
>> +fi
>> +
>> +# TODO Could probably to xenstore-read `xenstore-read vm`/image/dmargs
>> +# because /local/domain/$domid is probably the root for relative path
>> +domid=$(xenstore-read target)
>> +dom_path="/local/domain/$domid"
>> +vm_path=$(xenstore-read "$dom_path/vm")
>> +dm_args=$(xenstore-read "$vm_path/image/dmargs")
>> +
>> +( sleep 30; free ) &
>> +( sleep 60; free ) &
>> +#( sleep 120; ip addr ) &
>> +( sleep 120; free ) &
>
> Erm....

Same.

>> +free
>> +/bin/qemu $dm_args
>> diff --git a/stubdom-linux/extra/qemu-ifup b/stubdom-linux/extra/qemu-ifup
>> new file mode 100644
>> index 0000000..d71672b
>> --- /dev/null
>> +++ b/stubdom-linux/extra/qemu-ifup
>> @@ -0,0 +1,7 @@
>> +#! /bin/busybox sh
>> +
>> +ip link set "$1" down
>> +ip link set "$1" address fe:ff:ff:ff:ff:fd
>> +ip addr flush "$1"
>> +brctl addif br0 "$1"
>> +ip link set "$1" up
>
> I don't think this will work for domains with multiple network devices.
> e.g. if you want vifX.0 on xenbr0 and vifX.1 on xenbr0 in the dom0
> backend this will cause them both to get put on the same bridge inside
> the stubdom and therefore surface as a single device in dom0.

OK, I will look into that.
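
One possible direction, as a rough sketch only: give each tap device its own bridge inside the stubdom instead of the hard-coded br0. The helper name `bridge_for` and the tapN/brN naming are assumptions made for illustration; they are not part of the patch, and the matching ethN would still have to be added to each bridge by the initscript.

```shell
# Hypothetical helper: map a tap interface name to a per-device bridge name.
# Assumes (not confirmed by the patch) that QEMU names its taps tap0, tap1, ...
bridge_for() {
    case "$1" in
        tap[0-9]*) printf 'br%s\n' "${1#tap}" ;;
        *) return 1 ;;
    esac
}

# Possible use inside qemu-ifup (left commented out in this sketch):
#   br=$(bridge_for "$1") || exit 1
#   brctl addif "$br" "$1"
```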

>> diff --git a/stubdom-linux/mk-ramdisk-common b/stubdom-linux/mk-ramdisk-common
>> new file mode 100755
>> index 0000000..9a4a810
>> --- /dev/null
>> +++ b/stubdom-linux/mk-ramdisk-common
>> @@ -0,0 +1,178 @@
>> +#!/bin/bash
>> +#
>> +# This a simple implementaton of mkinitrd
>
>                    implementation
>
>> +
>> +
>> +# Set the umask. For iscsi, the initrd can contain platintext
>> +# password (chap secret), so only allow read by owner.
>> +umask 022
>> +
>> +TMPDIR="/tmp"
>> +PROBE="yes"
>> +MNTIMAGE="`pwd`/initramfs/"
>> +IMAGE="./initramfs.cpio"
>> +verbose=""
>> +: ${debug:=false}
>> +case $debug in
>> +  true|false) ;;
>> +  *)
>> +    echo '$debug need to be true or false.'
>> +    exit 1
>> +    ;;
>> +esac
>> +$debug && verbose='-v'
>> +
>> +DSO_DEPS=""
>> +LDSO=""
>> +get_dso_deps() {
>> +    bin="$1" ; shift
>> +    DSO_DEPS=""
>> +
>> +    declare -a FILES
>> +    declare -a NAMES
>> +
>> +    # this is a hack, but the only better way requires binutils or elfutils
>> +    # be installed.  i.e., we need readelf to find the interpretter.
>
> binutils will surely be installed while building Xen, won't it?

Yes, I will use those for the script.
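
A minimal sketch of what a readelf-based lookup could look like. The function names `parse_interp` and `get_interp` are hypothetical and not part of the patch; the parsing relies on the `[Requesting program interpreter: ...]` line that `readelf -l` prints for dynamically linked binaries.

```shell
# Hypothetical helper: extract the interpreter path from `readelf -l`
# output supplied on stdin. Prints nothing if no interpreter line is found.
parse_interp() {
    sed -n 's/.*\[Requesting program interpreter: \(.*\)\]/\1/p'
}

# Hypothetical wrapper: print the ELF interpreter of the binary given as $1.
# Requires readelf (binutils) to be installed.
get_interp() {
    readelf -l "$1" 2>/dev/null | parse_interp
}
```

Something like `LDSO=$(get_interp "$bin")` could then replace the current hack, with an empty result meaning the binary is static or not an ELF file.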

> "interpreter"
>> [...]
>
>> +        case "$FILE" in
>> +            /lib*)
>> +                TLIBDIR=`echo "$FILE" | sed 's,\(/lib[^/]*\)/.*$,\1,'`
>> +                BASE=`basename "$FILE"`
>> +                # Prefer nosegneg libs over direct segment accesses on i686.
>> +                if [ -f "$TLIBDIR/i686/nosegneg/$BASE" ]; then
>> +                    FILE="$TLIBDIR/i686/nosegneg/$BASE"
>> +                # Otherwise, prefer base libraries rather than their optimized
>> +                # variants.
>
> Do we not want optimised e.g. SDL libraries if available, or other
> libraries related to the provision of things like VNC which are the
> sorts of instruction hungry stuff that I'd expect to benefit from
> additional clever instructions...
> [...]
>
>> +# this cp keep the link to ld-2.11.x.so
>> +if test "`uname -m`" = x86_64; then
>> +  cp --no-dereference "/lib/ld-linux-x86-64.so.2" "$MNTIMAGE/lib/ld-linux-x86-64.so.2"
>> +else
>> +  cp --no-dereference "/lib/ld-linux.so.2" "$MNTIMAGE/lib/ld-linux.so.2"
>> +fi
>
> Didn't you jump through some hoops earlier to find the actual dynamic
> interpreter?

The script was not working for this lib.

>> +try_make_disk=true
>> +if $try_make_disk; then
>> +  stubdom_disk=stubdom-disk.img
>> +  rm -f $stubdom_disk
>> +  dd if=/dev/null of=$stubdom_disk bs=1M seek=40
>> +  mkfs.ext2 -q -F -m0 $stubdom_disk
>> +
>> +  cd "$MNTIMAGE"
>> +  stubdom_disk="../$stubdom_disk"
>> +    new_link(){
>> +      image=$1
>> +      link=$2
>> +      target=`readlink $link`
>> +      dir=`dirname $link`
>> +      dir=${dir#./}
>> +      name_link=$(basename $link)
>> +      dir_inode=$(debugfs -R "stat /$dir" $image 2>/dev/null |
>> +        sed -nr 's/^Inode: ([[:digit:]]+)[[:space:]].*/\1/p')
>> +      test "$dir_inode" || echo 'no dir inode found'
>> +      test "$dir_inode"
>> +      while true; do
>> +        free_inode=$(debugfs -R "find_free_inode $dir_inode 0777" -w $image 2>/dev/null |
>> +        sed -nr 's/^Free inode found: ([[:digit:]]+)$/\1/p')
>> +        undel_output="$(debugfs -R "undel <$free_inode> /$dir/$name_link" -w $image 2>&1)"
>> +        if grep -q "make_link: No free space in the directory" <<<"$undel_output"; then
>> +          debugfs -R "expand_dir /$dir" -w $image 2>/dev/null
>> +        else
>> +          break
>> +        fi
>> +      done
>> +      debugfs -f <(
>> +        echo "cd /$dir"
>> +        echo "set_inode_field $name_link mode 0120777"
>> +        echo "set_inode_field $name_link size ${#target}"
>> +        # TODO still need to write the link into blocks
>> +        if test ${#target} -lt $((12*4)); then
>> +          # write into direct block
>> +          blockn=0
>> +          while test "$target"; do
>> +            t="${target:0:4}"
>> +            target="${target:4}"
>> +            #convert a four charactere string into hexa
>> +            val=$(printf '0x%02x%02x%02x%02x\n' \'${t:3:1} \'${t:2:1} \'${t:1:1} \'${t:0:1})
>> +            echo "set_inode_field $name_link block[$blockn] $val"
>> +            blockn=$((blockn+1))
>> +          done
>> +        else
>> +          # write into a block
>> +          echo >&2 ".... write into block not implemented"
>> +        fi
>> +      ) -w $image >/dev/null 2>/dev/null
>> +    }
>> +  # TODO Should check for "copy_file: Could not allocate block in ext2 filesystem"
>> +  debugfs -f <(find . \
>> +    \( -type d \! -name . -printf 'cd /\nmkdir %h/%f\n' \) \
>> +    -o \( -type f -printf 'cd /%h\nwrite %h/%f %f\n' \) \
>> +    | sed -re 's%^((mkdir|cd) )./%\1/%' ) -w $stubdom_disk >/dev/null
>
> This stuff all seems pretty exciting, but isn't it rather fragile
> against differences in e2fstools versions etc?

Yes, but that was the only good-enough way to create a disk image as a
regular user (not root) with a utility present at least on Debian.
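
To make the symlink trick in new_link easier to follow: a short link target is stored directly in the inode's block[] fields, four characters per 32-bit little-endian word. The sketch below shows only that packing step. `word_le` is a hypothetical name, the code assumes ASCII targets, and it uses `cut` rather than bash substrings so it also runs under a plain POSIX sh.

```shell
# Hypothetical helper: pack up to four ASCII characters into one
# little-endian 32-bit word, printed as hex. Missing characters become 00.
word_le() {
    w=$1
    out=0x
    # Emit the 4th character first so the 1st ends up in the low byte.
    for i in 4 3 2 1; do
        c=$(printf '%s' "$w" | cut -c"$i")
        if [ -n "$c" ]; then
            # A leading quote makes printf use the character's numeric code.
            out=$out$(printf '%02x' "'$c")
        else
            out=${out}00
        fi
    done
    printf '%s\n' "$out"
}

word_le /bin    # prints 0x6e69622f
```

Each such word is what ends up in a `set_inode_field <name> block[N] <value>` line fed to debugfs.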

> I suppose you must have /usr/sbin and /sbin in your $PATH because none
> of debugfs, mkfs.ext2 and fsck.ext2 appear in my $PATH...

Yes, I don't remove {,/usr}/sbin from my PATH.
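
If the script should not depend on the caller's environment, it could extend PATH itself. A small sketch, assuming the tools live in /usr/sbin or /sbin as they do on Debian:

```shell
# Append the sbin directories to PATH unless they are already present,
# so that debugfs, mkfs.ext2 and fsck.ext2 can be found by a regular user.
for d in /usr/sbin /sbin; do
    case ":$PATH:" in
        *":$d:"*) ;;
        *) PATH="$PATH:$d" ;;
    esac
done
export PATH
```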

>> +  find . -type l | while read line; do
>> +    new_link $stubdom_disk "$line"
>> +  done
>> +  fsck.ext2 -fy $stubdom_disk || true
>> +  cd - >/dev/null
>> +else
>> +  (cd "$MNTIMAGE"; findall . | cpio -H newc --quiet -o) >| "$IMAGE" || exit 1
>> +  gzip -f "$IMAGE"
>
> This seems like it should be a much simpler option...

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

