
[XenPPC] Status of CoW on xenppc - part1 based on dm-userspace



Setting up a CoW device on xenppc with dm-userspace:
Thanks to Dan Smith for the quick support with everything I struggled with on the way to getting dm-userspace "working" so far! This is a howto, but because I was not able to get it fully up and running, it is really a description of how to reach the bug I describe at the end of the document. I welcome every comment, especially any help in interpreting the DSISR/DAR/... registers in the bug information more deeply.

For now I am switching to part 2, which tests a blktap-based CoW device.

Step-by-step: dm-userspace based CoW
- get the current dm-userspace from http://static.danplanet.com/hg/. There are two versions now; I used unstable, which sounded more stable than ring ;)
- merge the C files and headers into our linux-ppc tree and patch the Kconfig file (install.sh was broken at the time, but it would do the same)
- configure the kernel with xen_maple_defconfig + dm-userspace support
- build the patched and configured kernel
Seen compile issues (code not fully platform independent?), but only warnings:
/root/xen/linux/linux-ppc-2.6.hg_dmuserspace/drivers/md/dm-userspace-chardev.c: In function ‘do_kill_mapping’:
/root/xen/linux/linux-ppc-2.6.hg_dmuserspace/drivers/md/dm-userspace-chardev.c:309: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 2 has type ‘uint64_t’
CC [M] drivers/md/dm-userspace-cache.o
/root/xen/linux/linux-ppc-2.6.hg_dmuserspace/drivers/md/dm-userspace-cache.c: In function ‘dmu_remove_mapping’:
/root/xen/linux/linux-ppc-2.6.hg_dmuserspace/drivers/md/dm-userspace-cache.c:199: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 2 has type ‘uint64_t’
CC [M] drivers/md/dm-crypt.o
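
The warning presumably appears because uint64_t is an unsigned long on this 64-bit platform, so it does not match %llu. A minimal user-space sketch of the two usual portable fixes, an explicit cast or the PRIu64 macro from <inttypes.h>, follows. The helper names are made up for illustration and this is not the dm-userspace code; kernel code would typically use the cast, since PRIu64 is a C library facility.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a block number with an explicit cast,
 * so that %llu always receives an unsigned long long. */
int format_block_cast(char *buf, size_t len, uint64_t block)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%llu", (unsigned long long)block);
}

/* Hypothetical helper: same output via the PRIu64 format macro. */
int format_block_pri(char *buf, size_t len, uint64_t block)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIu64, block);
}
```

Both variants compile without the format warning regardless of whether uint64_t is unsigned long or unsigned long long.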

- build the current xen-unstable tip with the dm-userspace enabled zImage just created
- The README says to compile libdmu, but ...
building libdmu fails because our kernel is missing some patches. Neither current vanilla, nor xen-unstable/sparse/pristine, nor the patch set delivered with dm-userspace contains the missing function. The needed patches were on the xen-devel list in Aug 2006; where have they gone?
Error compiling libdmu:
root@c08b01-0[1]:~/dm-userspace.unstable/tools/libdmu# gcc -DPACKAGE_NAME=\"libdmu\" -DPACKAGE_TARNAME=\"libdmu\" -DPACKAGE_VERSION=\"0.4.0\" "-DPACKAGE_STRING=\"libdmu 0.4.0\"" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE=\"libdmu\" -DVERSION=\"0.4.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DSTDC_HEADERS=1 -DHAVE_FCNTL_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_NETINET_IN_H=1 -DHAVE_STDINT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_SYS_IOCTL_H=1 -DHAVE_UNISTD_H=1 -DHAVE_STRUCT_STAT_ST_RDEV=1 -DHAVE_UNISTD_H=1 -DHAVE_FORK=1 -DHAVE_VFORK=1 -DHAVE_WORKING_VFORK=1 -DHAVE_WORKING_FORK=1 -DHAVE_STDLIB_H=1 -DHAVE_MALLOC=1 -DRETSIGTYPE=void -DLSTAT_FOLLOWS_SLASHED_SYMLINK=1 -DHAVE_MEMSET=1 -DHAVE_STRTOL=1 -DHAVE_STRTOULL=1 -I. -I. -g -I/lib/modules/2.6.17_dmuserspace-Xen/build/include -Wall -MT libdmu_la-dmu.lo -MD -MP -MF .deps/libdmu_la-dmu.Tpo -c dmu.c -fPIC -DPIC -o .libs/libdmu_la-dmu.o
dmu.c: In function ‘dmu_ctl_queue_msg’:
dmu.c:209: warning: implicit declaration of function ‘dmu_get_msg_len’
[...]

- after discussing with the maintainer Dan Smith I did the following:
-> switched to dm-userspace.ring instead of unstable
-> ignored libdmu because it is deprecated
- compiling the kernel then fails with an error that Dan Smith had expected
-> switched back to changeset 613:d5d4d8eaa4a4 (last merge) using "hg revert --all -r d5d4d8eaa4a4"
- compile the kernel with the dm-userspace of this changeset (it still has the %llu vs. uint64_t warning, but worked so far)
- go to the tools/cowd dir and run make && make install
- create a dscow file for my base loop image with 64k blocks: "dscow_tool -b 64 -c SLES10_G1.dscow SLES10_G1.img"
the real on-disk usage of the (still unused) dscow file is small, 4113 in the first column below, compared to 2465165 for the image:
4113 -rw------- 1 root root 4296081408 2007-05-15 14:38 SLES10_G1.dscow
2465165 -rw-r--r-- 1 root root 4296015872 2007-05-04 10:16 SLES10_G1.img
Both are 4.1G sparse files by apparent size (the image was created with dd as a 4G sparse file, so its real block usage grows on demand).
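
The difference between apparent and allocated size can also be read programmatically. The following is a minimal sketch, assuming a POSIX system where st_blocks counts 512-byte units (as on Linux); the helper names are hypothetical and nothing here is part of dm-userspace.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create (or truncate) a file and extend it to `size` bytes without
 * writing data; on most Linux file systems this yields a sparse file. */
int create_sparse_file(const char *path, off_t size)
{
	int fd;
	int rc;

	assert(path != NULL);
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;
	rc = ftruncate(fd, size);
	close(fd);
	return rc;
}

/* Report the apparent size (st_size) and the allocated size
 * (st_blocks * 512). Returns 0 on success, -1 on error. */
int file_sizes(const char *path, long long *apparent, long long *allocated)
{
	struct stat st;

	assert(path != NULL && apparent != NULL && allocated != NULL);
	if (stat(path, &st) != 0)
		return -1;
	*apparent = (long long)st.st_size;
	*allocated = (long long)st.st_blocks * 512LL;
	return 0;
}
```

For a freshly created sparse file the allocated size is usually far below the apparent size and grows as blocks are written.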

creating a device-mapper dev node for the file with:
root@c08b01-0[1]:~/images# cowd -n -v -d -p dscow SLES10_G1 SLES10_G1.dscow
Daemon Configuration:
Plugin: dscow
Daemon: no
Init CoW: no
Verbose: yes
Block Size: 0 KB
Init device:yes
Adding plugin arg 0/2: SLES10_G1
Adding plugin arg 1/2: SLES10_G1.dscow
cowd[4945]: Starting
cowd[4945]: Loaded /usr/local/lib/libcowd_dscow.so
ioctl: LOOP_SET_FD: Device or resource busy
Device SLES10_G1: 0 blocks @ 65552 KB
Creating dm device: SLES10_G1 0 7:0 7:1
device-mapper: create ioctl failed: Device or resource busy
Failed to run device-mapper command!
Failed to create DM device
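
As a cross-check of the numbers: the image size from the listing, 4296015872 bytes, divided into 64 KB blocks gives exactly 65552, the same number cowd prints in its "Device" line, and the dscow file is exactly one such block larger than the image. Whether cowd prints its two fields in the intended order there is not something this output establishes. A small sketch of the arithmetic, with a made-up helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Number of blocks of `block_kb` KB needed to cover `bytes`,
 * rounding up. */
uint64_t blocks_for_size(uint64_t bytes, uint64_t block_kb)
{
	uint64_t block_bytes = block_kb * 1024;

	assert(block_kb != 0);
	return (bytes + block_bytes - 1) / block_bytes;
}
```

With the values from the listing, blocks_for_size(4296015872, 64) evaluates to 65552.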

Try to get it up with cowmount:
root@c08b01-0[0]:~/images# cowmount SLES10_G1.dscow /mnt/SLES10_G1
ioctl: LOOP_SET_FD: Device or resource busy
device-mapper: create ioctl failed: Device or resource busy
Failed to run device-mapper command!
Failed to create DM device
Failed to start cowd

dmesg:
device-mapper: table: 254:0: userspace: unknown target type
device-mapper: ioctl: error adding target to table

Reason: dm-mod/dm-user are not autoloaded by cowd
-> always load them manually
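
Since the modules have to be loaded by hand (for example with modprobe), a wrapper could check for them before starting cowd. This is only a sketch: it assumes the modules show up in /proc/modules as dm_mod and dm_user (underscores instead of dashes, and the second name is a guess based on "dm-user" above), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `name` is the first field of any line in `text`
 * (text in /proc/modules format), else 0. */
int module_listed(const char *text, const char *name)
{
	size_t len;
	const char *line = text;

	assert(text != NULL && name != NULL);
	len = strlen(name);
	while (line != NULL && *line != '\0') {
		if (strncmp(line, name, len) == 0 && line[len] == ' ')
			return 1;
		line = strchr(line, '\n');
		if (line != NULL)
			line++;
	}
	return 0;
}

/* Read /proc/modules (at most 64 KB of it) and check for `name`.
 * Returns 1 if listed, 0 if not, -1 if the file cannot be read. */
int module_loaded(const char *name)
{
	static char buf[65536];
	size_t n;
	FILE *f = fopen("/proc/modules", "r");

	if (f == NULL)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return module_listed(buf, name);
}
```

A caller would refuse to start cowd unless module_loaded("dm_mod") and module_loaded("dm_user") both return 1.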

When the modules are loaded, mounting with cowmount as well as loading with
cowd and then mounting /dev/mapper/device both fail with the trace below.
I also get this in a Xen image with debug=y and nosmp. It is a Linux bug, so
that was to be expected, but it removes multi-CPU effects from the list of
candidate bug origins.

cpu 0x0: Vector: 300 (Data Access) at [c0000000188878b0]
pc: d0000000002cda40: .run_pages_job+0xd0/0x150 [dm_mod]
lr: d0000000002cd9bc: .run_pages_job+0x4c/0x150 [dm_mod]
sp: c000000018887b30
msr: 8000000000009032
dar: 0
dsisr: 40000000
current = 0xc000000003dff800
paca = 0xc0000000005e4100
pid = 4730, comm = kcopyd
enter ? for help
0:mon>

As far as I can read this dump, it is a branch to vector 0x300 on cpu 0.
0x300 is the "Data Storage interrupt", and DSISR/DAR should say something about the
reason.
DSISR[33]=1 means:
Set to 1 if MSR[DR]=1 and the translation for
an attempted access is not found in the primary
PTEG or in the secondary PTEG; otherwise
set to 0.
DAR is zero. I do not claim to have understood everything about this in the PowerISA
document, so all I assume here may be wrong. I hope someone reading this
can continue the interpretation from here.

The base of .run_pages_job is 0x9970, so the pc/lr addresses can be calculated from their offsets.
This is the disassembly of the part of .run_pages_job at 9a40 (pointed to by pc):
9a40: e9 6b 00 00 ld r11,0(r11)
9a44: 42 00 ff fc bdnz+ 9a40 <.run_pages_job+0xd0>
The lr target is also in .run_pages_job, at 99bc (pointed to by lr):
99bc: 60 00 00 00 nop
99c0: 81 3f 00 24 lwz r9,36(r31)
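
The address calculation is just base plus offset. In the trace, ".run_pages_job+0xd0/0x150" gives the offset (0xd0) and the function size (0x150). A tiny sketch with a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Translate "symbol+offset/size" from the trace into an address in the
 * objdump listing, given the symbol's address in that listing. */
uint64_t listing_address(uint64_t symbol_base, uint64_t offset,
			 uint64_t symbol_size)
{
	assert(offset < symbol_size);
	return symbol_base + offset;
}
```

With base 0x9970, the pc offset 0xd0 yields 0x9a40 and the lr offset 0x4c yields 0x99bc, matching the two excerpts.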

C-Code of this function (part of kcopyd.c):
static int run_pages_job(struct kcopyd_job *job)
{
	int r;

	job->nr_pages = dm_div_up(job->dests[0].count + job->offset,
				  PAGE_SIZE >> 9);
	r = kcopyd_get_pages(job->kc, job->nr_pages, &job->pages);
	if (!r) {
		/* this job is ready for io */
		push(&_io_jobs, job);
		return 0;
	}

	if (r == -ENOMEM)
		/* can't complete now */
		return 1;

	return r;
}
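
The nr_pages computation can be sketched in user space as follows. It assumes that dm_div_up(n, sz) is a round-up division and that PAGE_SIZE is 4096, so PAGE_SIZE >> 9 is 8 sectors per page; both assumptions fit the "addi r9,r9,7" and the shift by 3 visible near the top of the disassembly, but the macro and function names below are made up.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u	/* assumption: 4 KB pages */

/* Pages needed for `count` sectors starting `offset` sectors into the
 * first page, as a round-up division by sectors per page. */
uint32_t pages_for_sectors(uint64_t count, uint64_t offset)
{
	uint64_t per_page = PAGE_SIZE_ASSUMED >> 9;	/* 512-byte sectors */

	assert(per_page != 0);
	return (uint32_t)((count + offset + per_page - 1) / per_page);
}
```

Note that a count and offset of zero give zero pages under these assumptions.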

Full disassembly of this function. Its sheer size compared to the C code suggests
that something got inlined. The code around the faulting instruction is a loop based on ctr.
run_pages_job itself contains no loop (and nothing the compiler would sensibly turn into one), but the called kcopyd_get_pages contains such a loop and
may be a candidate. Also, kcopyd_get_pages does not appear as a separate function in the
disassembly, which lets me assume that it got completely inlined here:
0000000000009970 <.run_pages_job>:
9970: 7c 08 02 a6 mflr r0
9974: fb 81 ff e0 std r28,-32(r1)
9978: fb a1 ff e8 std r29,-24(r1)
997c: fb c1 ff f0 std r30,-16(r1)
9980: fb e1 ff f8 std r31,-8(r1)
9984: eb c2 00 00 ld r30,0(r2)
9988: f8 01 00 10 std r0,16(r1)
998c: f8 21 ff 71 stdu r1,-144(r1)
9990: 7c 7c 1b 78 mr r28,r3
9994: 60 00 00 00 nop
9998: e9 23 00 60 ld r9,96(r3)
999c: e8 03 01 10 ld r0,272(r3)
99a0: eb e3 00 00 ld r31,0(r3)
99a4: 39 29 00 07 addi r9,r9,7
99a8: 38 7f 00 10 addi r3,r31,16
99ac: 7d 29 02 14 add r9,r9,r0
99b0: 79 3d e8 22 rldicl r29,r9,61,32
99b4: 93 bc 01 18 stw r29,280(r28)
99b8: 48 00 00 01 bl 99b8 <.run_pages_job+0x48>
99bc: 60 00 00 00 nop
99c0: 81 3f 00 24 lwz r9,36(r31)
99c4: 7f 9d 48 40 cmplw cr7,r29,r9
99c8: 40 9d 00 48 ble- cr7,9a10 <.run_pages_job+0xa0>
99cc: 7c 20 04 ac lwsync
99d0: 38 00 00 00 li r0,0
99d4: 38 21 00 90 addi r1,r1,144
99d8: 38 60 00 01 li r3,1
99dc: 90 1f 00 10 stw r0,16(r31)
99e0: 60 00 00 00 nop
99e4: 60 00 00 00 nop
99e8: 60 00 00 00 nop
99ec: e8 01 00 10 ld r0,16(r1)
99f0: eb 81 ff e0 ld r28,-32(r1)
99f4: eb a1 ff e8 ld r29,-24(r1)
99f8: eb c1 ff f0 ld r30,-16(r1)
99fc: eb e1 ff f8 ld r31,-8(r1)
9a00: 7c 08 03 a6 mtlr r0
9a04: 4e 80 00 20 blr
9a08: 60 00 00 00 nop
9a0c: 60 00 00 00 nop
9a10: 38 1d ff ff addi r0,r29,-1
9a14: e9 7f 00 18 ld r11,24(r31)
9a18: 7d 3d 48 50 subf r9,r29,r9
9a1c: 78 0a 00 20 clrldi r10,r0,32
9a20: 91 3f 00 24 stw r9,36(r31)
9a24: 2f aa 00 00 cmpdi cr7,r10,0
9a28: f9 7c 01 20 std r11,288(r28)
9a2c: 41 9e 00 1c beq- cr7,9a48 <.run_pages_job+0xd8>
9a30: 39 2a ff ff addi r9,r10,-1
9a34: 79 29 00 20 clrldi r9,r9,32
9a38: 39 29 00 01 addi r9,r9,1
9a3c: 7d 29 03 a6 mtctr r9
9a40: e9 6b 00 00 ld r11,0(r11)
9a44: 42 00 ff fc bdnz+ 9a40 <.run_pages_job+0xd0>
9a48: e9 2b 00 00 ld r9,0(r11)
9a4c: 38 00 00 00 li r0,0
9a50: f9 3f 00 18 std r9,24(r31)
9a54: f8 0b 00 00 std r0,0(r11)
9a58: 7c 20 04 ac lwsync
9a5c: 90 1f 00 10 stw r0,16(r31)
9a60: eb be 80 50 ld r29,-32688(r30)
9a64: 7f a3 eb 78 mr r3,r29
9a68: 48 00 00 01 bl 9a68 <.run_pages_job+0xf8>
9a6c: 60 00 00 00 nop
9a70: e9 3e 80 08 ld r9,-32760(r30)
9a74: 39 7c 00 08 addi r11,r28,8
9a78: 7c 64 1b 78 mr r4,r3
9a7c: 7f a3 eb 78 mr r3,r29
9a80: e9 49 00 08 ld r10,8(r9)
9a84: f9 3c 00 08 std r9,8(r28)
9a88: f9 69 00 08 std r11,8(r9)
9a8c: f9 6a 00 00 std r11,0(r10)
9a90: f9 4b 00 08 std r10,8(r11)
9a94: 48 00 00 01 bl 9a94 <.run_pages_job+0x124>
9a98: 60 00 00 00 nop
9a9c: 38 21 00 90 addi r1,r1,144
9aa0: 38 60 00 00 li r3,0
9aa4: e8 01 00 10 ld r0,16(r1)
9aa8: eb 81 ff e0 ld r28,-32(r1)
9aac: eb a1 ff e8 ld r29,-24(r1)
9ab0: eb c1 ff f0 ld r30,-16(r1)
9ab4: eb e1 ff f8 ld r31,-8(r1)
9ab8: 7c 08 03 a6 mtlr r0
9abc: 4e 80 00 20 blr
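
One possible reading of the loop at 9a40, not confirmed by anyone on the list: "ld r11,0(r11)" under bdnz looks like a linked-list walk (pl = pl->next) repeated ctr times, and ctr is derived from the 32-bit value nr_pages - 1 (addi r0,r29,-1 followed by clrldi). A fault at that load with DAR = 0 would then mean the walk reached a NULL pointer, for example because the list is shorter than expected, or because nr_pages was 0 and the count wrapped around. The user-space model below only illustrates that arithmetic; struct node, steps_for, link_nodes and walk are hypothetical names, and this is not the kcopyd source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct node {
	struct node *next;
};

/* Number of "pl = pl->next" steps for a request of nr pages when the
 * counter is 32 bits wide: nr - 1, which wraps to 0xffffffff for 0. */
uint32_t steps_for(uint32_t nr)
{
	return nr - 1u;
}

/* Link an array of n nodes into a NULL-terminated list. */
struct node *link_nodes(struct node *nodes, size_t n)
{
	size_t i;

	assert(nodes != NULL && n > 0);
	for (i = 0; i + 1 < n; i++)
		nodes[i].next = &nodes[i + 1];
	nodes[n - 1].next = NULL;
	return nodes;
}

/* Model of the ctr loop: follow ->next `steps` times, but stop and
 * report instead of loading through a NULL pointer. Returns 0 if the
 * walk completed, -1 if a step would have loaded from address 0. */
int walk(const struct node *pl, uint32_t steps)
{
	while (steps-- != 0) {
		if (pl == NULL)
			return -1;
		pl = pl->next;
	}
	return 0;
}
```

In this model a request for 3 pages on a 3-node list completes, while a request for 0 pages runs off the end of the list, which would match a load from address 0.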

--

Grüsse / regards, Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization
+49 7031/16-3385
Ehrhardt@xxxxxxxxxxxxxxxxxxx
Ehrhardt@xxxxxxxxxx

IBM Deutschland Entwicklung GmbH
Vorsitzender des Aufsichtsrats: Johann Weihen Geschäftsführung: Herbert Kircher Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ppc-devel


 

