[XenPPC] Status of CoW on xenppc - part1 based on dm-userspace
Setting up a CoW device on xenppc with dm-userspace:

Thanks to Dan Smith for the quick support with everything I struggled with on the way to getting dm-userspace "working" so far!

This is a howto, but because I was not able to get it fully up and running, it is really a description of how to reach the bug I describe at the end of the document. I welcome every comment, especially any help in interpreting the DSISR/DAR/... registers in the bug information more deeply. For now I am switching to part 2, testing a blktap based CoW device.

Step-by-step dm-userspace based CoW

- get the current dm-userspace from http://static.danplanet.com/hg/. There are two versions now; I used unstable, which sounded more stable than ring ;)
- merge the C files and headers with our linux-ppc tree and patch the Kconfig file (install.sh was broken at the time, but it would do the same)
- configure the kernel with xen_maple_defconfig + dm-userspace support
- build the patched and configured kernel
  I saw compile issues (code not fully platform independent?), but only warnings:

    /root/xen/linux/linux-ppc-2.6.hg_dmuserspace/drivers/md/dm-userspace-chardev.c: In function ‘do_kill_mapping’:
    /root/xen/linux/linux-ppc-2.6.hg_dmuserspace/drivers/md/dm-userspace-chardev.c:309: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 2 has type ‘uint64_t’
      CC [M]  drivers/md/dm-userspace-cache.o
    /root/xen/linux/linux-ppc-2.6.hg_dmuserspace/drivers/md/dm-userspace-cache.c: In function ‘dmu_remove_mapping’:
    /root/xen/linux/linux-ppc-2.6.hg_dmuserspace/drivers/md/dm-userspace-cache.c:199: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 2 has type ‘uint64_t’
      CC [M]  drivers/md/dm-crypt.o

- build the current xen_unstable tip with the dm-userspace capable zImage just created
- the README says to compile libdmu, but building libdmu fails because our kernel is missing some patches. Neither current vanilla, nor xen-unstable/sparse/pristine, nor the patch set delivered with dm-userspace contains the missing function. The needed patches were on the xen-devel list in Aug 2006 - where have they gone?

  Error compiling libdmu:

    root@c08b01-0[1]:~/dm-userspace.unstable/tools/libdmu# gcc -DPACKAGE_NAME=\"libdmu\" \
      -DPACKAGE_TARNAME=\"libdmu\" -DPACKAGE_VERSION=\"0.4.0\" \
      "-DPACKAGE_STRING=\"libdmu 0.4.0\"" -DPACKAGE_BUGREPORT=\"\" \
      -DPACKAGE=\"libdmu\" -DVERSION=\"0.4.0\" -DSTDC_HEADERS=1 \
      -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 \
      -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 \
      -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 \
      -DHAVE_DLFCN_H=1 -DSTDC_HEADERS=1 -DHAVE_FCNTL_H=1 \
      -DHAVE_INTTYPES_H=1 -DHAVE_NETINET_IN_H=1 -DHAVE_STDINT_H=1 \
      -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_SYS_IOCTL_H=1 \
      -DHAVE_UNISTD_H=1 -DHAVE_STRUCT_STAT_ST_RDEV=1 -DHAVE_UNISTD_H=1 \
      -DHAVE_FORK=1 -DHAVE_VFORK=1 -DHAVE_WORKING_VFORK=1 \
      -DHAVE_WORKING_FORK=1 -DHAVE_STDLIB_H=1 -DHAVE_MALLOC=1 \
      -DRETSIGTYPE=void -DLSTAT_FOLLOWS_SLASHED_SYMLINK=1 \
      -DHAVE_MEMSET=1 -DHAVE_STRTOL=1 -DHAVE_STRTOULL=1 -I. -I. -g \
      -I/lib/modules/2.6.17_dmuserspace-Xen/build/include -Wall \
      -MT libdmu_la-dmu.lo -MD -MP -MF .deps/libdmu_la-dmu.Tpo \
      -c dmu.c -fPIC -DPIC -o .libs/libdmu_la-dmu.o
    dmu.c: In function ‘dmu_ctl_queue_msg’:
    dmu.c:209: warning: implicit declaration of function ‘dmu_get_msg_len’
    [...]

- after discussing this with the maintainer Dan Smith I did the following:
  -> switched to dm-userspace.ring instead of unstable
  -> ignored libdmu because it is deprecated
- the kernel then fails to compile, with an error Dan Smith had expected
  -> switched back to changeset 613:d5d4d8eaa4a4 (last merge) using "hg revert --all -r d5d4d8eaa4a4"
- compile the kernel with dm-userspace from this changeset (it still has the %llu vs. uint64_t warning, but worked so far)
- go to the tools/cowd dir and run make && make install
- create a dscow file for my base loop image with 64k blocks:
  "dscow_tool -b 64 -c SLES10_G1.dscow SLES10_G1.img"
  The real file size is 4114 bytes for the (yet unused) dscow:

       4113 -rw------- 1 root root 4296081408 2007-05-15 14:38 SLES10_G1.dscow
    2465165 -rw-r--r-- 1 root root 4296015872 2007-05-04 10:16 SLES10_G1.img

  Both are 4.1G sparse files (the image was created with dd as a 4G sparse file, so that the real block usage grows on demand).

Creating a device-mapper dev node for the file with cowd:

    root@c08b01-0[1]:~/images# cowd -n -v -d -p dscow SLES10_G1 SLES10_G1.dscow
    Daemon Configuration:
    Plugin: dscow
    Daemon: no
    Init CoW: no
    Verbose: yes
    Block Size: 0 KB
    Init device:yes
    Adding plugin arg 0/2: SLES10_G1
    Adding plugin arg 1/2: SLES10_G1.dscow
    cowd[4945]: Starting
    cowd[4945]: Loaded /usr/local/lib/libcowd_dscow.so
    ioctl: LOOP_SET_FD: Device or resource busy
    Device SLES10_G1: 0 blocks @ 65552 KB
    Creating dm device: SLES10_G1 0 7:0 7:1
    device-mapper: create ioctl failed: Device or resource busy
    Failed to run device-mapper command!
    Failed to create DM device

Trying to get it up with cowmount:

    root@c08b01-0[0]:~/images# cowmount SLES10_G1.dscow /mnt/SLES10_G1
    ioctl: LOOP_SET_FD: Device or resource busy
    device-mapper: create ioctl failed: Device or resource busy
    Failed to run device-mapper command!
    Failed to create DM device
    Failed to start cowd

dmesg:

    device-mapper: table: 254:0: userspace: unknown target type
    device-mapper: ioctl: error adding target to table

Reason: cowd does not autoload dm-mod/dm-user -> always load them manually.

With the modules loaded, both ways fail (cowmount, as well as loading with cowd and then mounting /dev/mapper/device) with the crash shown next. I also get it in a Xen image built with debug=y and nosmp. It is a Linux bug, so that was expected, but it removes multi-CPU effects from the list of candidates for the origin of the bug.

    cpu 0x0: Vector: 300 (Data Access) at [c0000000188878b0]
        pc: d0000000002cda40: .run_pages_job+0xd0/0x150 [dm_mod]
        lr: d0000000002cd9bc: .run_pages_job+0x4c/0x150 [dm_mod]
        sp: c000000018887b30
       msr: 8000000000009032
       dar: 0
     dsisr: 40000000
      current = 0xc000000003dff800
      paca    = 0xc0000000005e4100
        pid   = 4730, comm = kcopyd
    enter ? for help
    0:mon>

As far as I can read this dump, it is a branch to vector 0x300 on cpu 0. 0x300 is the "Data Storage interrupt", and DSISR/DAR should say something about the reason. dsisr = 40000000 has DSISR[33] set, which means:

    Set to 1 if MSR[DR]=1 and the translation for an attempted access
    is not found in the primary PTEG or in the secondary PTEG;
    otherwise set to 0.

DAR is zero. I do not claim to have understood everything about this in the PowerISA document, so all I assume from here on may be wrong. I hope someone reading this can continue the interpretation.

The base of .run_pages_job is 0x9970, so pc (+0xd0) is 0x9a40 and lr (+0x4c) is 0x99bc.

Disassembly of the part of .run_pages_job at 9a40 (pointed to by pc):

    9a40: e9 6b 00 00   ld    r11,0(r11)
    9a44: 42 00 ff fc   bdnz+ 9a40 <.run_pages_job+0xd0>

The lr target is also in .run_pages_job, at 99bc:

    99bc: 60 00 00 00   nop
    99c0: 81 3f 00 24   lwz   r9,36(r31)

C code of this function (part of kcopyd.c):

    static int run_pages_job(struct kcopyd_job *job)
    {
            int r;

            job->nr_pages = dm_div_up(job->dests[0].count + job->offset,
                                      PAGE_SIZE >> 9);
            r = kcopyd_get_pages(job->kc, job->nr_pages, &job->pages);
            if (!r) {
                    /* this job is ready for io */
                    push(&_io_jobs, job);
                    return 0;
            }

            if (r == -ENOMEM)
                    /* can't complete now */
                    return 1;

            return r;
    }

Compared to the C code, the sheer size of the full disassembly of this function suggests that something got inlined. The code around the faulting instruction is a loop based on ctr. run_pages_job itself contains no loop (and nothing the compiler could sensibly turn into one), but the called kcopyd_get_pages contains such a loop and may be a candidate.
Also, kcopyd_get_pages does not show up as a function of its own in the disassembly, which lets me assume that it got completely inlined here:

    0000000000009970 <.run_pages_job>:
    9970: 7c 08 02 a6   mflr   r0
    9974: fb 81 ff e0   std    r28,-32(r1)
    9978: fb a1 ff e8   std    r29,-24(r1)
    997c: fb c1 ff f0   std    r30,-16(r1)
    9980: fb e1 ff f8   std    r31,-8(r1)
    9984: eb c2 00 00   ld     r30,0(r2)
    9988: f8 01 00 10   std    r0,16(r1)
    998c: f8 21 ff 71   stdu   r1,-144(r1)
    9990: 7c 7c 1b 78   mr     r28,r3
    9994: 60 00 00 00   nop
    9998: e9 23 00 60   ld     r9,96(r3)
    999c: e8 03 01 10   ld     r0,272(r3)
    99a0: eb e3 00 00   ld     r31,0(r3)
    99a4: 39 29 00 07   addi   r9,r9,7
    99a8: 38 7f 00 10   addi   r3,r31,16
    99ac: 7d 29 02 14   add    r9,r9,r0
    99b0: 79 3d e8 22   rldicl r29,r9,61,32
    99b4: 93 bc 01 18   stw    r29,280(r28)
    99b8: 48 00 00 01   bl     99b8 <.run_pages_job+0x48>
    99bc: 60 00 00 00   nop
    99c0: 81 3f 00 24   lwz    r9,36(r31)
    99c4: 7f 9d 48 40   cmplw  cr7,r29,r9
    99c8: 40 9d 00 48   ble-   cr7,9a10 <.run_pages_job+0xa0>
    99cc: 7c 20 04 ac   lwsync
    99d0: 38 00 00 00   li     r0,0
    99d4: 38 21 00 90   addi   r1,r1,144
    99d8: 38 60 00 01   li     r3,1
    99dc: 90 1f 00 10   stw    r0,16(r31)
    99e0: 60 00 00 00   nop
    99e4: 60 00 00 00   nop
    99e8: 60 00 00 00   nop
    99ec: e8 01 00 10   ld     r0,16(r1)
    99f0: eb 81 ff e0   ld     r28,-32(r1)
    99f4: eb a1 ff e8   ld     r29,-24(r1)
    99f8: eb c1 ff f0   ld     r30,-16(r1)
    99fc: eb e1 ff f8   ld     r31,-8(r1)
    9a00: 7c 08 03 a6   mtlr   r0
    9a04: 4e 80 00 20   blr
    9a08: 60 00 00 00   nop
    9a0c: 60 00 00 00   nop
    9a10: 38 1d ff ff   addi   r0,r29,-1
    9a14: e9 7f 00 18   ld     r11,24(r31)
    9a18: 7d 3d 48 50   subf   r9,r29,r9
    9a1c: 78 0a 00 20   clrldi r10,r0,32
    9a20: 91 3f 00 24   stw    r9,36(r31)
    9a24: 2f aa 00 00   cmpdi  cr7,r10,0
    9a28: f9 7c 01 20   std    r11,288(r28)
    9a2c: 41 9e 00 1c   beq-   cr7,9a48 <.run_pages_job+0xd8>
    9a30: 39 2a ff ff   addi   r9,r10,-1
    9a34: 79 29 00 20   clrldi r9,r9,32
    9a38: 39 29 00 01   addi   r9,r9,1
    9a3c: 7d 29 03 a6   mtctr  r9
    9a40: e9 6b 00 00   ld     r11,0(r11)
    9a44: 42 00 ff fc   bdnz+  9a40 <.run_pages_job+0xd0>
    9a48: e9 2b 00 00   ld     r9,0(r11)
    9a4c: 38 00 00 00   li     r0,0
    9a50: f9 3f 00 18   std    r9,24(r31)
    9a54: f8 0b 00 00   std    r0,0(r11)
    9a58: 7c 20 04 ac   lwsync
    9a5c: 90 1f 00 10   stw    r0,16(r31)
    9a60: eb be 80 50   ld     r29,-32688(r30)
    9a64: 7f a3 eb 78   mr     r3,r29
    9a68: 48 00 00 01   bl     9a68 <.run_pages_job+0xf8>
    9a6c: 60 00 00 00   nop
    9a70: e9 3e 80 08   ld     r9,-32760(r30)
    9a74: 39 7c 00 08   addi   r11,r28,8
    9a78: 7c 64 1b 78   mr     r4,r3
    9a7c: 7f a3 eb 78   mr     r3,r29
    9a80: e9 49 00 08   ld     r10,8(r9)
    9a84: f9 3c 00 08   std    r9,8(r28)
    9a88: f9 69 00 08   std    r11,8(r9)
    9a8c: f9 6a 00 00   std    r11,0(r10)
    9a90: f9 4b 00 08   std    r10,8(r11)
    9a94: 48 00 00 01   bl     9a94 <.run_pages_job+0x124>
    9a98: 60 00 00 00   nop
    9a9c: 38 21 00 90   addi   r1,r1,144
    9aa0: 38 60 00 00   li     r3,0
    9aa4: e8 01 00 10   ld     r0,16(r1)
    9aa8: eb 81 ff e0   ld     r28,-32(r1)
    9aac: eb a1 ff e8   ld     r29,-24(r1)
    9ab0: eb c1 ff f0   ld     r30,-16(r1)
    9ab4: eb e1 ff f8   ld     r31,-8(r1)
    9ab8: 7c 08 03 a6   mtlr   r0
    9abc: 4e 80 00 20   blr

-- 
Regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
+49 7031/16-3385
Ehrhardt@xxxxxxxxxxxxxxxxxxx
Ehrhardt@xxxxxxxxxx

IBM Deutschland Entwicklung GmbH
Chairman of the Supervisory Board: Johann Weihen
Management: Herbert Kircher
Registered office: Böblingen
Court of registration: Amtsgericht Stuttgart, HRB 243294

_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ppc-devel