[Xen-devel] xen-gntdev gets stuck unmapping a 2nd page
Hi,

I've been playing more with vchan and I think I've hit a bug in xen-gntdev. It works fine if I map/unmap a single page. If I increase the buffer size and cause vchan to map a second page (non-contiguous mappings via 2 distinct libxc calls), the process gets stuck in disk sleep when unmapping the second page. The same happens if I comment out that unmap and let the program exit.

I'm using a 3.13 kernel:

$ uname -a
Linux ubuntu 3.13.0-24-generic #46 SMP Thu Aug 28 23:05:20 BST 2014 x86_64 x86_64 x86_64 GNU/Linux

Although I'm testing vchan between two domUs, I believe the following program reproduces the problem with loopback grants within a single domU:

$ cat > test-gnt.c <<EOT
#include <xenctrl.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(int argc, char *argv[])
{
    int my_domid = atoi(argv[1]);
    uint32_t refs1, refs2;
    void *share1, *share2, *map1, *map2;
    int count = 1;

    xc_gntshr *xshr = xc_gntshr_open(NULL, 0);
    xc_gnttab *xtab = xc_gnttab_open(NULL, 0);
    if (!xshr || !xtab)
        goto fail;

    /* Share two separate pages with our own domain... */
    share1 = xc_gntshr_share_pages(xshr, my_domid, count, &refs1, 1);
    share2 = xc_gntshr_share_pages(xshr, my_domid, count, &refs2, 1);
    if (!share1 || !share2)
        goto fail;

    /* ...and map each ref back with a separate libxc call, as vchan does
     * when a buffer spans more than one page. */
    map1 = xc_gnttab_map_grant_ref(xtab, my_domid, refs1, PROT_READ);
    map2 = xc_gnttab_map_grant_ref(xtab, my_domid, refs2, PROT_READ);
    fprintf(stderr, "src=%p ref=%"PRIu32" dest=%p\n", share1, refs1, map1);
    fprintf(stderr, "src=%p ref=%"PRIu32" dest=%p\n", share2, refs2, map2);

    /* Unmapping the first of the two mappings works... */
    xc_gnttab_munmap(xtab, map2, count);
    fprintf(stderr, "Unmapped first page\n");
    fflush(stderr);

    /* ...but this call never completes: */
    xc_gnttab_munmap(xtab, map1, count);
    fprintf(stderr, "Unmapped second page\n");
    fflush(stderr);
    exit(0);

fail:
    perror(NULL);
    exit(1);
}
EOT
$ gcc -o test-gnt test-gnt.c -lxenctrl
$ sudo ./test-gnt $(sudo xenstore-read domid)
src=0x7f74080bd000 ref=54 dest=0x7f74080bb000
src=0x7f74080bc000 ref=85 dest=0x7f74080ba000
Unmapped first page
<now it's dead>
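For reference on where that second xc_gnttab_munmap() ends up in the kernel: as far as I can tell, libxc's Linux backend implements it by looking up the gntdev offset for the vaddr, munmap()ing the range, and then issuing the gntdev unmap ioctl. A rough, abridged sketch (my reading of xc_linux_osdep.c, so treat the details as approximate):

/* Abridged sketch of libxc's linux_gnttab_munmap(); details approximate. */
static int linux_gnttab_munmap(xc_gnttab *xcg, xc_osdep_handle h,
                               void *start_address, uint32_t count)
{
    int fd = (int)h;
    struct ioctl_gntdev_get_offset_for_vaddr get_offset;
    struct ioctl_gntdev_unmap_grant_ref unmap_grant;

    /* Recover the gntdev offset that was used to mmap() the pages. */
    get_offset.vaddr = (unsigned long)start_address;
    if (ioctl(fd, IOCTL_GNTDEV_GET_OFFSET_FOR_VADDR, &get_offset))
        return -1;

    /* Drop the userspace mapping. */
    if (munmap(start_address, count * getpagesize()))
        return -1;

    /* Ask gntdev to unmap the grants -- this is the ioctl that never
     * returns in the trace below (gntdev_ioctl -> gntdev_put_map ->
     * gntdev_free_map -> free_xenballooned_pages). */
    unmap_grant.index = get_offset.offset;
    unmap_grant.count = count;
    return ioctl(fd, IOCTL_GNTDEV_UNMAP_GRANT_REF, &unmap_grant);
}

So the hang is inside IOCTL_GNTDEV_UNMAP_GRANT_REF, which matches the test-gnt trace below.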
From the logs it looks like free_xenballooned_pages blocks forever:

[ 720.176089] INFO: task kworker/0:1:27 blocked for more than 120 seconds.
[ 720.176101] Not tainted 3.13.0-24-generic #46
[ 720.176105] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 720.176110] kworker/0:1 D 0000000000000000 0 27 2 0x00000000
[ 720.176118] Workqueue: events balloon_process
[ 720.176120] ffff88003db85b20 0000000000000202 ffff88003db597f0 ffff88003db85fd8
[ 720.176123] 0000000000014440 0000000000014440 ffff88003db597f0 ffff88003c6cfc98
[ 720.176125] ffff88003c6cfc9c ffff88003db597f0 00000000ffffffff ffff88003c6cfca0
[ 720.176127] Call Trace:
[ 720.176133] [<ffffffff8171a3a9>] schedule_preempt_disabled+0x29/0x70
[ 720.176136] [<ffffffff8171c215>] __mutex_lock_slowpath+0x135/0x1b0
[ 720.176139] [<ffffffff8171c2af>] mutex_lock+0x1f/0x2f
[ 720.176142] [<ffffffff8148e92d>] device_attach+0x1d/0xa0
[ 720.176145] [<ffffffff8148dda8>] bus_probe_device+0x98/0xc0
[ 720.176147] [<ffffffff8148bc05>] device_add+0x4c5/0x640
[ 720.176149] [<ffffffff8148bd9a>] device_register+0x1a/0x20
[ 720.176158] [<ffffffff814a2370>] init_memory_block+0xd0/0xf0
[ 720.176161] [<ffffffff814a24b1>] register_new_memory+0x91/0xa0
[ 720.176164] [<ffffffff81705de0>] __add_pages+0x140/0x240
[ 720.176167] [<ffffffff81055649>] arch_add_memory+0x59/0xd0
[ 720.176170] [<ffffffff817060b4>] add_memory+0xe4/0x1f0
[ 720.176172] [<ffffffff8142c2d2>] balloon_process+0x382/0x420
[ 720.176175] [<ffffffff810838a2>] process_one_work+0x182/0x450
[ 720.176178] [<ffffffff81084641>] worker_thread+0x121/0x410
[ 720.176180] [<ffffffff81084520>] ? rescuer_thread+0x3e0/0x3e0
[ 720.176183] [<ffffffff8108b312>] kthread+0xd2/0xf0
[ 720.176185] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
[ 720.176188] [<ffffffff8172637c>] ret_from_fork+0x7c/0xb0
[ 720.176190] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
[ 720.176197] INFO: task test-gnt:957 blocked for more than 120 seconds.
[ 720.176203] Not tainted 3.13.0-24-generic #46
[ 720.176206] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 720.176210] test-gnt D 0000000000000000 0 957 954 0x00000001
[ 720.176213] ffff8800053f3d70 0000000000000202 ffff88003a3847d0 ffff8800053f3fd8
[ 720.176215] 0000000000014440 0000000000014440 ffff88003a3847d0 ffffffff81c99740
[ 720.176217] ffffffff81c99744 ffff88003a3847d0 00000000ffffffff ffffffff81c99748
[ 720.176219] Call Trace:
[ 720.176222] [<ffffffff8171a3a9>] schedule_preempt_disabled+0x29/0x70
[ 720.176224] [<ffffffff8171c215>] __mutex_lock_slowpath+0x135/0x1b0
[ 720.176226] [<ffffffff8171c2af>] mutex_lock+0x1f/0x2f
[ 720.176229] [<ffffffff8142bb4e>] free_xenballooned_pages+0x1e/0x90
[ 720.176236] [<ffffffffa000c586>] gntdev_free_map+0x26/0x60 [xen_gntdev]
[ 720.176238] [<ffffffffa000c6f8>] gntdev_put_map+0xa8/0x100 [xen_gntdev]
[ 720.176241] [<ffffffffa000d2e2>] gntdev_ioctl+0x442/0x750 [xen_gntdev]
[ 720.176245] [<ffffffff811cc6e0>] do_vfs_ioctl+0x2e0/0x4c0
[ 720.176248] [<ffffffff81079e42>] ? ptrace_notify+0x82/0xc0
[ 720.176250] [<ffffffff811cc941>] SyS_ioctl+0x81/0xa0
[ 720.176252] [<ffffffff8172663f>] tracesys+0xe1/0xe6
[ 720.176254] INFO: task systemd-udevd:958 blocked for more than 120 seconds.
[ 720.176258] Not tainted 3.13.0-24-generic #46
[ 720.176261] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 720.176265] systemd-udevd D 0000000000000000 0 958 269 0x00000004
[ 720.176267] ffff88003bb0dd20 0000000000000202 ffff88003a382fe0 ffff88003bb0dfd8
[ 720.176269] 0000000000014440 0000000000014440 ffff88003a382fe0 ffffffff81c62040
[ 720.176271] ffffffff81c62044 ffff88003a382fe0 00000000ffffffff ffffffff81c62048
[ 720.176273] Call Trace:
[ 720.176276] [<ffffffff8171a3a9>] schedule_preempt_disabled+0x29/0x70
[ 720.176278] [<ffffffff8171c215>] __mutex_lock_slowpath+0x135/0x1b0
[ 720.176280] [<ffffffff8171c2af>] mutex_lock+0x1f/0x2f
[ 720.176282] [<ffffffff81706ab3>] online_pages+0x33/0x570
[ 720.176285] [<ffffffff814a2108>] memory_subsys_online+0x68/0xd0
[ 720.176287] [<ffffffff8148c555>] device_online+0x65/0x90
[ 720.176289] [<ffffffff814a1d94>] store_mem_state+0x64/0x160
[ 720.176291] [<ffffffff81489ab8>] dev_attr_store+0x18/0x30
[ 720.176295] [<ffffffff8122f418>] sysfs_write_file+0x128/0x1c0
[ 720.176297] [<ffffffff811b9534>] vfs_write+0xb4/0x1f0
[ 720.176300] [<ffffffff811b9f69>] SyS_write+0x49/0xa0
[ 720.176302] [<ffffffff8172663f>] tracesys+0xe1/0xe6
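If I'm reading 3.13's drivers/xen/balloon.c right, both sides of this are serialised on the same balloon_mutex: free_xenballooned_pages() takes it on entry, and balloon_process() holds it for the whole memory-hotplug path it is blocked in above. A heavily abridged sketch (from memory, so treat it as approximate rather than the actual source):

/* drivers/xen/balloon.c, abridged sketch (3.13-era; approximate) */
static void balloon_process(struct work_struct *work)
{
        mutex_lock(&balloon_mutex);
        /* ...held while the hotplug path calls add_memory(), i.e. across
         * the device_attach() the kworker is blocked in above... */
        mutex_unlock(&balloon_mutex);
}

void free_xenballooned_pages(int nr_pages, struct page **pages)
{
        int i;

        mutex_lock(&balloon_mutex);     /* <-- where test-gnt is stuck */

        for (i = 0; i < nr_pages; i++) {
                if (pages[i])
                        balloon_append(pages[i]);
        }

        /* The balloon may be too large now; shrink it if needed. */
        if (current_credit())
                schedule_delayed_work(&balloon_worker, 0);

        mutex_unlock(&balloon_mutex);
}

If that reading is correct, the gntdev unmap can't take balloon_mutex until the balloon worker gets out of add_memory(), which in turn appears to be waiting on the udev online_pages path above, so everything just parks in D state.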
Presumably the same problem would affect a userspace block backend using grant mapping, like qemu disk? Or maybe I'm just doing it wrong (always possible!) :-)

Cheers,
Dave

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel