[Xen-devel] xen-gntdev gets stuck unmapping a 2nd page
Hi,
I've been playing more with vchan and I think I've hit a bug in xen-gntdev.
It works OK if I map and unmap a single page. If I increase the buffer size
so that vchan maps a second page (non-contiguous mappings via two distinct
libxc calls), the process gets stuck in disk sleep when unmapping the second
page. The same happens if I comment out the unmap and let the program exit.
I'm using a 3.13 kernel:
$ uname -a
Linux ubuntu 3.13.0-24-generic #46 SMP Thu Aug 28 23:05:20 BST 2014 x86_64 x86_64 x86_64 GNU/Linux
Although I'm testing vchan between two domUs, I believe the following
reproduces it with loopback grants within a single domU:
$ cat > test-gnt.c <<EOT
#include <xenctrl.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
int main(int argc, char* argv[]){
    int my_domid;
    uint32_t refs1, refs2;
    void *share1, *share2, *map1, *map2;
    int count = 1;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <domid>\n", argv[0]);
        exit(1);
    }
    my_domid = atoi(argv[1]);

    xc_gntshr *xshr = xc_gntshr_open(NULL, 0);
    xc_gnttab *xtab = xc_gnttab_open(NULL, 0);
    if (!xshr || !xtab)
        goto fail;

    /* Share two separate single pages with ourselves (loopback grants). */
    share1 = xc_gntshr_share_pages(xshr, my_domid, count, &refs1, 1);
    share2 = xc_gntshr_share_pages(xshr, my_domid, count, &refs2, 1);
    if (!share1 || !share2)
        goto fail;

    /* Map each grant with its own libxc call. */
    map1 = xc_gnttab_map_grant_ref(xtab, my_domid, refs1, PROT_READ);
    map2 = xc_gnttab_map_grant_ref(xtab, my_domid, refs2, PROT_READ);
    fprintf(stderr, "src=%p ref=%"PRIu32" dest=%p\n", share1, refs1, map1);
    fprintf(stderr, "src=%p ref=%"PRIu32" dest=%p\n", share2, refs2, map2);

    xc_gnttab_munmap(xtab, map2, count);
    fprintf(stderr, "Unmapped first page\n"); fflush(stderr);
    /* This call never completes: */
    xc_gnttab_munmap(xtab, map1, count);
    fprintf(stderr, "Unmapped second page\n"); fflush(stderr);
    exit(0);

fail:
    perror(NULL);
    exit(1);
}
EOT
$ gcc -o test-gnt test-gnt.c -lxenctrl
$ sudo ./test-gnt $(sudo xenstore-read domid)
src=0x7f74080bd000 ref=54 dest=0x7f74080bb000
src=0x7f74080bc000 ref=85 dest=0x7f74080ba000
Unmapped first page
<now it's dead>
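To confirm that a hung process really is in uninterruptible ("disk") sleep, its state letter can be read from /proc. The following is only a minimal sketch: `proc_state` is a hypothetical helper name, not part of libxc or any Xen tool, and it assumes a Linux /proc filesystem is mounted.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper (not part of libxc): returns the one-letter state
 * of `pid` ('R', 'S', 'D', ...) as reported in /proc/<pid>/stat, or 0 if
 * the file cannot be read or parsed. */
char proc_state(pid_t pid)
{
    char path[64], buf[512];
    char state = 0;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return 0;
    if (fgets(buf, sizeof(buf), f)) {
        /* The command name is wrapped in parentheses and may itself
         * contain spaces or ')', so search from the last ')' in the line.
         * The state letter follows it after a single space. */
        char *p = strrchr(buf, ')');
        if (p && p[1] == ' ' && p[2] != '\0')
            state = p[2];
    }
    fclose(f);
    return state;
}
```

Called with the pid of the stuck test-gnt, this would be expected to return 'D'; `ps -o stat= -p <pid>` reports the same letter without any code.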
From the logs it looks like free_xenballooned_pages blocks forever:
[ 720.176089] INFO: task kworker/0:1:27 blocked for more than 120 seconds.
[ 720.176101] Not tainted 3.13.0-24-generic #46
[ 720.176105] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 720.176110] kworker/0:1 D 0000000000000000 0 27 2 0x00000000
[ 720.176118] Workqueue: events balloon_process
[ 720.176120] ffff88003db85b20 0000000000000202 ffff88003db597f0 ffff88003db85fd8
[ 720.176123] 0000000000014440 0000000000014440 ffff88003db597f0 ffff88003c6cfc98
[ 720.176125] ffff88003c6cfc9c ffff88003db597f0 00000000ffffffff ffff88003c6cfca0
[ 720.176127] Call Trace:
[ 720.176133] [<ffffffff8171a3a9>] schedule_preempt_disabled+0x29/0x70
[ 720.176136] [<ffffffff8171c215>] __mutex_lock_slowpath+0x135/0x1b0
[ 720.176139] [<ffffffff8171c2af>] mutex_lock+0x1f/0x2f
[ 720.176142] [<ffffffff8148e92d>] device_attach+0x1d/0xa0
[ 720.176145] [<ffffffff8148dda8>] bus_probe_device+0x98/0xc0
[ 720.176147] [<ffffffff8148bc05>] device_add+0x4c5/0x640
[ 720.176149] [<ffffffff8148bd9a>] device_register+0x1a/0x20
[ 720.176158] [<ffffffff814a2370>] init_memory_block+0xd0/0xf0
[ 720.176161] [<ffffffff814a24b1>] register_new_memory+0x91/0xa0
[ 720.176164] [<ffffffff81705de0>] __add_pages+0x140/0x240
[ 720.176167] [<ffffffff81055649>] arch_add_memory+0x59/0xd0
[ 720.176170] [<ffffffff817060b4>] add_memory+0xe4/0x1f0
[ 720.176172] [<ffffffff8142c2d2>] balloon_process+0x382/0x420
[ 720.176175] [<ffffffff810838a2>] process_one_work+0x182/0x450
[ 720.176178] [<ffffffff81084641>] worker_thread+0x121/0x410
[ 720.176180] [<ffffffff81084520>] ? rescuer_thread+0x3e0/0x3e0
[ 720.176183] [<ffffffff8108b312>] kthread+0xd2/0xf0
[ 720.176185] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
[ 720.176188] [<ffffffff8172637c>] ret_from_fork+0x7c/0xb0
[ 720.176190] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
[ 720.176197] INFO: task test-gnt:957 blocked for more than 120 seconds.
[ 720.176203] Not tainted 3.13.0-24-generic #46
[ 720.176206] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 720.176210] test-gnt D 0000000000000000 0 957 954 0x00000001
[ 720.176213] ffff8800053f3d70 0000000000000202 ffff88003a3847d0 ffff8800053f3fd8
[ 720.176215] 0000000000014440 0000000000014440 ffff88003a3847d0 ffffffff81c99740
[ 720.176217] ffffffff81c99744 ffff88003a3847d0 00000000ffffffff ffffffff81c99748
[ 720.176219] Call Trace:
[ 720.176222] [<ffffffff8171a3a9>] schedule_preempt_disabled+0x29/0x70
[ 720.176224] [<ffffffff8171c215>] __mutex_lock_slowpath+0x135/0x1b0
[ 720.176226] [<ffffffff8171c2af>] mutex_lock+0x1f/0x2f
[ 720.176229] [<ffffffff8142bb4e>] free_xenballooned_pages+0x1e/0x90
[ 720.176236] [<ffffffffa000c586>] gntdev_free_map+0x26/0x60 [xen_gntdev]
[ 720.176238] [<ffffffffa000c6f8>] gntdev_put_map+0xa8/0x100 [xen_gntdev]
[ 720.176241] [<ffffffffa000d2e2>] gntdev_ioctl+0x442/0x750 [xen_gntdev]
[ 720.176245] [<ffffffff811cc6e0>] do_vfs_ioctl+0x2e0/0x4c0
[ 720.176248] [<ffffffff81079e42>] ? ptrace_notify+0x82/0xc0
[ 720.176250] [<ffffffff811cc941>] SyS_ioctl+0x81/0xa0
[ 720.176252] [<ffffffff8172663f>] tracesys+0xe1/0xe6
[ 720.176254] INFO: task systemd-udevd:958 blocked for more than 120 seconds.
[ 720.176258] Not tainted 3.13.0-24-generic #46
[ 720.176261] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 720.176265] systemd-udevd D 0000000000000000 0 958 269 0x00000004
[ 720.176267] ffff88003bb0dd20 0000000000000202 ffff88003a382fe0 ffff88003bb0dfd8
[ 720.176269] 0000000000014440 0000000000014440 ffff88003a382fe0 ffffffff81c62040
[ 720.176271] ffffffff81c62044 ffff88003a382fe0 00000000ffffffff ffffffff81c62048
[ 720.176273] Call Trace:
[ 720.176276] [<ffffffff8171a3a9>] schedule_preempt_disabled+0x29/0x70
[ 720.176278] [<ffffffff8171c215>] __mutex_lock_slowpath+0x135/0x1b0
[ 720.176280] [<ffffffff8171c2af>] mutex_lock+0x1f/0x2f
[ 720.176282] [<ffffffff81706ab3>] online_pages+0x33/0x570
[ 720.176285] [<ffffffff814a2108>] memory_subsys_online+0x68/0xd0
[ 720.176287] [<ffffffff8148c555>] device_online+0x65/0x90
[ 720.176289] [<ffffffff814a1d94>] store_mem_state+0x64/0x160
[ 720.176291] [<ffffffff81489ab8>] dev_attr_store+0x18/0x30
[ 720.176295] [<ffffffff8122f418>] sysfs_write_file+0x128/0x1c0
[ 720.176297] [<ffffffff811b9534>] vfs_write+0xb4/0x1f0
[ 720.176300] [<ffffffff811b9f69>] SyS_write+0x49/0xa0
[ 720.176302] [<ffffffff8172663f>] tracesys+0xe1/0xe6
Presumably the same problem would affect any userspace block backend that
uses grant mapping, such as the qemu disk backend? Or maybe I'm just doing
it wrong (always possible!) :-)
Cheers,
Dave
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel