[Xen-devel] gntdev/gntalloc and fork
Hi,

What is the proper way to handle shared pages (on either side, using gntdev or gntalloc) with regard to fork and a possible later exec? The child process does not need to access those pages in any way, but will map different one(s), using a newly opened FD to the gntdev/gntalloc device. Should it unmap them and close the FD to the device manually just after the fork? Or should a process using gntdev or gntalloc avoid fork altogether?

I'm asking because I get a kernel oops[1] in the context of such a process. This process uses both gntdev and gntalloc. The PID reported there is a child, which maps additional pages (using a newly opened FD to /dev/xen/gnt*), but I'm not sure whether the crash happens before, after, or at this second mapping (actually a vchan connection), or maybe even at cleanup of this second mapping. The parent process keeps its mappings for the whole lifetime of its child.

I don't have a 100% reliable way to reproduce this problem, but it happens quite often when I run such operations in a loop. The kernel is vanilla 3.19.3, running on Xen 4.4.2.
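For concreteness, the "unmap and close manually just after the fork" option asked about above could look roughly like the sketch below. It is only an illustration of the pattern, not a confirmed fix for the oops: the helper names `drop_inherited_mapping` and `run_fork_demo` are hypothetical, `dev_path` merely stands in for /dev/xen/gntdev or /dev/xen/gntalloc, and the ioctl calls that the real devices need before mmap are left out, so the sketch can be exercised against any device that supports a shared mmap (for example /dev/zero on Linux).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, meant to run in the child right after fork():
 * drops the mapping and the descriptor inherited from the parent.
 * Returns 0 on success, -1 if either step failed. */
int drop_inherited_mapping(void *addr, size_t len, int fd)
{
    int rc = 0;

    if (munmap(addr, len) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* Hypothetical demo. dev_path is a stand-in for the gntdev/gntalloc
 * device node; the grant-specific ioctls are deliberately omitted.
 * Returns the child's exit status (0 on success) or -1 on a
 * parent-side error. */
int run_fork_demo(const char *dev_path)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    int fd = open(dev_path, O_RDWR | O_CLOEXEC);
    if (fd < 0)
        return -1;

    void *shared = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (shared == MAP_FAILED) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, len);
        close(fd);
        return -1;
    }

    if (pid == 0) {
        /* Child: release everything inherited from the parent first. */
        if (drop_inherited_mapping(shared, len, fd) != 0)
            _exit(1);

        /* Child: open its own descriptor and create its own mapping. */
        int own_fd = open(dev_path, O_RDWR | O_CLOEXEC);
        if (own_fd < 0)
            _exit(2);
        void *own = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                         own_fd, 0);
        if (own == MAP_FAILED)
            _exit(3);

        memset(own, 0, len);   /* touch the child's own page */
        munmap(own, len);
        close(own_fd);
        _exit(0);
    }

    /* Parent: keeps its mapping for the whole lifetime of the child. */
    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);

    munmap(shared, len);
    close(fd);

    if (waited < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling `run_fork_demo("/dev/zero")` exercises the sequence without Xen. O_CLOEXEC only covers the later exec case; between fork and exec the child still holds the inherited descriptor and mapping, which is why the explicit cleanup comes first in the child.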
The kernel message:

[74376.073464] general protection fault: 0000 [#1] SMP
[74376.073475] Modules linked in: fuse xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip6table_filter ip6_tables intel_rapl iosf_mbi x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel pcspkr xen_netfront ghash_clmulni_intel nfsd auth_rpcgss nfs_acl lockd grace xenfs xen_privcmd dummy_hcd udc_core xen_gntdev xen_gntalloc xen_blkback sunrpc u2mfn(O) xen_evtchn xen_blkfront
[74376.073522] CPU: 1 PID: 9377 Comm: qrexec-agent Tainted: G O 3.19.3-4.pvops.qubes.x86_64 #1
[74376.073528] task: ffff880002442e40 ti: ffff88000032c000 task.ti: ffff88000032c000
[74376.073532] RIP: e030:[<ffffffffa00952c5>] [<ffffffffa00952c5>] unmap_if_in_range+0x15/0xd0 [xen_gntdev]
[74376.073543] RSP: e02b:ffff88000032fc08 EFLAGS: 00010292
[74376.073546] RAX: 0000000000000000 RBX: dead000000100100 RCX: 00007fd8616ea000
[74376.073550] RDX: 00007fd8616ea000 RSI: 00007fd8616e9000 RDI: dead000000100100
[74376.073554] RBP: ffff88000032fc48 R08: 0000000000000000 R09: 0000000000000000
[74376.073557] R10: ffffea000021bb00 R11: 0000000000000000 R12: 00007fd8616e9000
[74376.073561] R13: 00007fd8616ea000 R14: ffff880012702e40 R15: ffff880012702e70
[74376.073569] FS: 00007fd8616ca700(0000) GS:ffff880013c80000(0000) knlGS:0000000000000000
[74376.073574] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[74376.073577] CR2: 00007fd8616e9458 CR3: 00000000e7af5000 CR4: 0000000000042660
[74376.073582] Stack:
[74376.073584] ffff8800188356c0 00000000000000d0 ffff88000032fc68 00000000c64ef797
[74376.073590] 0000000000000220 dead000000100100 00007fd8616e9000 00007fd8616ea000
[74376.073596] ffff88000032fc88 ffffffffa00953c6 ffff88000032fcc8 ffff880012702e70
[74376.073603] Call Trace:
[74376.073610] [<ffffffffa00953c6>] mn_invl_range_start+0x46/0x90 [xen_gntdev]
[74376.073620] [<ffffffff811e88fb>] __mmu_notifier_invalidate_range_start+0x5b/0x90
[74376.073627] [<ffffffff811c2a59>] do_wp_page+0x769/0x820
[74376.074031] [<ffffffff811c4f5c>] handle_mm_fault+0x7fc/0x10c0
[74376.074031] [<ffffffff813864cd>] ? radix_tree_lookup+0xd/0x10
[74376.074031] [<ffffffff81061e1c>] __do_page_fault+0x1dc/0x5a0
[74376.074031] [<ffffffff817560a6>] ? mutex_lock+0x16/0x37
[74376.074031] [<ffffffffa0008928>] ? evtchn_ioctl+0x118/0x3c0 [xen_evtchn]
[74376.074031] [<ffffffff812209d8>] ? do_vfs_ioctl+0x2f8/0x4f0
[74376.074031] [<ffffffff811cafdf>] ? do_munmap+0x29f/0x3b0
[74376.074031] [<ffffffff81062211>] do_page_fault+0x31/0x70
[74376.074031] [<ffffffff81759e28>] page_fault+0x28/0x30
[74376.074031] Code: e9 dd fd ff ff 31 c9 31 db e9 20 fe ff ff 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 89 fb 48 83 ec 28 <48> 8b 47 10 48 85 c0 74 4e 4c 8b 00 49 39 d0 73 46 4c 8b 48 08
[74376.074031] RIP [<ffffffffa00952c5>] unmap_if_in_range+0x15/0xd0 [xen_gntdev]
[74376.074031] RSP <ffff88000032fc08>
[74376.091682] ---[ end trace 2b21c5b714eb1071 ]---
[74404.069009] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [qrexec-agent:9379]
[74404.069009] Modules linked in: fuse xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip6table_filter ip6_tables intel_rapl iosf_mbi x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel pcspkr xen_netfront ghash_clmulni_intel nfsd auth_rpcgss nfs_acl lockd grace xenfs xen_privcmd dummy_hcd udc_core xen_gntdev xen_gntalloc xen_blkback sunrpc u2mfn(O) xen_evtchn xen_blkfront
[74404.069009] CPU: 2 PID: 9379 Comm: qrexec-agent Tainted: G D O 3.19.3-4.pvops.qubes.x86_64 #1
[74404.069009] task: ffff880010e24a00 ti: ffff880002470000 task.ti: ffff880002470000
[74404.069009] RIP: e030:[<ffffffff81757b11>] [<ffffffff81757b11>] _raw_spin_lock+0x21/0x30
[74404.069009] RSP: e02b:ffff880002473e18 EFLAGS: 00000297
[74404.069009] RAX: 0000000000000040 RBX: ffff880002345c00 RCX: 0000000000018cf8
[74404.069009] RDX: 0000000000000041 RSI: ffff880002345c00 RDI: ffff880012702e60
[74404.069009] RBP: ffff880002473e18 R08: ffff880012702240 R09: 00000001802a0019
[74404.069009] R10: ffffea000049c080 R11: ffffffffa00955bf R12: ffff880012702e70
[74404.069009] R13: ffff880012702e40 R14: ffff8800132c6f20 R15: ffff880012b163c0
[74404.069009] FS: 00007fd8616ca700(0000) GS:ffff880013d00000(0000) knlGS:0000000000000000
[74404.069009] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[74404.069009] CR2: 00007fd8610be098 CR3: 000000000b971000 CR4: 0000000000042660
[74404.069009] Stack:
[74404.069009] ffff880002473e48 ffffffffa0095452 ffff880002473e48 ffff880002345c00
[74404.069009] ffff880012702e70 0000000000000000 ffff880002473e78 ffffffff811e8c2e
[74404.069009] ffff880002473e78 ffff880012702e40 ffff880012702e40 ffff880012d123c8
[74404.069009] Call Trace:
[74404.069009] [<ffffffffa0095452>] mn_release+0x22/0x130 [xen_gntdev]
[74404.069009] [<ffffffff811e8c2e>] mmu_notifier_unregister+0x4e/0xe0
[74404.069009] [<ffffffffa00957c0>] gntdev_release+0x60/0xa0 [xen_gntdev]
[74404.069009] [<ffffffff8120ec0f>] __fput+0xdf/0x1e0
[74404.069009] [<ffffffff8120ed5e>] ____fput+0xe/0x10
[74404.069009] [<ffffffff810b56df>] task_work_run+0xbf/0x100
[74404.069009] [<ffffffff81014c47>] do_notify_resume+0x97/0xb0
[74404.069009] [<ffffffff81758127>] int_signal+0x12/0x17
[74404.069009] Code: 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 b8 00 01 00 00 f0 66 0f c1 07 0f b6 d4 38 c2 75 04 5d c3 f3 90 0f b6 07 <38> d0 75 f7 5d c3 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Attachment: pgpngkpSKagb4.pgp

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel