Re: [Xen-devel] [PATCH] Re: Crash on blktap shutdown
On 02/24/2010 07:03 PM, Daniel Stodden wrote:
> On Wed, 2010-02-24 at 20:47 -0500, Daniel Stodden wrote:
> > On Wed, 2010-02-24 at 19:37 -0500, Jeremy Fitzhardinge wrote:
> > > On 02/24/2010 04:29 PM, Daniel Stodden wrote:
> > > > On Wed, 2010-02-24 at 18:52 -0500, Jeremy Fitzhardinge wrote:
> > > > > On 02/24/2010 03:49 PM, Daniel Stodden wrote:
> > > > > > On Wed, 2010-02-24 at 17:55 -0500, Jeremy Fitzhardinge wrote:
> > > > > > > When rebooting the machine, I got this crash from blktap. The rip maps to line 262 in
> > > > > > > 0xffffffff812548a1 is in blktap_request_pool_free (/home/jeremy/git/linux/drivers/xen/blktap/request.c:262).
> > > > > >
> > > > > > Uhm, where did that RIP come from? pool_free is on the module exit path. The stack trace below looks like a crash from the broadcasted SIGTERM before reboot.
> > > > >
> > > > > Ignore it; I generated it from a different kernel from the one that crashed. But the other oops I posted should be all consistent and meaningful.
> > > >
> > > > Ignore only the debuginfo quote, right? Cos this looks like a different issue to me.
> > >
> > > Perhaps. I got all the others on normal domain shutdown, but this one was on machine reboot. I'll try to repro (as I boot the test kernel with your patch in it).
> >
> > (gdb) list *(blktap_device_restart+0x7a)
> > 0x2a73 is in blktap_device_restart (/local/exp/dns/scratch/xenbits/xen-unstable.hg/linux-2.6-pvops.git/drivers/xen/blktap/device.c:920).
> > 915             /* Re-enable calldowns. */
> > 916             if (blk_queue_stopped(dev->gd->queue))
> > 917                     blk_start_queue(dev->gd->queue);
> > 918
> > 919             /* Kick things off immediately. */
> > 920             blktap_device_do_request(dev->gd->queue);
> > 921
> > 922             spin_unlock_irq(&dev->lock);
> > 923     }
> > 924
> >
> > Assuming we've been dereferencing a NULL gendisk, i.e. device_destroy racing against device_restart. Would take
> >
> >  * Tapdisk killed on the other thread, which goes through into a device_restart(). Which is what your stacktrace shows.
> >
> >  * Device removal pending, blocking until device->users drops to 0, then doing the device_destroy(). That might have happened during bdev .release.
> >
> > Both running at the same time sounds like what happens if you kill them all at once. That clearly takes another patch then.
>
> Jeremy, can you try out the attached patch for me? This should close the above shutdown race as well. Should be nowhere as frequent as the timer_sync crash fixed earlier.

Hm, the two patches changed things but I'm still seeing problems on domain shutdown. Still looks like use-after-free.

blktap_device_destroy: destroy device 0 users 0
blktap_ring_vm_close: unmapping ring 0
blktap_ring_release: freeing device 0
blktap_sysfs_destroy
=============================================================================
BUG kmalloc-512: Poison overwritten
-----------------------------------------------------------------------------

INFO: 0xffff88002e9e2048-0xffff88002e9e2048. First byte 0x6a instead of 0x6b
INFO: Allocated in device_create_vargs+0x47/0xd7 age=7705 cpu=0 pid=3072
INFO: Freed in device_create_release+0x9/0xb age=14 cpu=0 pid=3320
INFO: Slab 0xffff880003cca5b0 objects=14 used=2 fp=0xffff88002e9e2000 flags=0xa3
INFO: Object 0xffff88002e9e2000 @offset=0 fp=0xffff88002e9e2248

  Object 0xffff88002e9e2000:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2010:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2020:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2030:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2040:  6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkjkkkkkkk
  Object 0xffff88002e9e2050:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2060:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2070:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2080:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2090:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e20a0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e20b0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e20c0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e20d0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e20e0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e20f0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2100:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2110:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2120:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2130:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2140:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2150:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2160:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2170:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2180:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e2190:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e21a0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e21b0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e21c0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e21d0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e21e0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object 0xffff88002e9e21f0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  kkkkkkkkkkkkkkk¥
 Redzone 0xffff88002e9e2200:  bb bb bb bb bb bb bb bb                          »»»»»»»»
 Padding 0xffff88002e9e2240:  5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
Pid: 3327, comm: ifdown Not tainted 2.6.32 #358
Call Trace:
 [<ffffffff810a83f9>] print_trailer+0x16a/0x173
 [<ffffffff810a89a0>] check_bytes_and_report+0xb5/0xe6
 [<ffffffff810a8a96>] check_object+0xc5/0x237
 [<ffffffff810aa588>] __slab_alloc+0x493/0x591
 [<ffffffff810e8fea>] ? load_elf_binary+0xe2/0x17d8
 [<ffffffff810e8fea>] ? load_elf_binary+0xe2/0x17d8
 [<ffffffff810ab06f>] __kmalloc+0xbe/0x12f
 [<ffffffff810e8fea>] load_elf_binary+0xe2/0x17d8
 [<ffffffff8100e921>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff8100e921>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff8100eff2>] ? check_events+0x12/0x20
 [<ffffffff810b3ee9>] ? search_binary_handler+0x18f/0x278
 [<ffffffff810e0208>] ? flock_to_posix_lock+0x4/0xe1
 [<ffffffff810b3e2c>] ? search_binary_handler+0xd2/0x278
 [<ffffffff8100efdf>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff81064f38>] ? lock_release+0x15a/0x166
 [<ffffffff810e0208>] ? flock_to_posix_lock+0x4/0xe1
 [<ffffffff810b3e39>] search_binary_handler+0xdf/0x278
 [<ffffffff810e8f08>] ? load_elf_binary+0x0/0x17d8
 [<ffffffff810b5453>] do_execve+0x185/0x27a
 [<ffffffff81010673>] sys_execve+0x3e/0x5c
 [<ffffffff8101209a>] stub_execve+0x6a/0xc0
FIX kmalloc-512: Restoring 0xffff88002e9e2048-0xffff88002e9e2048=0x6b

    J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
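
For readers following the slab report above: the single changed byte (0x6a instead of 0x6b at object offset 0x48) is what a check of a freed, poisoned object would flag if something wrote through a stale pointer after the free. The following is a minimal userspace sketch of that kind of check, not the kernel's SLUB code. The fill value 0x6b and end marker 0xa5 are taken from the dump; the function names `poison_object` and `first_poison_mismatch` are hypothetical, and the idea that the stale write was a one-step decrement is only an assumption suggested by 0x6a being 0x6b minus one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte values as they appear in the dump above. */
#define POISON_FREE 0x6b  /* fill pattern of a freed object */
#define POISON_END  0xa5  /* marker in the last byte of the object */

/* Hypothetical helper: fill a freed object with the poison pattern. */
static void poison_object(unsigned char *obj, size_t size)
{
    assert(size > 0);
    memset(obj, POISON_FREE, size - 1);
    obj[size - 1] = POISON_END;
}

/*
 * Hypothetical helper: return the offset of the first byte that no
 * longer matches the poison pattern, or -1 if the object is intact.
 */
static long first_poison_mismatch(const unsigned char *obj, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        unsigned char want = (i == size - 1) ? POISON_END : POISON_FREE;
        if (obj[i] != want)
            return (long)i;
    }
    return -1;
}
```

Poisoning a 512-byte buffer, decrementing the byte at offset 0x48, and then calling `first_poison_mismatch` returns 0x48 with the byte reading 0x6a, which matches the shape of the "First byte 0x6a instead of 0x6b" line in the report. The check only runs when the object is next examined (here, on a later allocation by an unrelated process), which is why the trace points at `ifdown` rather than at the code that did the stale write.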