RE: [Xen-devel] RE: ext4 BUG in dom0 Kernel 2.6.32.36
Hi:

I finally captured overlapping extents in ext4, but I am still wondering how it happens.

I added a check for overlap on the last extent in the tree at the very beginning of ext4_ext_convert_to_initialized. The attached messages.12 shows the overlap that was found. In lines 8-10, 3467:[1]15:57921642 and 3468:[0]14:57921643 overlap: the first extent covers logical blocks 3467-3481, and the second starts at 3468.

8 Sep 15 08:27:39 xmao kernel:
    3331:[0]7:53750025 3338:[0]8:53750033 3346:[0]1:53848953 3347:[0]7:53848955
    3354:[0]1:53848969 3355:[0]7:53848971 3362:[0]1:53848985 3363:[0]7:56996848
    3370:[0]1:57606144 3371:[0]7:57795290 3378:[0]1:57814407 3379:[0]7:57858606
    3386:[0]8:57858620 3394:[0]1:57858629 3395:[0]8:57858637 3403:[0]7:57858646
    3410:[0]1:57858661 3411:[0]8:57858669 3419:[0]7:57858678 3426:[0]8:57858692
    3434:[0]1:57858701 3435:[0]7:57858709 3442:[0]1:57858717 3443:[0]7:57858725
    3450:[0]1:57858733 3451:[0]7:57858741 3458:[0]1:57858749 3459:[0]7:57858757
    3466:[0]1:57921634 3467:[1]15:57921642
9 Sep 15 08:27:39 xmao kernel: Displaying leaf extents for inode 12339004
10 Sep 15 08:27:39 xmao kernel:
    3468:[0]14:57921643 3482:[0]1:57921664 3483:[0]7:57921666 3490:[0]1:57921680
    3491:[0]8:57921682 3499:[0]7:57921691 3506:[0]8:57921705 3514:[0]1:57921714
    3515:[0]7:57921722 3522:[0]41:57916683 3563:[0]7:58159767 3570:[0]1:58159781
    3571:[0]7:58238992 3578:[0]1:58288144 3579:[0]7:58327750 3586:[0]1:58579969
    3587:[0]7:58954838 3594:[0]1:59006641 3595:[0]7:59006643 3602:[0]1:59006657
    3603:[0]7:59006659 3610:[0]8:59006673 3618:[0]8:59006688 3626:[0]470:58982658
    4096:[0]3:58987732 4099:[0]1:58992655 4100:[0]7:59143253 4107:[0]1:59171840
    4108:[0]7:59183878 4115:[0]1:59192886 4116:[0]8:59593463 4124:[0]8:59669484
    4132:[0]7:73086538 4139:[0]1:73352801 4140:[0]7:73339273 4147:[0]1:73526280
    4148:[0]8:78229012 4156:[0]1:78229021 4157:[0]7:78818388 4164:[0]1:79069383
    4165:[0]7:79428616 4172:[0]1:80490925 4173:[0]7:81439488 4180:[0]1:82854062
    4181:[0]7:83462272 4188:[0]1:83656904 4189:[0]7:89127381 4196:[0]1:89584313
    4197:[0]8:91592930 4205:[0]7:91592945 4212:[0]1:91592953
    4213:[0]7:91592961 422

I also dumped the file's on-disk extents with filefrag, which shows no overlap and no extent 3468:[0]14:57921643.

 ext  logical  physical  expected  length  flags
 ....
 337     3459  57858757  57858749       7
 338     3466  57921634  57858763       1  unwritten
 339     3467  57921642  57921634      15  unwritten
 340     3482  57921664  57921656       1
 341     3483  57921666  57921664       7
 .....

I have one hypothesis: after 3468:[0]14:57921643 was successfully inserted, some error occurred. At the bottom of ext4_ext_convert_to_initialized, the fix_extent_len path then restores the original extent's ee_len, which would leave it overlapping the newly inserted extent. (I will add a check of err later.)

	fix_extent_len:
		ex->ee_block = orig_ex.ee_block;
		ex->ee_len = orig_ex.ee_len;
		ext4_ext_store_pblock(ex, ext_pblock(&orig_ex));
		ext4_ext_mark_uninitialized(ex);
		ext4_ext_dirty(handle, inode, path + depth);

Any comments?

There is also something strange in messages.12. messages.12 is from another machine; its log is printed right before BUG_ON(newext->ee_block == nearex->ee_block). The strange part is that the pblock of 14412:[1]16:9927 is very different from that of 14411:[0]1:222332613.

	/* Debug output added right before the BUG_ON below. */
	if (newext->ee_block == nearex->ee_block) {
		len = (EXT_MAX_EXTENT(eh) - nearex) * sizeof(struct ext4_extent);
		len = len < 0 ? 0 : len;
		printk("old_depth %d depth %d old_path %p path %p next_has_free %d next %llu\n",
		       old_depth, depth, old_path, path, next_has_free, (unsigned long long)next);
		[...]

		printk("insert %d:%llu:[%d]%d before: nearest 0x%p, "
		       "move %d from 0x%p to 0x%p\n",
		       le32_to_cpu(newext->ee_block),
		       ext_pblock(newext),
		       ext4_ext_is_uninitialized(newext),
		       ext4_ext_get_actual_len(newext),
		       nearex, len, nearex + 1, nearex + 2);
		/* Dump the leaf for both the old and the current path. */
		ext4_ext_show_leaf_xmao(inode, old_path);
		ext4_ext_show_leaf_xmao(inode, path);
	}
	BUG_ON(newext->ee_block == nearex->ee_block);

Sep 13 16:16:35 xmao kernel:
    57:[0]31:157254721 12288:[0]54:157503830 12342:[0]10:157503884 12352:[0]5:157534763
    12357:[0]1:157534768 12358:[0]58:157534769 12416:[0]64:157567168 12480:[0]13:158051261
    12493:[0]73:172263095 12566:[0]24:172265399 12590:[0]71:172521859 12661:[0]71:172627897
    12732:[0]71:172733735 12803:[0]69:172722619 12872:[0]9:172764859 12881:[0]42:110500028
    12923:[0]86:143030061 13009:[0]86:143119859 13095:[0]48:143173376 13143:[0]16:195333586
    13159:[0]32:197526105 13191:[0]40:198875861 13231:[0]39:198872300 13270:[0]5:199663576
    13275:[0]26:200964192 13301:[0]36:202015708 13337:[0]47:202221682 13384:[0]9:202221729
    13393:[0]58:202624966 13451:[0]12:202606535 13463:[0]35:212117725 13498:[0]35:212135811
    13533:[0]34:212115513 13567:[0]32:212108608 13599:[0]29:212144185 13628:[0]50:231280420
    13678:[0]38:231645389 13716:[0]13:231645427 13729:[0]51:231650765 13780:[0]50:231647658
    13830:[0]54:231985340 13884:[0]24:231981259 13908:[0]64:105098731 13972:[0]87:136696745
    14059:[0]45:136700237 14104:[0]61:2
Sep 13 16:16:35 xmao kernel:
    3651 14165:[0]69:222042299 14234:[0]68:222044092 14302:[0]34:222091761
    14336:[0]68:222172860 14404:[0]7:222332606 14411:[0]1:222332613
Sep 13 16:16:35 xmao kernel: Displaying leaf extents for inode 30685060
Sep 13 16:16:35 xmao kernel:
    14412:[1]16:9927 14428:[1]41:13213 14469:[1]1:13254 14470:[0]67:222673085

Here too, filefrag shows that the extents on disk are OK.

 336    14302  222091761  222044159      34
 337    14336  222172860  222091794      68
 338    14404  222332606  222172927       7
 339    14411  222332613                 59  unwritten
 340    14470  222673085  222332671      67
 341    14537  222848155  222673151      43
 342    14580  165617358  222848197      56
 343    14636  165777353  165617413      55
 344    14691  165961927  165777407      57

It seems that 14412:[1]16:9927, 14428:[1]41:13213 and 14469:[1]1:13254 are unexpected.

Many thanks.

----------------------------------------
> From: tinnycloud@xxxxxxxxxxx
> To: jeremy@xxxxxxxx
> CC: konrad.wilk@xxxxxxxxxx; linux-ext4@xxxxxxxxxxxxxxx;
> xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-devel] RE: ext4 BUG in dom0 Kernel 2.6.32.36
> Date: Wed, 7 Sep 2011 10:35:21 +0800
>
> ----------------------------------------
> > Date: Tue, 6 Sep 2011 11:55:02 -0700
> > From: jeremy@xxxxxxxx
> > To: tinnycloud@xxxxxxxxxxx
> > CC: konrad.wilk@xxxxxxxxxx; linux-ext4@xxxxxxxxxxxxxxx;
> > xen-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: Re: [Xen-devel] RE: ext4 BUG in dom0 Kernel 2.6.32.36
> >
> > On 09/06/2011 08:11 AM, MaoXiaoyun wrote:
> > >
> > > > Date: Tue, 6 Sep 2011 10:53:47 -0400
> > > > From: konrad.wilk@xxxxxxxxxx
> > > > To: tinnycloud@xxxxxxxxxxx
> > > > CC: linux-ext4@xxxxxxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx;
> > > jeremy@xxxxxxxx
> > > > Subject: Re: ext4 BUG in dom0 Kernel 2.6.32.36
> > > >
> > > > On Tue, Sep 06, 2011 at 03:24:14PM +0800, MaoXiaoyun wrote:
> > > > >
> > > > > Hi:
> > > > >
> > > > > I've met an ext4 Bug in dom0 kernel 2.6.32.36. (See kernel stack
> > > below)
> > > >
> > > > Did you try the 3.0 kernel?
> > > No, I am afried the change would be to much for our current env.
> > > May result in other stable issue.
> > > So, I want to dig out what really happen. Hopes.
> >
> > Another question is whether this is a regression compared to earlier
> > versions of 2.6.32? Do you know if this problem exists in a non-Xen
> > environment?
> >
>
> There are some others reports this issue in non-xen env.
> http://markmail.org/message/ywr4nfgiuvgdcr7y
> http://www.spinics.net/lists/linux-ext4/msg21066.html
>
> The difficulty is I haven't find a efficient way to reproduce it.
> (Currently it only show in our cluster, redeploy our cluster may cost 3days
> more. )
>
>
> > Thanks,
> > J
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html

Attachment: messages.12
Attachment: messages.15

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
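
Note on the overlap check: the check mentioned at the top of this message (comparing neighbouring extents in the leaf dump) can be sketched in user space roughly as follows. This is only a minimal sketch, not the code that was added to the kernel. The struct extent_rec type and the find_first_overlap function are hypothetical names made up for illustration; they mirror the "lblk:[uninit]len:pblk" fields of the log output and are not the kernel's struct ext4_extent. The sketch assumes the extents are already sorted by logical block, as they are in a leaf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space record mirroring one entry of the log output,
 * which is printed as "lblk:[uninit]len:pblk".
 * This is an assumption for illustration, not struct ext4_extent.
 */
struct extent_rec {
    uint32_t lblk;    /* first logical block covered by the extent */
    uint16_t len;     /* number of blocks covered */
    int      uninit;  /* 1 if the extent is unwritten (uninitialized) */
    uint64_t pblk;    /* first physical block */
};

/*
 * Return the index i of the first extent whose logical range runs into
 * extent i + 1, or -1 if no neighbouring pair overlaps.
 * The array is assumed to be sorted by lblk.
 */
long find_first_overlap(const struct extent_rec *exts, size_t n)
{
    assert(exts != NULL || n == 0);

    for (size_t i = 0; i + 1 < n; i++) {
        /* One past the last logical block of extent i. */
        uint64_t end = (uint64_t)exts[i].lblk + exts[i].len;

        if (end > exts[i + 1].lblk)
            return (long)i;
    }
    return -1;
}
```

Fed the four entries 3466:[0]1, 3467:[1]15, 3468:[0]14 and 3482:[0]1 from the log above, the function reports index 1, because 3467 + 15 = 3482 is greater than 3468. With the 3468 entry removed, as in the filefrag output, it reports no overlap.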