[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] kernel panic in skb_copy_bits
--On 4 July 2013 03:12:10 -0700 Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote: It looks like a typical COW issue to me. If the page content is written while there is still a reference on this page, we should allocate a new page and copy the previous content. And this has little to do with networking. I suspect this would get more attention if we could make Ian's case below trigger (a) outside Xen, (b) outside networking. memset(buf, 0xaa, 4096); write(fd, buf, 4096) memset(buf, 0x55, 4096); (where fd is O_DIRECT on NFS) Can result in 0x55 being seen on the wire in the TCP retransmit. We know this should fail using O_DIRECT+NFS. We've had reports suggesting it fails in O_DIRECT+iSCSI. However, that's been with a kernel panic (under Xen) rather than data corruption as per the above. Historical trawling suggests this is an issue with DRDB (see Ian's original thread from the mists of time). I don't quite understand why we aren't seeing corruption with standard ATA devices + O_DIRECT and no Xen involved at all. My memory is a bit misty on this but I had thought the reason why this would NOT be solved simply by O_DIRECT taking a reference to the page was that the O_DIRECT I/O completed (and thus the reference would be freed up) before the networking stack had actually finished with the page. If the O_DIRECT I/O did not complete until the page was actually finished with, we wouldn't see the problem in the first place. I may be completely off base here. -- Alex Bligh _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |