Re: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p

To: Sylvain Munaut <s.munaut@xxxxxxxxxxxxxxxxxxxx>
From: James Harper <james.harper@xxxxxxxxxxxxxxxx>
Date: Mon, 12 Aug 2013 23:26:00 +0000
Accept-language: en-AU, en-US
Cc: "ceph-devel@xxxxxxxxxxxxxxx" <ceph-devel@xxxxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>
Delivery-date: Mon, 12 Aug 2013 23:26:54 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

> >> > > tapdisk[9180]: segfault at 7f7e3a5c8c10 ip 00007f7e387532d4 sp 00007f7e3a5c8c10 error 4 in libpthread-2.13.so[7f7e38748000+17000]
> >> > > tapdisk:9180 blocked for more than 120 seconds.
> >> > > tapdisk         D ffff88043fc13540     0  9180      1 0x00000000
> 
> You can try generating a core file by changing the ulimit on the running
> process.
> 
> A backtrace would be useful :)
> 

I found it was actually dumping core in /, but gdb doesn't seem to work nicely 
and all I get is this:

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Cannot find new threads: generic error
Core was generated by `tapdisk'.
Program terminated with signal 11, Segmentation fault.
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:163
163     ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: No such file or directory.

I get the same output even when I attach gdb to a running process.

One VM segfaults on startup pretty much every time, but never when I attach
strace to it, which suggests a race condition that may not actually be in
your code...

> 
> > Actually maybe not. What I was reading only applies to a large number of
> > bytes written to the pipe, and even then I got confused by the double
> > negatives. Sorry for the noise.
> 
> Yes, as you discovered, for size < PIPE_BUF the writes should be atomic
> even in non-blocking mode. But I could still add an assert() there to
> make sure they are.

Nah, I got that completely backwards. I see now that you are only passing a
pointer, so yes, the write should always be atomic.

> I did find a bug where it could "leak" requests, which may lead to a
> hang. But it shouldn't crash ...
> 
> Here's an (as yet untested) patch in the rbd error path:
> 

I'll try that later this morning when I get a minute.

I've done the poor man's debugger thing and riddled the code with printfs, but
as far as I can determine every routine starts and ends. My thinking at the
moment is that it's either a race (the VMs most likely to crash have multiple
disks), or a buffer overflow that trips it up either immediately or later.

I have definitely observed multiple VMs crash when something in Ceph hiccups
(e.g. I bring a mon up or down), if that helps.

I also followed through on the rbd_aio_release idea on the weekend. I can see
that if the read returns failure, the callback was never called, so the
release is then the responsibility of the caller.
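
As a sketch of that ownership rule, the toy model below uses stand-in
functions rather than librbd itself, so it compiles on its own. Everything
in it (struct completion, submit_read, issue_read, the live_completions
counter) is hypothetical, and it assumes the callback is what releases the
completion on the normal path.

```c
#include <stdlib.h>

struct completion;
typedef void (*done_cb)(struct completion *c, void *arg);

/* Stand-in for an async completion handle; not the librbd type. */
struct completion {
    done_cb cb;
    void *arg;
};

int live_completions; /* created but not yet released */

struct completion *completion_create(done_cb cb, void *arg)
{
    struct completion *c = malloc(sizeof(*c));

    if (c == NULL)
        return NULL;
    c->cb = cb;
    c->arg = arg;
    live_completions++;
    return c;
}

void completion_release(struct completion *c)
{
    free(c);
    live_completions--;
}

/* Stand-in submit. On failure it returns a negative value and never calls
 * the callback. On success the callback runs (synchronously here, to keep
 * the model small). */
int submit_read(struct completion *c, int fail)
{
    if (fail)
        return -5;
    c->cb(c, c->arg);
    return 0;
}

/* Normal path: the callback owns the release. */
void on_read_done(struct completion *c, void *arg)
{
    (void)arg;
    completion_release(c);
}

/* Caller: must release only when submission fails, because in that case
 * the callback will never run. */
int issue_read(int fail)
{
    struct completion *c = completion_create(on_read_done, NULL);
    int r;

    if (c == NULL)
        return -1;
    r = submit_read(c, fail);
    if (r < 0)
        completion_release(c);
    return r;
}
```

Dropping the release in the failure branch of issue_read leaves
live_completions above zero, which is the kind of leak that would show up
as a hang rather than a crash.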

Thanks

James



