
[Xen-devel] Qemu crashed while unplugging IDE disk



While starting a Fedora_14 guest, we came across a qemu segfault.

The relevant lines from /var/log/messages are:
Jun  1 02:38:56 NC587 kernel: [403549.565754] show_signal_msg: 136 callbacks 
suppressed
Jun  1 02:38:56 NC587 kernel: [403549.565758] qemu-system-i38[25840]: segfault 
at 28 ip 0000000000418d91 sp 00007fe02aef4f00 error 4 in 
qemu-system-i386[400000+350000]

The segfault points to the code below (the fault address 0x28 is consistent
with reading a member at a small offset from a NULL bs pointer):
/*
 * Handle a read request in coroutine context
 */
static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
    int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
    BdrvRequestFlags flags)
{
    BlockDriver *drv = bs->drv;    // The segfault occurs here when bs is NULL.
    BdrvTrackedRequest req;
    int ret;
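
For reference, the fault address 0x28 matches a read at a small offset from a
NULL pointer, i.e. bs itself is NULL when bs->drv is evaluated. A defensive
check along the lines below (a minimal sketch with a hypothetical wrapper
name, not the actual upstream fix) would turn the crash into an error return:

/* Minimal sketch, illustrative only: refuse the request instead of
 * dereferencing a NULL or already-closed BlockDriverState. */
static int coroutine_fn bdrv_co_readv_checked(BlockDriverState *bs,
    int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
{
    if (!bs || !bs->drv) {
        return -ENOMEDIUM;   /* the request arrived after the disk went away */
    }
    return bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
}

Of course that would only mask the symptom; the real question is why a read
is still being issued at all after the unplug.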


NOTE: we are running on a Xen hypervisor with qemu 1.2.0.
        
This happened just several seconds after we started the guest. At that point
the guest was just beginning to take over block device I/O from qemu by
itself, while qemu had just flushed the disk I/O and unplugged the disk.
FYI: we provide a PV driver for the guest which takes charge of blkfront and
netfront, and it is this driver that triggers qemu to unplug the disk (the
suspected ordering is sketched below).
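
Roughly, we believe the unplug path runs in the order sketched below (block
layer calls only; the exact call chain through the Xen platform/IDE unplug
code may differ, and the helper name is ours):

/* Rough sketch of the suspected unplug ordering, illustrative only. */
static void unplug_ide_disk_sketch(BlockDriverState *bs)
{
    bdrv_drain_all();    /* wait for the requests that are pending right now */
    bdrv_flush_all();    /* then flush data to disk */
    bdrv_close(bs);      /* after this, bs->drv is NULL */

    /* Any read submitted after this point, e.g. from a coroutine that is
     * still emulating an IDE transfer, reaches bdrv_co_do_readv() with a
     * dead BlockDriverState and crashes. */
}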

We are confused about why qemu would still read from the disk after it had
unplugged it.

Looking at the unplug code, we see the following comment on bdrv_drain_all():

/*
 * Wait for pending requests to complete across all BlockDriverStates
 *
 * This function does not flush data to disk, use bdrv_flush_all() for that
 * after calling this function.
 *
 * Note that completion of an asynchronous I/O operation can trigger any
 * number of other I/O operations on other devices---for example a coroutine
 * can be arbitrarily complex and a constant flow of I/O can come until the
 * coroutine is complete.  Because of this, it is not possible to have a
 * function to drain a single device's I/O queue.
 */
void bdrv_drain_all(void)
{
    BlockDriverState *bs;
        
Does that mean that, now that the coroutine mechanism is used to handle disk
I/O, it is possible to run into the problem described above?
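
To make the scenario concrete, here is a minimal sketch of what we suspect
(the device state and coroutine are hypothetical, not the real IDE emulation
code):

/* Hypothetical illustration.  bdrv_drain_all() only waits for requests that
 * are in flight when it runs; it cannot know that this coroutine will issue
 * another read after it is re-entered. */
typedef struct FakeIDEState {        /* hypothetical device state */
    BlockDriverState *bs;
    int64_t sector;
    QEMUIOVector qiov;
} FakeIDEState;

static void coroutine_fn emulated_transfer_co(void *opaque)
{
    FakeIDEState *s = opaque;

    bdrv_co_readv(s->bs, s->sector, 8, &s->qiov);        /* first chunk */

    /* Yield until the device model re-enters the coroutine (e.g. when the
     * guest acknowledges the first chunk).  bdrv_drain_all() can run and
     * return in this window, and the unplug can then close the disk. */
    qemu_coroutine_yield();

    bdrv_co_readv(s->bs, s->sector + 8, 8, &s->qiov);    /* disk gone: crash */
}

If that reading of the bdrv_drain_all() comment is right, a single drain at
unplug time cannot guarantee that no further reads will arrive.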

Any ideas?  Thanks!
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

