[Xen-users] blktap and file-backed qcow: crashes and bad performance?
Hi! I'm running the latest Xen unstable x86_64 on a Dell Poweredge 1950 (dual CPU, dual-core Xeon) with 16GB RAM. I'm using file-backed sparse qcow images as root filesystems for the Xen guests. All qcow images are backed by the same image file (a 32-bit Debian sid installation).

The Xen disk config looks like this:

disk = [ 'tap:qcow:/home/images/%s.%d.qcow,xvda1,w' % (vmname, vmid) ]

Before starting a guest, I use the qcow-create tool to create its qcow file. I use grub to boot Xen like this:

root (hd0,0)
kernel /boot/xen-3.0-unstable.gz com2=57600,8n1 console=com2 dom0_mem=4097152 noreboot xenheap_megabytes=32
module /boot/xen0-linux root=/dev/sda1 ro noapic console=tty0 xencons=ttyS1 console=ttyS1
module /boot/xen0-linux-initrd

My goal is to run 100+ Xen guests, but this currently seems impossible. I observe several things:

- After creating a few Xen guests (and even after shutting them down), my process list is cluttered with "tapdisk" processes that put full load on all 8 virtual CPUs in dom0, making the system unusable. Killing the tapdisk processes also apparently destroys the qcow images.

- I (randomly?) get the messages "Error: (28, 'No space left on device')", "Error: Device 0 (vif) could not be connected. Hotplug scripts not working." or even "Error: (12, 'Cannot allocate memory')" on domU creation, even though plenty of disk space and RAM is available at that time. This mostly happens when creating more than 80 guests.
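For reference, here is a sketch of how I generate the per-guest images and disk lines. This is illustrative only: the helper names and the backing-image path are mine, and I'm assuming qcow-create's usual argument order (size in MB, new image, optional backing file) from the version shipped with Xen unstable.

```python
import subprocess

# Hypothetical path to the shared backing image (32-bit Debian sid install).
BASE_IMAGE = "/home/images/base-sid.img"

def disk_line(vmname, vmid):
    # Mirrors the disk = [...] entry from the config above.
    return "tap:qcow:/home/images/%s.%d.qcow,xvda1,w" % (vmname, vmid)

def create_guest_image(vmname, vmid, size_mb=4096):
    # Assumed qcow-create usage: qcow-create <size_mb> <image> [backing_file].
    # The resulting qcow is sparse and shares data with BASE_IMAGE.
    qcow = "/home/images/%s.%d.qcow" % (vmname, vmid)
    subprocess.check_call(["qcow-create", str(size_mb), qcow, BASE_IMAGE])
    return qcow
```

Each guest config then just sets `disk = [ disk_line(vmname, vmid) ]` with its own vmname and vmid.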
- The dom0 will sooner or later crash with a message like this:

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/aio.c:511
invalid opcode: 0000 [1] SMP
CPU 0
Modules linked in: ipt_MASQUERADE iptable_nat ip_nat ip_conntrack nfnetlink ip_tables x_tables bridge dm_snapshot dm_mirror dm_mod usbhid ide_cd sers
Pid: 46, comm: kblockd/0 Not tainted 2.6.16.13-xen-kasuari-dom0 #1
RIP: e030:[<ffffffff8018f8ee>] <ffffffff8018f8ee>{__aio_put_req+39}
RSP: e02b:ffffffff803a89c8  EFLAGS: 00010086
RAX: 00000000ffffffff RBX: ffff8800f43d7a80 RCX: 00000000f3bdc000
RDX: 0000000000001458 RSI: ffff8800f43d7a80 RDI: ffff8800f62d1c80
RBP: ffff8800f62d1c80 R08: 6db6db6db6db6db7 R09: ffff88000193d000
R10: 0000000000000000 R11: ffffffff80153e48 R12: ffff8800f62d1ce8
R13: 0000000000000200 R14: 0000000000000000 R15: 0000000000000000
FS:  00002b9bf01bccb0(0000) GS:ffffffff80472000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process kblockd/0 (pid: 46, threadinfo ffff8800005e4000, task ffff8800005c57e0)
Stack: ffff8800f43d7a80 ffff8800f62d1c80 ffff8800f62d1ce8 ffffffff80190082
       ffff880004e83d10 ffff8800f4db7400 0000000000000200 ffff8800f4db7714
       ffff8800f4db7400 0000000000000001
Call Trace: <IRQ>
 <ffffffff80190082>{aio_complete+297} <ffffffff80195b0b>{finished_one_bio+159}
 <ffffffff80195be8>{dio_bio_complete+150} <ffffffff80195d24>{dio_bio_end_aio+32}
 <ffffffff801cf1b7>{__end_that_request_first+328} <ffffffff801d00ca>{blk_run_queue+50}
 <ffffffff8800524d>{:scsi_mod:scsi_end_request+40} <ffffffff880054fe>{:scsi_mod:scsi_io_completion+525}
 <ffffffff880741ce>{:sd_mod:sd_rw_intr+598} <ffffffff88005792>{:scsi_mod:scsi_device_unbusy+85}
 <ffffffff801d1534>{blk_done_softirq+175} <ffffffff80132544>{__do_softirq+122}
 <ffffffff8010bada>{call_softirq+30} <ffffffff8010d231>{do_softirq+73}
 <ffffffff8010d626>{do_IRQ+65} <ffffffff8023bf5a>{evtchn_do_upcall+134}
 <ffffffff801d8a66>{cfq_kick_queue+0} <ffffffff8010b60a>{do_hypervisor_callback+30} <EOI>
 <ffffffff801d8a66>{cfq_kick_queue+0} <ffffffff8010722a>{hypercall_page+554}
 <ffffffff8010722a>{hypercall_page+554} <ffffffff801dac97>{kobject_get+18}
 <ffffffff8023b7aa>{force_evtchn_callback+10} <ffffffff8800641d>{:scsi_mod:scsi_request_fn+935}
 <ffffffff801d8adc>{cfq_kick_queue+118} <ffffffff8013d3e6>{run_workqueue+148}
 <ffffffff8013db18>{worker_thread+0} <ffffffff80140abd>{keventd_create_kthread+0}
 <ffffffff8013dc08>{worker_thread+240} <ffffffff80125cdb>{default_wake_function+0}
 <ffffffff80140abd>{keventd_create_kthread+0} <ffffffff80140abd>{keventd_create_kthread+0}
 <ffffffff80140d61>{kthread+212} <ffffffff8010b85e>{child_rip+8}
 <ffffffff80140abd>{keventd_create_kthread+0} <ffffffff80140c8d>{kthread+0}
 <ffffffff8010b856>{child_rip+0}
Code: 0f 0b 68 c3 9b 2f 80 c2 ff 01 85 c0 74 07 31 c0 e9 09 01 00
RIP <ffffffff8018f8ee>{__aio_put_req+39} RSP <ffffffff803a89c8>
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!
(XEN) Domain 0 crashed: 'noreboot' set - not rebooting.

Is it just my setup, or:

- does Xen not scale at all to 100+ machines?
- does blktap not scale at all?
- is blktap with qcow very unstable right now?

Thank you for any pointers,

--
Christoph Dwertmann
cdwertmann at gmx dot de

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users