Re: [Xen-devel] Out sw-iommu space problem



On Wed, Sep 14, 2011 at 07:44:14AM -0700, Fantu wrote:
> We have a Squeeze dom0 with the official kernel (2.6.32-35squeeze2) and Xen
> 4.0.2-rc3 from git testing on a Dell PowerEdge T310 with an H200 RAID
> controller (all with the latest firmware).
> The domUs are a mix of Linux PV guests and Windows guests with GPLPV.
> We randomly run into this error during the xendomains stop command (which
> saves all domUs):

Whoa, that is impressive. Try booting your dom0 with swiotlb=128768
or some other large number.
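
For sizing: as far as I know, the swiotlb= value is a count of I/O TLB
slabs, not bytes or megabytes, with each slab being 2 KiB (IO_TLB_SHIFT = 11
in lib/swiotlb.c) and a default pool of 64 MiB. Treat both figures as
assumptions to check against your kernel source. Under them, a minimal
Python sketch (the helper names are made up for illustration) converts
between slab counts and sizes:

```python
# Assumption: one swiotlb slab is 2 KiB (IO_TLB_SHIFT = 11 in lib/swiotlb.c).
IO_TLB_SHIFT = 11
SLAB_BYTES = 1 << IO_TLB_SHIFT


def swiotlb_bytes(nslabs: int) -> int:
    """Bounce-buffer pool size in bytes for a given swiotlb=<nslabs> value."""
    return nslabs * SLAB_BYTES


def slabs_for_mib(mib: int) -> int:
    """swiotlb= value (slab count) needed for a pool of `mib` MiB."""
    return (mib << 20) // SLAB_BYTES


if __name__ == "__main__":
    for nslabs in (128, 32768, 128768):
        size = swiotlb_bytes(nslabs)
        print(f"swiotlb={nslabs}: {size} bytes ({size / (1 << 20):.2f} MiB)")
    print(f"slabs for 256 MiB: {slabs_for_mib(256)}")
```

If those assumptions hold, swiotlb=128768 is roughly 251 MiB, while
swiotlb=128 is only 256 KiB. That is smaller than the single 524288-byte
request in your log, which would explain why that setting did not help.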
> 
> Sep 14 16:18:40 heliMN02WV kernel: [  912.336945] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:18:40 heliMN02WV kernel: [  912.336951] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:18:41 heliMN02WV kernel: [  912.400331] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:18:41 heliMN02WV kernel: [  912.400336] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:18:45 heliMN02WV kernel: [  917.187524] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:18:45 heliMN02WV kernel: [  917.187533] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:18:51 heliMN02WV kernel: [  922.454599] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:18:51 heliMN02WV kernel: [  922.454608] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:18:51 heliMN02WV kernel: [  922.454694] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:18:51 heliMN02WV kernel: [  922.454697] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:19:11 heliMN02WV kernel: [  942.850048] frontend_changed:
> backend/vbd/4/768: prepare for reconnect
> Sep 14 16:19:11 heliMN02WV kernel: [  942.869244] eth0: port 5(vif4.0)
> entering disabled state
> Sep 14 16:19:11 heliMN02WV kernel: [  942.889192] eth0: port 5(vif4.0)
> entering disabled state
> Sep 14 16:19:11 heliMN02WV kernel: [  943.188048] frontend_changed:
> backend/vif/4/0: prepare for reconnect
> Sep 14 16:19:22 heliMN02WV kernel: [  954.246090] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:19:22 heliMN02WV kernel: [  954.246095] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:19:22 heliMN02WV kernel: [  954.305068] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:19:22 heliMN02WV kernel: [  954.305074] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:19:34 heliMN02WV kernel: [  966.112058] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:19:34 heliMN02WV kernel: [  966.112064] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:19:34 heliMN02WV kernel: [  966.112251] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:19:34 heliMN02WV kernel: [  966.112255] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:19:34 heliMN02WV kernel: [  966.112440] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:19:34 heliMN02WV kernel: [  966.112443] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:19:34 heliMN02WV kernel: [  966.205690] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:19:34 heliMN02WV kernel: [  966.205693] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:19:40 heliMN02WV kernel: [  971.728913] eth0: port 6(vif5.0)
> entering disabled state
> Sep 14 16:19:40 heliMN02WV kernel: [  971.752683] eth0: port 6(vif5.0)
> entering disabled state
> Sep 14 16:19:45 heliMN02WV kernel: [  976.984329] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:19:45 heliMN02WV kernel: [  976.984333] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:19:49 heliMN02WV kernel: [  981.288632] eth0: port 7(vif6.0)
> entering disabled state
> Sep 14 16:19:49 heliMN02WV kernel: [  981.304521] eth0: port 7(vif6.0)
> entering disabled state
> Sep 14 16:19:50 heliMN02WV kernel: [  982.329740] frontend_changed:
> backend/vbd/7/768: prepare for reconnect
> Sep 14 16:19:50 heliMN02WV kernel: [  982.372593] eth0: port 8(vif7.0)
> entering disabled state
> Sep 14 16:19:51 heliMN02WV kernel: [  982.416506] eth0: port 8(vif7.0)
> entering disabled state
> Sep 14 16:19:51 heliMN02WV kernel: [  982.744206] frontend_changed:
> backend/vif/7/0: prepare for reconnect
> Sep 14 16:20:00 heliMN02WV kernel: [  991.520780] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:20:00 heliMN02WV kernel: [  991.520787] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:20:00 heliMN02WV kernel: [  991.524695] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:20:00 heliMN02WV kernel: [  991.524698] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:20:00 heliMN02WV kernel: [  991.525040] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:20:00 heliMN02WV kernel: [  991.525042] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:20:00 heliMN02WV kernel: [  991.525371] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:20:00 heliMN02WV kernel: [  991.525374] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:20:00 heliMN02WV kernel: [  991.527766] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:20:00 heliMN02WV kernel: [  991.527769] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:20:01 heliMN02WV kernel: [  992.493163] mpt2sas 0000:03:00.0: DMA:
> Out of SW-IOMMU space for 65536 bytes.
> Sep 14 16:20:01 heliMN02WV kernel: [  992.493167] sd 0:1:0:0: pci_map_sg
> failed: request for 524288 bytes!
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.938378] tapdisk2[7617]: segfault
> at 7fff92fb6ff8 ip 0000000000408296 sp 00007fff92fb7000 error 6 in
> tapdisk2[400000+39000]
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.959533] BUG: unable to handle
> kernel NULL pointer dereference at 0000000000000048
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.959681] IP: [<ffffffff810ce79e>]
> apply_to_page_range+0x47/0x2f3
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.959773] PGD 3dc5f067 PUD 3db57067
> PMD 0 
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.959914] Oops: 0000 [#1] SMP 
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.960026] last sysfs file:
> /sys/devices/virtual/blktap2/blktap11/remove
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.960084] CPU 5 
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.960161] Modules linked in:
> xt_tcpudp tun xt_physdev iptable_filter ip_tables x_tables bridge stp
> ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp
> libiscsi_tcp libiscsi scsi_transport_iscsi ext2 sha256_generic aes_x86_64
> aes_generic cbc blktap xen_evtchn xenfs loop dm_crypt dcdbas snd_pcm
> snd_timer snd joydev evdev soundcore snd_page_alloc pcspkr power_meter
> button processor acpi_processor ext4 mbcache jbd2 crc16 dm_mod sd_mod
> crc_t10dif sg usbhid hid sr_mod cdrom usb_storage ata_generic ehci_hcd
> ata_piix mpt2sas bnx2 scsi_transport_sas libata usbcore nls_base scsi_mod
> thermal thermal_sys [last unloaded: scsi_wait_scan]
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.962493] Pid: 7617, comm: tapdisk2
> Not tainted 2.6.32-5-xen-amd64 #1 PowerEdge T310
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.962566] RIP:
> e030:[<ffffffff810ce79e>]  [<ffffffff810ce79e>]
> apply_to_page_range+0x47/0x2f3
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.962671] RSP: e02b:ffff88003dfc9b58 
> EFLAGS: 00010202
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.962726] RAX: 0000000000000880 RBX:
> ffff88003d8ad000 RCX: ffff88003d8ae000
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.962783] RDX: 0000000000000000 RSI:
> ffff88003d8ad000 RDI: 0000000000000000
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.962840] RBP: ffff88003eff2dd0 R08:
> 0000000000000000 R09: ffff88003f96c180
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.962898] R10: 0000000000000002 R11:
> 0000000000000000 R12: 0000000000000000
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.962955] R13: ffff88003eff2dd0 R14:
> ffff88003e149000 R15: 0000000000000000
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.963014] FS: 
> 00007f6d3c738740(0000) GS:ffff880003782000(0000) knlGS:0000000000000000
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.963088] CS:  e033 DS: 0000 ES:
> 0000 CR0: 000000008005003b
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.963143] CR2: 0000000000000048 CR3:
> 000000003f0dc000 CR4: 0000000000002660
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.963201] DR0: 0000000000000000 DR1:
> 0000000000000000 DR2: 0000000000000000
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.963258] DR3: 0000000000000000 DR6:
> 00000000ffff0ff0 DR7: 0000000000000400
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.963316] Process tapdisk2 (pid:
> 7617, threadinfo ffff88003dfc8000, task ffff880036a88e20)
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.963389] Stack:
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.963437]  0000000000000000
> ffff88003ea87b40 0000000000000000 0000000000000000
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.963583] <0> ffffffffa02f1ee8
> 0000000000000000 ffffffff8100ece2 ffff880002155480
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.963802] <0> ffff88003d8ae000
> 0000000000000000 0000000000000000 ffff880002155480
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964066] Call Trace:
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964119]  [<ffffffffa02f1ee8>] ?
> blktap_umap_uaddr_fn+0x0/0x59 [blktap]
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964179]  [<ffffffff8100ece2>] ?
> check_events+0x12/0x20
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964236]  [<ffffffffa02f32a5>] ?
> blktap_device_end_request+0xbd/0x145 [blktap]
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964310]  [<ffffffffa02f1743>] ?
> blktap_ring_vm_close+0x60/0xd1 [blktap]
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964368]  [<ffffffff810d13f8>] ?
> remove_vma+0x2c/0x72
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964423]  [<ffffffff810d1567>] ?
> exit_mmap+0x129/0x148
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964479]  [<ffffffff8104cc5d>] ?
> mmput+0x3c/0xdf
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964534]  [<ffffffff81050862>] ?
> exit_mm+0x102/0x10d
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964592]  [<ffffffff8130d0d2>] ?
> _spin_lock_irq+0x7/0x22
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964648]  [<ffffffff81052287>] ?
> do_exit+0x1f8/0x6c6
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964703]  [<ffffffff8105d5a1>] ?
> __dequeue_signal+0xfb/0x124
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964760]  [<ffffffff8100eccf>] ?
> xen_restore_fl_direct_end+0x0/0x1
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964817]  [<ffffffff810e7f35>] ?
> kmem_cache_free+0x72/0xa3
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964874]  [<ffffffff810527cb>] ?
> do_group_exit+0x76/0x9d
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964930]  [<ffffffff8105f0b7>] ?
> get_signal_to_deliver+0x310/0x339
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.964987]  [<ffffffff8101104f>] ?
> do_notify_resume+0x87/0x73f
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.965044]  [<ffffffff810d15e1>] ?
> expand_downwards+0x5b/0x169
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.965101]  [<ffffffff8130f589>] ?
> do_page_fault+0x1f3/0x2f2
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.965157]  [<ffffffff810125dc>] ?
> retint_signal+0x48/0x8c
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.965211] Code: 48 89 4c 24 20 4c 89
> 44 24 18 48 89 54 24 40 72 04 0f 0b eb fe 48 8b 54 24 28 48 89 f0 48 8b 4c
> 24 40 48 c1 e8 24 25 f8 0f 00 00 <48> 8b 52 48 48 ff c9 48 89 0c 24 48 01 d0
> 48 89 44 24 30 48 b8 
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.967243] RIP  [<ffffffff810ce79e>]
> apply_to_page_range+0x47/0x2f3
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.967336]  RSP <ffff88003dfc9b58>
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.967388] CR2: 0000000000000048
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.967440] ---[ end trace
> 78b5f16c10850a91 ]---
> Sep 14 16:20:13 heliMN02WV kernel: [ 1004.967495] Fixing recursive fault but
> reboot is needed!
> 
> 
> Rebooting the systems doesn't resolve this problem.
> We have also tried adding swiotlb=128 to the vmlinuz line, but the systems
> still loop with the "Out of SW-IOMMU space" message (possibly also a bug in
> the kernel's swiotlb option).
> Can someone help us solve this problem, please?
> 
> 
> --
> View this message in context: 
> http://xen.1045712.n5.nabble.com/Out-sw-iommu-space-problem-tp4803078p4803078.html
> Sent from the Xen - Dev mailing list archive at Nabble.com.
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel



 

