Re: [Xen-devel] bnx2x DMA mapping errors cause iscsi problems
Hi Patrick,

Sorry, this email won't help resolve your issue right away, but I want to
highlight a design of mine that should help resolve this problem in the long
run.

On 23/04/14 09:07, Patrick Vranckx wrote:
> Hi,
>
> We are running open source Xen 4.1.4 on Debian 7.4 amd64
> HW is HP BL460c Gen8. Nic is Broadcom Corporation NetXtreme II BCM57810
> 10 Gigabit Ethernet (rev 10)
>
> We are experiencing sporadic network blackouts after a few days on eth4
> which is used for iscsi block storage for the VMs. VMs file systems are
> switching to read-only and so we lose all the VMs. Then we have to
> reboot the hypervisor to regain network connectivity.
>
> MTU = 9000 on eth4.
>
> We were using Broadcom kernel driver from Debian 7.4 official kernel
> (3.2.54-2). Now we've updated with the latest driver published on
> Broadcom website, we have some more logging :
>
> [1200406.207855] [bnx2x_alloc_rx_data:1009(eth4)]Can't map rx data
> [1200406.207978] [bnx2x_alloc_rx_data:1009(eth4)]Can't map rx data
> .....
>

This is exactly the issue that the linked design is trying to address:
http://lists.xen.org/archives/html/xen-devel/2014-04/msg01632.html

> Here are bnx2x module versions we tried :
> Debian 7.4 stock kernel : 1.70.30
> Broadcom website (latest) : 1.78.58
> Broadcom Firmware : Latest from HP BROADCOM 2.9.26 CP021537 package
>
> Looking at bnx2x source code (bnx2x_cmn.c), it appears this error is
> caused by a DMA mapping error for rx buffers (memory leak ?)
>
> static int bnx2x_alloc_rx_data(struct bnx2x *bp, struct bnx2x_fastpath *fp,
>                                u16 index, gfp_t gfp_mask)
> {
> ....
>     mapping = dma_map_single(&bp->pdev->dev, data + NET_SKB_PAD,
>                              fp->rx_buf_size,
>                              DMA_FROM_DEVICE);
>
>     if (unlikely(dma_mapping_error(&bp->pdev->dev, mapping))) {
>
> #ifdef BCM_HAS_BUILD_SKB /* BNX2X_UPSTREAM */
>         bnx2x_frag_free(fp, data);
> #else
>         dev_kfree_skb_any(data);
> #endif
>         BNX2X_ERR("Can't map rx data\n");
>         return -ENOMEM;
>     }
> ...
>
> We found several other references of people suffering from the same
> problem.
> Here are two threads concerning Citrix XenServer 6 showing the exact
> same problem on BL460C G6 and Gen8
> http://discussions.citrix.com/topic/324343-xenserver-61-bnx2x-sw-iommu/
> http://discussions.citrix.com/topic/333281-xenserver-62-crash-bug/page-3
>
> It seems from other references that most of the time, similar problems
> occurring with this driver are related to virtualized environments.
>
> We found a rather old workaround from VMWare. The solution is to reduce
> the number of queues used by the driver (num_queues parameter).
> Unfortunately, the problem still occurs but after a longer period.

To work around this issue, you can try reducing the number of queues,
increasing the swiotlb size (as Jan suggested), and adding "disable_tpa=1"
to the bnx2x module parameters. These settings may reduce network
performance.

>
> There are threads in this mailing list related to DMA allocation in Xen
> ( http://markmail.org/message/uududlw5w6xlqcp2 ) but I'm not able to
> understand if those threads are related to our problem.
>
> Thanks for your help,
>
> Patrick
>

Malcolm

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel