Re: [Xen-devel] bnx2x DMA mapping errors cause iscsi problems
Hi Malcolm,

Thank you for your answer! We have already tried tuning several parameters to find a workaround for our problem:

- swiotlb size: we did not increase the swiotlb size. We found a similar case on the Citrix forums ( http://discussions.citrix.com/topic/324343-xenserver-61-bnx2x-sw-iommu/ ) where swiotlb=256 did not help, so we did not try it ourselves. Unfortunately, that thread does not mention a solution, only a patch for the bnx2x driver (Driver Disk for Broadcom bnx2x driver v1.74.22 for XenServer 6.1.0 with Hotfix XS61E018); I still have to verify whether it is related to our problem. I should mention that we see no "Out of SW-IOMMU space" error messages, but that may be due to the verbosity of the driver or the kernel. (A sketch of how the swiotlb size could be raised is appended after this message.)

- disable_tpa=1: this is already the case, since we disabled LRO (correct?). Here is the output of ethtool (the module-parameter and runtime equivalents are sketched after this message):

    root@xen2-pyth:~# ethtool -k eth4
    Features for eth4:
    rx-checksumming: on
    tx-checksumming: on
            tx-checksum-ipv4: on
            tx-checksum-unneeded: off [fixed]
            tx-checksum-ip-generic: off [fixed]
            tx-checksum-ipv6: on
            tx-checksum-fcoe-crc: off [fixed]
            tx-checksum-sctp: off [fixed]
    scatter-gather: on
            tx-scatter-gather: on
            tx-scatter-gather-fraglist: off [fixed]
    tcp-segmentation-offload: on
            tx-tcp-segmentation: on
            tx-tcp-ecn-segmentation: on
            tx-tcp6-segmentation: on
    udp-fragmentation-offload: off [fixed]
    generic-segmentation-offload: on
    generic-receive-offload: on
    large-receive-offload: off
    rx-vlan-offload: on [fixed]
    tx-vlan-offload: on
    ntuple-filters: off [fixed]
    receive-hashing: on
    highdma: on [fixed]
    rx-vlan-filter: off [fixed]
    vlan-challenged: off [fixed]
    tx-lockless: off [fixed]
    netns-local: off [fixed]
    tx-gso-robust: off [fixed]
    tx-fcoe-segmentation: off [fixed]
    fcoe-mtu: off [fixed]
    tx-nocache-copy: on
    loopback: off

- reducing the queues: we reduced the number of queues to 4 (the default was 11). When the problem happened again this week, we changed the parameter dynamically to num_queues=1 and were then able to carry on without rebooting the hypervisor. No more "Can't map rx data" messages so far... but for how long? Could setting the number of queues as low as 1 have a long-term effect? (How we set this is sketched after this message.)

I have read the draft you wrote to solve the problem. As far as I understand it (this is very complex for me), it could be the root cause of our problem. But how can we monitor the relevant parameters (DMA mappings, SW-IOMMU space, ...) while the problem is occurring, so that we can validate this assumption? (A rough log-watching loop is appended after this message.) BTW, what is the time frame for implementing the solution proposed in your draft? We run Xen 4.1.4: are there improvements related to this problem in newer versions?

Regards,
Patrick
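P.S. Below are rough sketches of the knobs discussed above, in case they help someone else hitting the same issue. They are illustrative, not copied from our machines.

Raising the swiotlb size means adding swiotlb=<n> to the dom0 kernel line of the boot entry. In mainline Linux, <n> counts 2 KB slabs, so the default 32768 corresponds to 64 MB and 65536 to 128 MB (units may differ in other kernels, so the swiotlb=256 from the Citrix thread is not necessarily comparable). A minimal sketch, assuming GRUB2 and hypothetical image names:

    multiboot /boot/xen-4.1.4.gz dom0_mem=2048M
    module /boot/vmlinuz-3.2.0-4-amd64 root=/dev/sda1 ro swiotlb=65536
    module /boot/initrd.img-3.2.0-4-amd64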
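Disabling TPA can be done persistently via the bnx2x module parameter, or at runtime per interface by turning off LRO (the latter is what we did); both are standard bnx2x/ethtool knobs:

    # /etc/modprobe.d/bnx2x.conf -- persistent across module reloads
    options bnx2x disable_tpa=1

    # runtime, per interface
    ethtool -K eth4 lro off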
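The queue reduction looks like this; the modprobe line is the persistent form, while the runtime line assumes a driver recent enough to support the ethtool channels API (I am not sure ours is; otherwise the module has to be reloaded with the new parameter):

    # /etc/modprobe.d/bnx2x.conf -- takes effect on the next module load
    options bnx2x num_queues=1

    # runtime alternative on newer drivers (hypothetical for our version)
    ethtool -L eth4 combined 1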
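As for monitoring while the problem is occurring, the best we have come up with is watching the dom0 and hypervisor logs; the grep patterns below are guesses at the relevant messages, not a confirmed list:

    # dom0 kernel log: swiotlb exhaustion and driver mapping failures
    watch -n 5 'dmesg | grep -iE "swiotlb|SW-IOMMU|map (rx|tx) data" | tail -n 20'

    # hypervisor log
    xl dmesg | tail -n 20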
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel