[Xen-devel] Network drop to domU (netfront: rx->offset: 0, size: 4294967295)
Hello,

I am the administrator of a fairly big Xen environment and I have run into a bug. At random points some domU's lose their network connection for about 1 to 2 minutes, and in their kernel log the following comes up:

  [548994.957487] printk: 56 messages suppressed.
  [548994.957508] netfront: rx->offset: 0, size: 4294967295
  [548994.957511] netfront: rx->offset: 0, size: 4294967295

The dom0 specs:

- 2x Intel(R) Xeon(R) CPU E5420
- 64GB DDR2 FB-DIMM
- 2x Intel 80003ES2LAN
- SuperMicro X7DB8 mainboard
- Areca ARC-1680ix-16 RAID controller

This is an Ubuntu 8.04.2 system with Xen 3.2.1-rc1-pre installed from the Ubuntu repositories. The kernel is a customized one (2.6.24-24-xen) based on the Ubuntu source; NR_DYNIRQS has been raised from 256 to 1024 to support more domU's (the exact change is in the second note at the end of this mail). At the moment this server is hosting about 110 domU's.

In my "xm dmesg" I get the following message:

  (XEN) grant_table.c:1262:d0 Bad flags (0) or dom (0). (expected dom 0)

It is reported about 1000 times over a few days.

I have two of these machines running. They are identical in both software and hardware; the only difference is that one server hosts 110 domU's and the other hosts about 20. This behaviour is only seen on the machine hosting the 110 domU's.

At first I thought this had something to do with my Intel NIC, but when a domU becomes unavailable the dom0 itself is still reachable, so it seems to go wrong somewhere inside netfront. (That is what Google told me, at least.)

One of the tests I did was disabling TSO and the RX and TX checksum offloads with ethtool in both the dom0 and the domU, but this did not have any effect; the messages keep coming.

To me this issue seems related to the large number of domU's running on this system, especially since the other, identical machine is not affected.

I took the kernel source and started looking at where the netfront message is printed, and it looks like some kind of memory/buffer allocation issue? (See the P.S. below for my reading of the size value.) I have found some old messages with patches, but those did not apply to my current source. Since this is a running production system I can schedule a reboot for a new kernel, but that takes some time to arrange.

--
With kind regards,

Wido den Hollander
Head of System Administration / CSO
Phone support Netherlands: 0900 9633 (45 cpm)
Phone support Belgium: 0900 70312 (45 cpm)
Phone direct: (+31) (0)20 50 60 104
Fax: +31 (0)20 50 60 111
E-mail: support@xxxxxxxxxxxx
Website: http://www.pcextreme.nl
Knowledge base: http://support.pcextreme.nl/
Network status: http://nmc.pcextreme.nl
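
P.S. One thing I noticed while digging: the reported size of 4294967295 is 0xFFFFFFFF, i.e. -1 shown as an unsigned 32-bit number. If I read the Xen netif interface header correctly, a status of -1 on an RX response means NETIF_RSP_ERROR, so it would be netback in dom0 flagging those responses as errors rather than netfront miscalculating a length. The small stand-alone program below only illustrates that interpretation and mimics the kind of sanity check I believe netfront does; it is a simplified model with made-up struct and file names, not the actual kernel code:

    /* model_netfront_check.c - user-space illustration only, NOT kernel code.
     * Field names and the check are paraphrased from my reading of the
     * 2.6.24-xen netfront source; verify against your own tree. */
    #include <stdio.h>
    #include <stdint.h>

    #define PAGE_SIZE        4096
    #define NETIF_RSP_ERROR  (-1)   /* error status in Xen's netif interface */

    struct rx_response {            /* stripped-down stand-in for netif_rx_response_t */
        uint16_t offset;
        int16_t  status;            /* length on success, negative on error */
    };

    int main(void)
    {
        struct rx_response rx = { .offset = 0, .status = NETIF_RSP_ERROR };

        /* A negative status (or a length running past the end of the page)
         * makes netfront complain and drop the RX slot. */
        if (rx.status < 0 || rx.offset + rx.status > PAGE_SIZE)
            printf("netfront: rx->offset: %x, size: %u\n",
                   (unsigned)rx.offset, (uint32_t)rx.status);

        /* -1 viewed as unsigned 32-bit is exactly the value from the log. */
        printf("(uint32_t)-1 = %u\n", (uint32_t)-1);
        return 0;
    }

If that reading is right, the real question is why netback starts returning errors under load, which would also fit with the grant_table.c messages above and with the fact that only the busy machine is affected.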
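
P.P.S. For completeness, the NR_DYNIRQS change mentioned above is a one-line edit in the dynamic-IRQ header of the Xen kernel patches. I am quoting the location from memory, so check it against your own tree:

    /* include/asm-x86/mach-xen/irq_vectors.h -- path from memory, may differ per tree */
    #define NR_DYNIRQS 1024   /* stock value is 256; each backend event channel in dom0 needs a dynamic IRQ */

With 110 domU's, each having at least one vif and one vbd, the stock 256 was not enough for us.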