[Xen-users] Domain Crash and Xend can't restart
I have a single VM (of 11) that has a recurring problem. This image has moved from machine to machine, with the problem following it. The image has also been rebuilt from scratch, and the problem recurred. Something in the behaviour of this VM appears to make it crash and leave Xend in a broken state.

The problem presents as:

    Domain crashes, becomes zombie.
    xm destroy will not destroy the zombie.
    xm create will not start it or any other domain (Hotplug Scripts not working).

The only solution appears to be a reboot of the host machine. Stopping and restarting xend/xendomains does not solve the problem.

This particular VM is our continuous build system. It builds code pretty much all day long and does very heavy NFS ops.

The host machine is using Xen 3.0.2 running on FC5 2.6.17-1.2174_FC5xen0 (using the yum packages). The guest OS is FC4 with 2.6.17-1.2174_FC5xenU.

The problem only presents itself on this VM. It is actually an identical copy of the other 11 VMs, all of which are development boxes using NFS. The issue appears to occur only because of the volume of work the problem image does.

One of the bits of help I need is knowing where to get the information necessary to solve the problem. I've attached the part of xend.log that covers the crash and the subsequent failed restarts.

# xm info
host                   : pdev0
release                : 2.6.17-1.2174_FC5xen0
version                : #1 SMP Tue Aug 8 16:26:11 EDT 2006
machine                : x86_64
nr_cpus                : 2
nr_nodes               : 1
sockets_per_node       : 2
cores_per_socket       : 1
threads_per_core       : 1
cpu_mhz                : 2390
hw_caps                : 00000000:00000000:078bfbff:e3d3fbff:00000000:00000010:00000001
total_memory           : 8128
free_memory            : 1413
xen_major              : 3
xen_minor              : 0
xen_extra              : -unstable
xen_caps               : xen-3.0-x86_64
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
cc_compiler            : gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)
cc_compile_by          : brewbuilder
cc_compile_domain      : build.redhat.com
cc_compile_date        : Tue Aug 8 15:25:03 EDT 2006

# cat /proc/cpuinfo
processor        : 0
vendor_id        : AuthenticAMD
cpu family       : 15
model            : 37
model name       : AMD Opteron(tm) Processor 250
stepping         : 1
cpu MHz          : 2390.648
cache size       : 1024 KB
fpu              : yes
fpu_exception    : yes
cpuid level      : 1
wp               : yes
flags            : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm
bogomips         : 5978.35
TLB size         : 1024 4K pages
clflush size     : 64
cache_alignment  : 64
address sizes    : 40 bits physical, 48 bits virtual
power management : ts fid vid ttp

processor        : 1
vendor_id        : AuthenticAMD
cpu family       : 15
model            : 37
model name       : AMD Opteron(tm) Processor 250
stepping         : 1
cpu MHz          : 2390.648
cache size       : 1024 KB
fpu              : yes
fpu_exception    : yes
cpuid level      : 1
wp               : yes
flags            : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm
bogomips         : 5978.35
TLB size         : 1024 4K pages
clflush size     : 64
cache_alignment  : 64
address sizes    : 40 bits physical, 48 bits virtual
power management : ts fid vid ttp

(The machine has been bounced since, because I had to get the image back in service, so I don't know how useful this will be.)

# xm dmesg
 __  __            _____  ___                     _        _     _
 \ \/ /___ _ __   |___ / / _ \    _   _ _ __  ___| |_ __ _| |__ | | ___
  \  // _ \ '_ \    |_ \| | | |__| | | | '_ \/ __| __/ _` | '_ \| |/ _ \
  /  \  __/ | | |  ___) | |_| |__| |_| | | | \__ \ || (_| | |_) | |  __/
 /_/\_\___|_| |_| |____(_)___/    \__,_|_| |_|___/\__\__,_|_.__/|_|\___|

 http://www.cl.cam.ac.uk/netos/xen
 University of Cambridge Computer Laboratory

 Xen version 3.0-unstable (brewbuilder@xxxxxxxxxxxxxxxx) (gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)) Tue Aug 8 15:25:03 EDT 2006
 Latest ChangeSet: unavailable

(XEN) Command line: /boot/xen.gz-2.6.17-1.2174_FC5
(XEN) Physical RAM map:
(XEN)  0000000000000000 - 000000000009a000 (usable)
(XEN)  000000000009a000 - 00000000000a0000 (reserved)
(XEN)  00000000000d0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 00000000fbf70000 (usable)
(XEN)  00000000fbf70000 - 00000000fbf77000 (ACPI data)
(XEN)  00000000fbf77000 - 00000000fbf80000 (ACPI NVS)
(XEN)  00000000fbf80000 - 00000000fc000000 (reserved)
(XEN)  00000000fec00000 - 00000000fec00400 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000fff80000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000000200000000 (usable)
(XEN) System RAM: 8127MB (8322088kB)
(XEN) Xen heap: 13MB (14020kB)
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) found SMP MP-table at 000f7de0
(XEN) DMI present.
(XEN) Using APIC driver default
(XEN) ACPI: RSDP (v002 PTLTD ) @ 0x00000000000f7db0
(XEN) ACPI: XSDT (v001 PTLTD XSDT 0x06040000 LTP 0x00000000) @ 0x00000000fbf74bd4
(XEN) ACPI: FADT (v003 SUN V20z 0x06040000 PTEC 0x000f4240) @ 0x00000000fbf76c0c
(XEN) ACPI: HPET (v001 Sun V20z 0x06040000 PTEC 0x00000000) @ 0x00000000fbf76d00
(XEN) ACPI: MADT (v001 PTLTD APIC 0x06040000 LTP 0x00000000) @ 0x00000000fbf76d38
(XEN) ACPI: SPCR (v001 PTLTD $UCRTBL$ 0x06040000 PTL 0x00000001) @ 0x00000000fbf76dae
(XEN) ACPI: SSDT (v001 SUN V20z 0x06040000 LTP 0x00000001) @ 0x00000000fbf76dfe
(XEN) ACPI: SSDT (v001 SUN V20z 0x06040000 LTP 0x00000001) @ 0x00000000fbf76e9b
(XEN) ACPI: SRAT (v001 SUN V20z 0x06040000 SUN 0x00000001) @ 0x00000000fbf76f38
(XEN) ACPI: DSDT (v001 Sun V20z 0x06040000 MSFT 0x0100000e) @ 0x0000000000000000
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
(XEN) Processor #0 15:5 APIC version 16
(XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
(XEN) Processor #1 15:5 APIC version 16
(XEN) ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
(XEN) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
(XEN) ACPI: IOAPIC (id[0x03] address[0xfd000000] gsi_base[24])
(XEN) IOAPIC[1]: apic_id 3, version 17, address 0xfd000000, GSI 24-27
(XEN) ACPI: IOAPIC (id[0x04] address[0xfd001000] gsi_base[28])
(XEN) IOAPIC[2]: apic_id 4, version 17, address 0xfd001000, GSI 28-31
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) Enabling APIC mode: Flat. Using 3 I/O APICs
(XEN) ACPI: HPET id: 0x102282a0 base: 0xfed00000
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) Initializing CPU#0
(XEN) Detected 2390.648 MHz processor.
(XEN) CPU0: AMD Flush Filter disabled
(XEN) CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
(XEN) CPU: L2 Cache: 1024K (64 bytes/line)
(XEN) Intel machine check architecture supported.
(XEN) Intel machine check reporting enabled on CPU#0.
(XEN) CPU0: AMD Opteron(tm) Processor 250 stepping 01
(XEN) Booting processor 1/1 eip 90000
(XEN) Initializing CPU#1
(XEN) CPU1: AMD Flush Filter disabled
(XEN) CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
(XEN) CPU: L2 Cache: 1024K (64 bytes/line)
(XEN) AMD: Disabling C1 Clock Ramping Node #0
(XEN) AMD: Disabling C1 Clock Ramping Node #1
(XEN) Intel machine check architecture supported.
(XEN) Intel machine check reporting enabled on CPU#1.
(XEN) CPU1: AMD Opteron(tm) Processor 250 stepping 01
(XEN) Total of 2 processors activated.
(XEN) ENABLING IO-APIC IRQs
(XEN) -> Using new ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=0 pin2=0
(XEN) checking TSC synchronization across 2 CPUs: passed.
(XEN) Platform timer is 14.318MHz HPET
(XEN) Brought up 2 CPUs
(XEN) Machine check exception polling timer started.
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Domain 0 kernel supports features = { 0000000f }.
(XEN) Domain 0 kernel requires features = { 00000000 }.
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.: 000000000e000000->0000000010000000 (2010971 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff80200000->ffffffff80619108
(XEN)  Init. ramdisk: ffffffff8061a000->ffffffff808db000
(XEN)  Phys-Mach map: ffffffff808db000->ffffffff81842ad8
(XEN)  Start info: ffffffff81843000->ffffffff81844000
(XEN)  Page tables: ffffffff81844000->ffffffff81855000
(XEN)  Boot stack: ffffffff81855000->ffffffff81856000
(XEN)  TOTAL: ffffffff80000000->ffffffff81c00000
(XEN)  ENTRY ADDRESS: ffffffff80200000
(XEN) Dom0 has maximum 2 VCPUs
(XEN) Initrd len 0x2c1000, start at 0xffffffff8061a000
(XEN) Scrubbing Free RAM: ..................................................................................done.
(XEN) Xen trace buffers: disabled
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen).

---
Any help would be appreciated.

Attachment:
xend.log

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users