
Re: [Xen-users] Some kind of hardware issue



I have my answer. The hosting company has stated that both disks are
defective. Every time I've cried hardware failure before, it has
turned out to be something else, and having both disks fail at once
seemed unlikely, but according to the hosting company I have two
defective disks.
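
For anyone finding this thread later, here is a minimal Python sketch
for pulling the attribute table out of `smartctl -A` output and flagging
rows that look worrying. It is not what the hosting company ran; the
function names and the WATCHED_IDS shortlist are assumptions made for
illustration. Note that the smartctl output quoted below would not be
flagged by this check, so a clean result is not proof of a healthy disk.

```python
import re

# Attribute IDs often watched on failing disks. This shortlist is an
# assumption for illustration, not an official smartmontools list.
WATCHED_IDS = {5, 187, 197, 198, 199}

# One row of the attribute table printed by `smartctl -A`:
# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
ROW_RE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def parse_attributes(text):
    """Return {id: (name, value, thresh, when_failed, raw)} per table row."""
    rows = {}
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if m:
            attr_id, name, value, _worst, thresh, _type, _upd, failed, raw = m.groups()
            rows[int(attr_id)] = (name, int(value), int(thresh), failed, int(raw))
    return rows


def suspicious(rows):
    """Return the names of attributes that look worrying, sorted by ID."""
    flagged = []
    for attr_id, (name, value, thresh, failed, raw) in sorted(rows.items()):
        below_threshold = thresh > 0 and value <= thresh
        watched_nonzero = attr_id in WATCHED_IDS and raw > 0
        if failed != "-" or below_threshold or watched_nonzero:
            flagged.append(name)
    return flagged


# Three rows copied from the smartctl output quoted in this thread.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       461965997
197 Total_Pending_Sectors   0x0012   253   253   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    print(suspicious(parse_attributes(SAMPLE)))
```

To use it on a live system, save the output of `smartctl -A /dev/sda`
to a file and pass its contents to parse_attributes(). Since the
attributes here looked clean, an extended self-test (`smartctl -t long`)
that is allowed to run to completion is probably the stronger check;
the self-test log quoted below shows the last three attempts were
interrupted by host resets.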

Thanks for the help all.

Regards,

Daniel

On Sat, Mar 2, 2013 at 12:51 PM, Tony Lill <ajlill@xxxxxxxxxxxxxxxxxxx> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> I ran into strange disk issues when I tried to upgrade to the pvops
> kernels on two of my AMD motherboards. On one of them, all I had to
> do was try to copy a large file and I'd get a slew of disk errors. I
> went back to xen-3.x and a xenified kernel (from openSUSE) and all was
> well.
>
> See http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1806
>
> On 03/01/2013 05:55 PM, Daniel Hood wrote:
>> Hi all,
>>
>> First time posting.
>>
>> I bought a dedicated box from a hosting company: an Opteron 1218
>> on an MSI motherboard (I'm not sure which model exactly; lspci
>> output is below). At different stages of this issue I've tried
>> installing Debian 6, Ubuntu 12.04 and CentOS 6, and then installing
>> the Xen 4 hypervisor on top. All three boot their normal kernels
>> perfectly, and I can't find any related errors there.
>>
>> I then try to boot into my Xen kernel and these are the errors I'm
>> getting: http://i.imgur.com/LHq7KCH.png
>> http://i.imgur.com/fEfnm0I.png
>>
>> I've tried booting back into the normal kernels and everything
>> works. I've tried adding 'noacpi', 'acpi=off' and
>> 'libata.force=noncq' on both the kernel and the module lines. I have
>> no idea what else to try. Any ideas, anyone?
>>
>> Here are the outputs from the CentOS attempts:
>>
>> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> [root@virt-host01 init.d]# cat /boot/grub/grub.conf
>> #
>> # Hetzner Online AG - installimage
>> # GRUB bootloader configuration file
>> #
>>
>> timeout 5
>> default 0
>>
>> title CentOS (3.7.10-1.el6xen.x86_64)
>> root (hd0,1)
>> kernel /xen.gz dom0_mem=1024M cpufreq=xen dom0_max_vcpus=1 dom0_vcpus_pin
>> module /boot/vmlinuz-3.7.10-1.el6xen.x86_64 ro root=/dev/md1 rd_NO_LUKS rd_NO_DM nomodeset SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=de
>> module /boot/initramfs-3.7.10-1.el6xen.x86_64.img
>>
>> title CentOS (3.7.10-1.el6xen.x86_64)
>> root (hd0,1)
>> kernel /boot/vmlinuz-3.7.10-1.el6xen.x86_64 ro root=/dev/md1 rd_NO_LUKS rd_NO_DM nomodeset SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=de
>> initrd /boot/initramfs-3.7.10-1.el6xen.x86_64.img
>>
>> title CentOS (2.6.32-279.22.1.el6.x86_64)
>> root (hd0,1)
>> kernel /boot/vmlinuz-2.6.32-279.22.1.el6.x86_64 ro root=/dev/md1 rd_NO_LUKS rd_NO_DM nomodeset
>> initrd /boot/initramfs-2.6.32-279.22.1.el6.x86_64.img
>>
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>>  LSPCI output:
>>
>> 00:00.0 RAM memory: NVIDIA Corporation C51 Host Bridge (rev a2)
>> 00:00.1 RAM memory: NVIDIA Corporation C51 Memory Controller 0 (rev a2)
>> 00:00.2 RAM memory: NVIDIA Corporation C51 Memory Controller 1 (rev a2)
>> 00:00.3 RAM memory: NVIDIA Corporation C51 Memory Controller 5 (rev a2)
>> 00:00.4 RAM memory: NVIDIA Corporation C51 Memory Controller 4 (rev a2)
>> 00:00.5 RAM memory: NVIDIA Corporation C51 Host Bridge (rev a2)
>> 00:00.6 RAM memory: NVIDIA Corporation C51 Memory Controller 3 (rev a2)
>> 00:00.7 RAM memory: NVIDIA Corporation C51 Memory Controller 2 (rev a2)
>> 00:02.0 PCI bridge: NVIDIA Corporation C51 PCI Express Bridge (rev a1)
>> 00:03.0 PCI bridge: NVIDIA Corporation C51 PCI Express Bridge (rev a1)
>> 00:04.0 PCI bridge: NVIDIA Corporation C51 PCI Express Bridge (rev a1)
>> 00:05.0 VGA compatible controller: NVIDIA Corporation C51 [Quadro NVS 210S/GeForce 6150LE] (rev a2)
>> 00:09.0 RAM memory: NVIDIA Corporation MCP51 Host Bridge (rev a2)
>> 00:0a.0 ISA bridge: NVIDIA Corporation MCP51 LPC Bridge (rev a3)
>> 00:0a.1 SMBus: NVIDIA Corporation MCP51 SMBus (rev a3)
>> 00:0b.0 USB controller: NVIDIA Corporation MCP51 USB Controller (rev a3)
>> 00:0b.1 USB controller: NVIDIA Corporation MCP51 USB Controller (rev a3)
>> 00:0d.0 IDE interface: NVIDIA Corporation MCP51 IDE (rev a1)
>> 00:0e.0 IDE interface: NVIDIA Corporation MCP51 Serial ATA Controller (rev a1)
>> 00:0f.0 IDE interface: NVIDIA Corporation MCP51 Serial ATA Controller (rev a1)
>> 00:10.0 PCI bridge: NVIDIA Corporation MCP51 PCI Bridge (rev a2)
>> 00:14.0 Bridge: NVIDIA Corporation MCP51 Ethernet Controller (rev a3)
>> 00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
>> 00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
>> 00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
>> 00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
>>
>> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>>  Smartctl output:
>>
>> root@rescue ~ # smartctl -a /dev/sda
>> smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.4.28] (local build)
>> Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
>>
>> === START OF INFORMATION SECTION ===
>> Model Family:     SAMSUNG SpinPoint T166
>> Device Model:     SAMSUNG HD321KJ
>> Serial Number:    S0MQJDQP603258
>> LU WWN Device Id: 5 0000f0 0db603258
>> Firmware Version: CP100-10
>> User Capacity:    320,072,933,376 bytes [320 GB]
>> Sector Size:      512 bytes logical/physical
>> Device is:        In smartctl database [for details use: -P show]
>> ATA Version is:   8
>> ATA Standard is:  ATA-8-ACS revision 3b
>> Local Time is:    Fri Mar  1 23:50:44 2013 CET
>> SMART support is: Available - device has SMART capability.
>> SMART support is: Enabled
>>
>> === START OF READ SMART DATA SECTION ===
>> SMART overall-health self-assessment test result: PASSED
>>
>> General SMART Values:
>> Offline data collection status:  (0x00) Offline data collection activity
>>                                         was never started.
>>                                         Auto Offline Data Collection: Disabled.
>> Self-test execution status:      (  41) The self-test routine was interrupted
>>                                         by the host with a hard or soft reset.
>> Total time to complete Offline
>> data collection:                ( 5746) seconds.
>> Offline data collection
>> capabilities:                    (0x5b) SMART execute Offline immediate.
>>                                         Auto Offline data collection on/off support.
>>                                         Suspend Offline collection upon new
>>                                         command.
>>                                         Offline surface scan supported.
>>                                         Self-test supported.
>>                                         No Conveyance Self-test supported.
>>                                         Selective Self-test supported.
>> SMART capabilities:            (0x0003) Saves SMART data before entering
>>                                         power-saving mode.
>>                                         Supports SMART auto save timer.
>> Error logging capability:        (0x01) Error logging supported.
>>                                         General Purpose Logging supported.
>> Short self-test routine
>> recommended polling time:        (   2) minutes.
>> Extended self-test routine
>> recommended polling time:        (  97) minutes.
>> SCT capabilities:              (0x003f) SCT Status supported.
>>                                         SCT Error Recovery Control supported.
>>                                         SCT Feature Control supported.
>>                                         SCT Data Table supported.
>>
>> SMART Attributes Data Structure revision number: 16
>> Vendor Specific SMART Attributes with Thresholds:
>> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>>   1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       1
>>   3 Spin_Up_Time            0x0007   100   100   015    Pre-fail  Always       -       5696
>>   4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       21
>>   5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
>>   7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0
>>   8 Seek_Time_Performance   0x0025   253   253   015    Pre-fail  Offline      -       0
>>   9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       6877
>>  10 Spin_Retry_Count        0x0033   253   253   051    Pre-fail  Always       -       0
>>  11 Calibration_Retry_Count 0x0012   253   253   000    Old_age   Always       -       0
>>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       21
>> 187 Reported_Uncorrect      0x0032   253   253   000    Old_age   Always       -       0
>> 188 Command_Timeout         0x0032   253   253   000    Old_age   Always       -       0
>> 190 Airflow_Temperature_Cel 0x0022   064   060   000    Old_age   Always       -       36
>> 194 Temperature_Celsius     0x0022   130   118   000    Old_age   Always       -       36
>> 195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       461965997
>> 196 Reallocated_Event_Count 0x0032   253   253   000    Old_age   Always       -       0
>> 197 Total_Pending_Sectors   0x0012   253   253   000    Old_age   Always       -       0
>> 198 Offline_Uncorrectable   0x0030   253   253   000    Old_age   Offline      -       0
>> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
>> 200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
>> 201 Soft_Read_Error_Rate    0x000a   100   100   000    Old_age   Always       -       0
>> 202 Data_Address_Mark_Errs  0x0032   253   253   000    Old_age   Always       -       0
>>
>> SMART Error Log Version: 1
>> No Errors Logged
>>
>> SMART Self-test log structure revision number 1
>> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
>> # 1  Extended offline    Interrupted (host reset)      90%      6876         -
>> # 2  Extended offline    Interrupted (host reset)      90%      6869         -
>> # 3  Extended offline    Interrupted (host reset)      90%      6868         -
>> # 4  Extended offline    Completed without error       00%      6499         -
>>
>> Note: selective self-test log revision number (0) not 1 implies that no selective self-test has ever been run
>> SMART Selective self-test log data structure revision number 0
>> Note: revision number not 1 implies that no selective self-test has ever been run
>>  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>>     1        0        0  Not_testing
>>     2        0        0  Not_testing
>>     3        0        0  Not_testing
>>     4        0        0  Not_testing
>>     5        0        0  Not_testing
>> Selective self-test flags (0x0):
>>   After scanning selected spans, do NOT read-scan remainder of disk.
>> If Selective self-test is pending on power-up, resume after 0 minute delay.
>>
>> _______________________________________________ Xen-users mailing
>> list Xen-users@xxxxxxxxxxxxx http://lists.xen.org/xen-users
>>
>
> - --
> Tony Lill, OCT,                    Tony.Lill@xxxxxxxxxxxxxxxxxxx
> President, A. J. Lill Consultants                 (519) 650 0660
> 539 Grand Valley Dr., Cambridge, Ont. N3H 2S2     (519) 241 2461
> - --------------- http://www.ajlc.waterloo.on.ca/ ----------------
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with undefined - http://www.enigmail.net/
>
> iEYEARECAAYFAlExWz8ACgkQGS8yZq1uvxA6RACePHzUrdXxFElp2IllVxvx86ej
> 3IEAn1CNRtuV5Dv6oBwPtK5j7VHopdOl
> =HrpZ
> -----END PGP SIGNATURE-----



 

