
[Xen-ia64-devel] [Fwd: [Xen-bugs] [Bug 1392] New: Problems with denormalized floating point numbers on XEN-virtualized Linux/IA64]



This looks pretty nasty and it still occurs on latest upstream.  The
test program in the bugzilla usually shows the problem within a couple
runs.  Thanks,

Alex


-------- Forwarded Message --------
> From: bugzilla-daemon@xxxxxxxxxxxxxxxxxxx
> Reply-to: bugs@xxxxxxxxxxxxxxxxxx
> To: xen-bugs@xxxxxxxxxxxxxxxxxxx
> Subject: [Xen-bugs] [Bug 1392] New: Problems with denormalized
> floating point numbers on XEN-virtualized Linux/IA64
> Date: Wed, 3 Dec 2008 08:34:59 -0800
> 
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1392
> 
>            Summary: Problems with denormalized floating point numbers on
>                     XEN-virtualized Linux/IA64
>            Product: Xen
>            Version: 3.0.3
>           Platform: IA64
>         OS/Version: Linux
>             Status: NEW
>           Severity: normal
>           Priority: P2
>          Component: Unspecified
>         AssignedTo: xen-bugs@xxxxxxxxxxxxxxxxxxx
>         ReportedBy: volker.simonis@xxxxxxxxx
>                 CC: volker.simonis@xxxxxxxxx
> 
> 
> Hi,
> 
> While testing our Java VM on a Xen-virtualized Linux/IA64 system, we
> encountered non-deterministic but reproducible floating-point failures.
> After some debugging we could exclude the Java VM as the root cause of the
> problem and came up with the following small C++ test case, which usually
> fails on a virtualized Linux box. However, we could not reproduce the
> failure on any non-virtualized IA64 Linux system.
> 
> Attached you can find the test program "fnorms.cpp". Please compile with
> 'gcc -g fnorms.cpp'. The program will silently finish if no error occurs,
> otherwise it will print one or more lines like:
> "ERROR: 1.401298e-45 != 1.000000e+00".
> 
> During our debugging sessions, we observed that the reason for the failure is
> that certain IA64 floating point instructions like 'fnorm.s', 'fmpy.s' or
> 'fcmp' may fail if they are applied to denormalized floating point values.
> 
> If the multiplication ('fmpy.s') fails, the error line shows two different
> numbers (e.g. "ERROR: 1.401298e-45 != 1.000000e+00"). But there's also a case
> where the compare fails (i.e. the 'fcmp' which was generated for
> "if (result != min_float)"). If this happens, the test program erroneously
> reports that the result of the multiplication and the initial value of
> "min_float" differ ("ERROR: 1.401298e-45 != 1.401298e-45"), although the two
> numbers are actually equal.
> 
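For readers without the attachment, here is a minimal sketch of the kind of
check described above. It is an assumption based on the description only, NOT
the attached fnorms.cpp: the function name count_mismatches, the use of
volatile, and the order of the two printed values are hypothetical. It
multiplies the smallest positive denormalized single-precision value
(2^-149, printed as 1.401298e-45) by 1.0f and compares the result with the
original value.

```cpp
#include <cstdio>
#include <limits>

// Hypothetical reconstruction of the described test; not the attached file.
// Returns the number of iterations in which the multiply or the compare
// gave an unexpected answer.
static int count_mismatches(int iterations) {
    // volatile forces real loads, so the multiply and the compare are
    // executed at run time instead of being folded by the compiler.
    volatile float min_float = std::numeric_limits<float>::denorm_min();
    volatile float one = 1.0f;
    int errors = 0;
    for (int i = 0; i < iterations; ++i) {
        float result = min_float * one;   // denormal operand in a multiply
        if (result != min_float) {        // denormal operands in a compare
            std::printf("ERROR: %e != %e\n", result, (float)min_float);
            ++errors;
        }
    }
    return errors;
}
```

Calling count_mismatches(1000000) from main() and returning non-zero when the
count is not 0 would mirror the "silently finish if no error occurs"
behaviour. On a correctly working system the function should return 0.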
> Because we have only observed these failures on Xen-virtualized IA64 Linux
> (in both dom0 and dom1), our assumption is that there may be a problem in
> the implementation of the Floating Point Software Assistance (FPSWA) in Xen,
> because all of the above-mentioned instructions generate a "floating-point
> assist fault" if they are applied to denormalized values (as can be seen in
> "/var/log/messages").  This is only a vague guess, however...
> 
> Has anybody seen these problems before, or are there any ideas why this
> happens?
> 
> With best regards,
> Volker
> 
> PS: we have tested on:
> 
> Xen: 3.0.3-64
> 
> dom0: RHEL 5.2
> ---------------
> [root@xxxxxx ~]# uname -a
> Linux xxxxxx.wdf.sap.corp 2.6.18-92.el5xen #1 SMP Tue Apr 29 13:36:07 EDT 2008
> ia64 ia64 ia64 GNU/Linux
> [root@xxxxxx ~]# lsb_release -a
> LSB Version:    :core-3.1-ia64:core-3.1-noarch:graphics-3.1-ia64:graphics-3.1-noarch
> Distributor ID: RedHatEnterpriseServer
> Description:    Red Hat Enterprise Linux Server release 5.2 (Tikanga)
> Release:        5.2
> Codename:       Tikanga
> [root@xxxxxx ~]# rpm -q xen
> xen-3.0.3-64.el5_2.1
> [root@xxxxxx ~]# xm info
> host                   : xxxxxx.wdf.sap.corp
> release                : 2.6.18-92.el5xen
> version                : #1 SMP Tue Apr 29 13:36:07 EDT 2008
> machine                : ia64
> nr_cpus                : 4
> nr_nodes               : 1
> sockets_per_node       : 2
> cores_per_socket       : 2
> threads_per_core       : 1
> cpu_mhz                : 1594
> hw_caps                : 00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:
> total_memory           : 32722
> free_memory            : 17277
> node_to_cpu            : node0:no cpus
> xen_major              : 3
> xen_minor              : 1
> xen_extra              : .2-92.el5
> xen_caps               : xen-3.0-ia64 xen-3.0-ia64be hvm-3.0-ia64 
> xen_pagesize           : 16384
> platform_params        : virt_start=0xe800000000000000
> xen_changeset          : unavailable
> cc_compiler            : gcc version 4.1.2 20071124 (Red Hat 4.1.2-41)
> cc_compile_by          : brewbuilder
> cc_compile_domain      : redhat.com
> cc_compile_date        : Tue Apr 29 13:14:31 EDT 2008
> xend_config_format     : 2
> 
> dom1: RHEL 5.2 (2.6.18-53.1.14.el5xen #1 SMP Tue Feb 19 07:35:46 EST 2008 ia64)
> ---------------
> 
> # cat /proc/cpuinfo 
> processor  : 0
> vendor     : Xen/ia64
> arch       : IA-64
> family     : 32
> model      : 0
> revision   : 7
> archrev    : 0
> features   : branchlong, 16-byte atomic ops
> cpu number : 0
> cpu regs   : 4
> cpu MHz    : 1594.000895
> itc MHz    : 399.222286
> BogoMIPS   : 3006.46
> siblings   : 1
> 
> processor  : 1
> vendor     : Xen/ia64
> arch       : IA-64
> family     : 32
> model      : 0
> revision   : 7
> archrev    : 0
> features   : branchlong, 16-byte atomic ops
> cpu number : 0
> cpu regs   : 4
> cpu MHz    : 1594.000895
> itc MHz    : 399.222286
> BogoMIPS   : 3178.49
> siblings   : 1
> 
> CPUID0: 0x756E6547
> CPUID1: 0x6C65746E
> CPUID2: 0x0
> CPUID3: 0x20000704
> CPUID4: 0x5
> 
> 
-- 
Alex Williamson                             HP Open Source & Linux Org.


_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel


 

