NaN slowdown with some processors

Ben Abbott bpabbott at mac.com
Wed Jun 4 06:34:37 CDT 2008


On Jun 4, 2008, at 6:55 AM, Olli Saarela wrote:

>> I'm planning to buy a new desktop machine, and since my computations
>> utilize NaN values heavily, I'd like to know whether Intel Core 2
>> processors suffer from the same slowdown with NaN values as Pentium.
>> For details, see
>> http://www.cygnus-software.com/papers/x86andinfinity.html
>
> Thank you all, the data you have provided has clarified the issue. In
> addition to the replies posted to the list, I got some mail showing a
> 100x slowdown with Core 2 / Debian / Octave 3.0.1. It looks like there
> is still a NaN-related slowdown in Core 2 when the computation isn't
> carried out using SSE2/3.
>
> If I have understood correctly, gcc can be forced to generate SSEn
> instructions, which avoids this performance degradation completely.
> There also seem to be a number of Linux installations of Octave out
> there that would benefit from such compile options.
>
> The situation is slightly different with MSVC. The documentation on
> MSDN says
>
>   The optimizer will choose when and how to make use of the SSE and
>   SSE2 instructions when /arch is specified. SSE and SSE2 instructions
>   will be used for some scalar floating-point computations, when it is
>   determined that it is faster to use the SSE/SSE2 instructions and
>   registers rather than the x87 floating-point register stack. As a
>   result, your code will actually use a mixture of both x87 and
>   SSE/SSE2 for floating-point computations.
>
> This might explain the NaN-related slowdown on Windows machines with
> Intel processors. Drawing (extrapolating) conclusions from the posted
> figures, MSVC2008&SSE3 seem to do a much better job in this respect
> than MSVC2005&SSE2, even though some performance degradation still
> remains.
>
> Thank you all once again!
>   Olli


Just for reference, here are results from my 2.4 GHz Intel Core 2 Duo
based Mac running Mac OS X 10.5.2.

I'm using Apple's VecLib, and my gcc is:

  gcc --v
Using built-in specs.
Target: i686-apple-darwin9
Configured with: /var/tmp/gcc/gcc-5465~16/src/configure --disable-checking -enable-werror --prefix=/usr --mandir=/share/man --enable-languages=c,objc,c++,obj-c++ --program-transform-name=/^[cg][^.-]*$/s/$/-4.0/ --with-gxx-include-dir=/include/c++/4.0.0 --with-slibdir=/usr/lib --build=i686-apple-darwin9 --with-arch=apple --with-tune=generic --host=i686-apple-darwin9 --target=i686-apple-darwin9
Thread model: posix
gcc version 4.0.1 (Apple Inc. build 5465)

octave:100> a=zeros(300,300);tic;b=(1.0+a)*a;toc
Elapsed time is 0.003323 seconds.
octave:101> a=zeros(300,300);tic;b=(NaN+a)*a;toc
Elapsed time is 0.003806 seconds.
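
A single tic/toc pair at this scale (a few milliseconds) can be noisy, so
anyone wanting to repeat the comparison on another machine might prefer
something like the sketch below. It runs the same two products several
times and keeps the best time for each. The repetition count and the
variable names (reps, t_plain, t_nan) are my own arbitrary choices, not
part of the runs above, and since the matrix product is presumably handed
to the BLAS library (VecLib here), this mostly measures that library
rather than code generated by the compiler that built Octave.

```octave
% Sketch only: repeat the NaN / no-NaN timing and keep the best run.
n = 300;      % same matrix size as in the one-liners above
reps = 20;    % number of repetitions (arbitrary, assumed value)
a = zeros(n, n);

t_plain = Inf;   % best time for the product without NaN
t_nan = Inf;     % best time for the product with NaN

for k = 1:reps
  tic; b = (1.0 + a) * a; t_plain = min(t_plain, toc);
  tic; c = (NaN + a) * a; t_nan = min(t_nan, toc);
end

printf("best time without NaN: %g s\n", t_plain);
printf("best time with NaN:    %g s\n", t_nan);
printf("ratio (NaN / plain):   %g\n", t_nan / t_plain);
```

A ratio close to 1 would match what I see here; a ratio in the tens or
hundreds would point to the slowdown Olli describes.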

Ben



