NaN slowdown with some processors
Ben Abbott
bpabbott at mac.com
Wed Jun 4 06:34:37 CDT 2008
On Jun 4, 2008, at 6:55 AM, Olli Saarela wrote:
>> I'm planning to buy a new desktop machine, and since my computations
>> utilize NaN values heavily, I'd like to know whether Intel Core 2
>> processors suffer from the same slowdown with NaN values as Pentium.
>> For details, see
>> http://www.cygnus-software.com/papers/x86andinfinity.html
>
> Thank you all, the data you have provided has clarified the issue. In
> addition to the replies posted to the list, I got some mail showing a
> 100x slowdown with Core 2 / Debian / Octave 3.0.1. It looks like there
> is still a NaN-related slowdown in Core 2 when the computation isn't
> carried out using SSE2/3.
>
> If I have understood correctly, gcc can be forced to generate SSEn
> instructions, which avoids this performance degradation completely.
> There also seem to be a number of Linux installations of Octave out
> there that would benefit from such compile options.
>
> The situation is slightly different with MSVC. The documentation on
> MSDN says:
>
>   The optimizer will choose when and how to make use of the SSE and
>   SSE2 instructions when /arch is specified. SSE and SSE2 instructions
>   will be used for some scalar floating-point computations, when it is
>   determined that it is faster to use the SSE/SSE2 instructions and
>   registers rather than the x87 floating-point register stack. As a
>   result, your code will actually use a mixture of both x87 and
>   SSE/SSE2 for floating-point computations.
>
> This might explain the NaN-related slowdown on Windows machines with
> Intel processors. Drawing (extrapolating) conclusions from the posted
> figures, MSVC2008&SSE3 seem to do a much better job in this respect
> than MSVC2005&SSE2, even though some performance degradation still
> remains.
>
> Thank you all once again!
> Olli
Just for reference, on my 2.4 GHz Intel Core 2 Duo based Mac running
Mac OS X 10.5.2, I'm using Apple's vecLib and my gcc is:
gcc --v
Using built-in specs.
Target: i686-apple-darwin9
Configured with: /var/tmp/gcc/gcc-5465~16/src/configure --disable-checking -enable-werror --prefix=/usr --mandir=/share/man --enable-languages=c,objc,c++,obj-c++ --program-transform-name=/^[cg][^.-]*$/s/$/-4.0/ --with-gxx-include-dir=/include/c++/4.0.0 --with-slibdir=/usr/lib --build=i686-apple-darwin9 --with-arch=apple --with-tune=generic --host=i686-apple-darwin9 --target=i686-apple-darwin9
Thread model: posix
gcc version 4.0.1 (Apple Inc. build 5465)
octave:100> a=zeros(300,300);tic;b=(1.0+a)*a;toc
Elapsed time is 0.003323 seconds.
octave:101> a=zeros(300,300);tic;b=(NaN+a)*a;toc
Elapsed time is 0.003806 seconds.
Ben