strange FFT timing
David Bateman
David.Bateman at motorola.com
Mon Apr 21 03:15:22 CDT 2008
Sergei Steshenko wrote:
> 1e6 is faster than 2^20.
>
> FFTW3 is with SSE2 support.
>
But SSE2 instructions are only used if the block of memory of type
fftw_complex passed to FFTW3 is 16-byte aligned. If it's not, then FFTW3
falls back to a slower non-SSE2 version. Octave attempts to flag
memory blocks that have the correct alignment for use of the SSE2
instructions and force FFTW3 to use them. However, Octave makes no attempt
to enforce 16-byte alignment in its own Array class, so SSE2 cannot
always be used with Octave. This is what I meant by "got lucky".
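
To make the alignment condition concrete, here is a minimal sketch in Python. It is not Octave's or FFTW3's actual code; the helper names (is_aligned, aligned_offset, allocate_aligned) are hypothetical and only illustrate what "16-byte aligned" means for a buffer address, and how over-allocating lets you pick an aligned start address yourself.

```python
import ctypes

ALIGNMENT = 16  # bytes; the alignment FFTW3's SSE2 code path needs, per the text above


def is_aligned(address, alignment=ALIGNMENT):
    """Return True if the address is a multiple of the alignment."""
    return address % alignment == 0


def aligned_offset(address, alignment=ALIGNMENT):
    """Return how many bytes to skip from address to reach the next aligned address."""
    return (-address) % alignment


def allocate_aligned(nbytes, alignment=ALIGNMENT):
    """Over-allocate a raw buffer and return (owner, aligned_address).

    Illustrative helper (an assumption, not a library API). The owner
    object must be kept alive for as long as the address is in use.
    """
    raw = ctypes.create_string_buffer(nbytes + alignment - 1)
    base = ctypes.addressof(raw)
    return raw, base + aligned_offset(base, alignment)
```

Calling is_aligned(ctypes.addressof(buf)) on an ordinary buffer may return either True or False depending on where the allocator happened to place it, which is the "got lucky" case. The address returned by allocate_aligned is aligned by construction.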
D.
--
David Bateman David.Bateman at motorola.com
Motorola Labs - Paris +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin +33 6 72 01 06 33 (Mob)
91193 Gif-Sur-Yvette FRANCE +33 1 69 35 77 01 (Fax)