slow access to consecutive matrix elements

Francesco Potortì Potorti at isti.cnr.it
Tue Nov 11 03:44:45 CST 2008


>On Sat, Oct 11, 2008 at 10:22 AM, Francesco Potorti`
><Potorti at isti.cnr.it> wrote:
>> This is a real life example that demonstrates how access to matrix
>> elements is not optimised for the case of complete ranges in the first
>> dimensions.  I am sending it here and not to the bug list because this
>> example may be useful to others, at least until this cases is optimised
>> in the sources.
>>
>> # This is a big 5-dim matrix, about 100MB size
>> octave> kk=rand(156,222,1,44,8);
>>
>> # I access a slice of it, which is sequential in memory
>> octave> t=cputime; for ii=1:44, for jj=1:8
>>  mm=kk(:,:,:,ii,jj); endfor, endfor, cputime-t
>> ans =  5.9124
>>
>> # Using a linear index is much faster
>> octave> span=(1:156*222);
>>        t=cputime; for ii=1:44, for jj=1:8
>>  mm=kk(sub2ind(size(kk),1,1,1,ii,jj)+span-1); endfor, endfor, cputime-t
>> ans =  2.6642
>>
>> # Removing the call to sub2ind reaches top speed
>> octave> cp=[1,cumprod(size(kk)(1:end-1))]; span=(1:156*222);
>>        t=cputime; for ii=1:44, for jj=1:8
>>  mm=kk(sum(([1,1,1,ii,jj]-1).*cp)+span); endfor, endfor, cputime-t
>> ans =  1.4001
>>
>> Still, I wish access were faster yet.  Is there a reason why copying
>> consecutive memory is so slow?  I wish I could help with optimising
>> this, even if I am certainly not the most qualified person to do it.
>>
>
>I guess the indexing routines do deserve some attention w.r.t
>performance. Reducing code duplication would also be nice. I have this
>on my TODO list, but I don't think it's a good idea to do it before
>3.2.x is forked, as such changes are, IMO, likely to introduce bugs.

Follwoing up on this, I realised that there is room for further
significant speedup:

# Be careful to not convert ranges into matrices
octave> cp=[1,cumprod(size(kk)(1:end-1))]; len=156*222;
      t=cputime; for ii=1:44, for jj=1:8
base=sum(([1,1,1,ii,jj]-1).*cp); mm=kk(base+1:base+len); endfor, endfor, cputime-t
ans =  0.26402

The fact is, I was discounting the fact that a range remains a range
even after linear transformation, while this does not appear to be the
case:

octave3.1> typeinfo(1:4)
ans = range
octave3.1> typeinfo(4+(1:4))
ans = matrix
octave3.1> typeinfo(4*(1:4))
ans = matrix

>From the depth of my ignorance about Octave's internals, I dare say that
it should not be too difficult to keep ranges as ranges even after sum
or product with a scalar.  Maybe even after sum with a range with the
same number of elements.  Am I wrong?

-- 
Francesco Potortì (ricercatore)        Voice: +39 050 315 3058 (op.2111)
ISTI - Area della ricerca CNR          Fax:   +39 050 315 2040
via G. Moruzzi 1, I-56124 Pisa         Email: Potorti at isti.cnr.it
(entrance 20, 1st floor, room C71)     Web:   http://fly.isti.cnr.it/


More information about the Bug-octave mailing list