Changing octave to exploit multi-core hardware
John W. Eaton
jwe at bevo.che.wisc.edu
Tue Mar 25 09:45:40 CDT 2008
On 25-Mar-2008, Quentin Spencer wrote:
| What about multithreading the mapper functions? I guess we're relying on
| external libraries to actually perform the computations of many of the
| functions, but I would assume (correct me if I'm wrong) that octave does
| the looping through the individual elements of an array. It would seem
| that it would be very straightforward to make the computation of
| something like cos([1:1000]) faster by just splitting up portions of
| large arrays and sending them to separate processors.
OK. In the current sources (not 3.0 or the release-3-0-x branch) the
loop for all single-argument mapper functions is
  template <class U, class F>
  Array<U>
  map (F fcn) const
  {
    octave_idx_type len = length ();

    const T *m = data ();

    Array<U> result (dims ());
    U *p = result.fortran_vec ();

    for (octave_idx_type i = 0; i < len; i++)
      {
        OCTAVE_QUIT;

        p[i] = fcn (m[i]);
      }

    return result;
  }
(this is in liboctave/Array.h).
What would need to happen to send this computation to multiple
processors? Is it necessary to add special code to enable parallel
execution of this loop on systems that can do that? If so, I'm not
sure it is desirable to do that for this loop and every other one like it
where we could maybe benefit from embarrassingly simple parallelism.
Cluttering Octave's sources with special code for this just seems
silly to me. Shouldn't that sort of detail be handled by the compiler
and OS kernel automatically?
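For illustration, one form such special code could take is an OpenMP
directive on the loop. The following is only a sketch under stated
assumptions, not Octave code: parallel_map is a hypothetical free
function, std::vector stands in for Array<T>, and fcn is assumed to be
safe to call from several threads at once. The OCTAVE_QUIT check is
left out, since how interrupt handling would behave inside a parallel
region is an open question here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the map loop above; not part of liboctave.
// Build with OpenMP enabled (e.g. -fopenmp for GCC) to run the loop on
// several threads.  Without that option the pragma is ignored and the
// loop runs serially, giving the same result.
// Note: U must not be bool, because std::vector<bool> has no data ().
template <class U, class T, class F>
std::vector<U>
parallel_map (const std::vector<T>& in, F fcn)
{
  const std::ptrdiff_t len = static_cast<std::ptrdiff_t> (in.size ());

  const T *m = in.data ();

  std::vector<U> result (in.size ());
  U *p = result.data ();

  // Each iteration reads m[i] and writes only p[i], so iterations are
  // independent.  Assumption: fcn has no shared mutable state.
  #pragma omp parallel for
  for (std::ptrdiff_t i = 0; i < len; i++)
    p[i] = fcn (m[i]);

  return result;
}
```

A call might look like parallel_map<double> (x, f), where x is a
std::vector<double> and f is any callable taking a double. The
directive itself is a single line, but it would have to be repeated on
every similar loop, and each mapped function would need to be checked
for thread safety.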
jwe