Solaris: bus error when sorting cell array

Moritz Borgmann octave at moriborg.de
Sun Dec 2 13:21:43 CST 2007


>On 30-Nov-2007, Moritz Borgmann wrote:
>
>| My newly Sun Studio-compiled octave bombs out with a bus error when
>| sorting a cell array containing strings.
>|
>| Repeat-By:
>| ---------
>|
>| ./run-octave
>| [...]
>| octave:2> c = {'2', '3', '1'};
>| octave:3> sort(c)
>| panic: Bus Error -- stopping myself...
>| attempting to save variables to `octave-core'...
>| save to `octave-core' complete
>| Bus error
>
>I can't duplicate this problem.  Does it only crash for cell arrays of
>character strings, or all sorts?

So far, I've only seen crashes for cell arrays of strings. Other sort 
operations work OK.

Here is the output of a debug session with dbx. I enabled Sun's 
run-time checks (RTCs) to check for double frees etc. Unfortunately, 
I can't make out any obvious bugs in the code.

octave:4> sort(c);
Read from uninitialized (rui) on thread 1:
Attempting to read 4 bytes at address 0xffbebbb8
     which is 528 bytes above the current stack pointer
t@1 (l@1) stopped in Array<octave_value>::~Array at line 61 in file "Array.cc"
    61     if (--rep->count <= 0)
(dbx) where
current thread: t@1
=>[1] Array<octave_value>::~Array(this = 0xffbebbb4), line 61 in "Array.cc"
   [2] ArrayN<octave_value>::~ArrayN(this = 0xffbebbb4), line 77 in "ArrayN.h"
   [3] Cell::~Cell(0xffbebbb4, 0xffbebb94, 0xf35d71a8, 0x0, 0x2b100, 0xf3584b20), at 0xedc4a06c
   [4] octave_value::octave_value(this = 0xffbebcf0, a = CLASS, is_csl = false), line 372 in "ov.cc"
   [5] mx_sort_indexed<octave_value>(m = CLASS, dim = 1, mode = ASCENDING), line 233 in "sort.cc"
   [6] Fsort(args = CLASS, nargout = 0), line 1434 in "sort.cc"
   [7] octave_builtin::do_multi_index_op(this = 0x25e7a0, nargout = 0, args = CLASS), line 104 in "ov-builtin.cc"
   [8] octave_builtin::subsref(this = 0x25e7a0, type = CLASS, idx = CLASS, nargout = 0), line 54 in "ov-builtin.cc"
   [9] octave_value::subsref(this = 0xffbecaa0, type = CLASS, idx = CLASS, nargout = 0), line 783 in "ov.cc"
   [10] tree_index_expression::rvalue(this = 0x45ed68, nargout = 0), line 352 in "pt-idx.cc"
   [11] tree_statement::eval(this = 0x914cd8, silent = false, nargout = 0, in_function_body = false), line 133 in "pt-stmt.cc"
   [12] tree_statement_list::eval(this = 0x4a8e20, silent = false, nargout = 0), line 190 in "pt-stmt.cc"
   [13] main_loop(), line 225 in "toplev.cc"
   [14] octave_main(argc = 5, argv = 0xffbecfdc, embedded = 0), line 835 in "octave.cc"
   [15] main(argc = 5, argv = 0xffbecfdc), line 35 in "main.c"
(dbx) list
    61     if (--rep->count <= 0)
    62       delete rep;
    63  
    64     delete [] idx;
    65   }
    66  
    67   template <class T>
    68   Array<T>
    69   Array<T>::squeeze (void) const
    70   {
(dbx) pp *this
*this = {
     rep        = 0xffffffff
     dimensions = {
         rep = 0x1
     }
     idx        = 0x30
     idx_count  = -904739736
}
(dbx) next
Misaligned read (mar) on thread 1:
Attempting to read 4 bytes at address 0x7 in page zero
t@1 (l@1) stopped in Array<octave_value>::~Array at line 61 in file "Array.cc"
    61     if (--rep->count <= 0)
(dbx) c
t@1 (l@1) signal SEGV (no mapping at the fault address) in __rtc_trap_handler at 0xf3544bf4
0xf3544bf4: __rtc_trap_handler+0x0058:  ld       [%l5], %l0
Current function is Array<octave_value>::~Array
    61     if (--rep->count <= 0)


So we crashed. Now going up the stack to check what 
octave_value::octave_value was doing:

[...]
(dbx) up
Current function is octave_value::octave_value
   372   }
(dbx) pp is_csl
is_csl = false
(dbx) pp -r a
a = {
     ArrayN<octave_value>::Array<octave_value>::rep        = 0x3dc128
     ArrayN<octave_value>::Array<octave_value>::dimensions = {
         ArrayN<octave_value>::Array<octave_value>::dim_vector::rep = 0x914d28
     }
     ArrayN<octave_value>::Array<octave_value>::idx        = (nil)
     ArrayN<octave_value>::Array<octave_value>::idx_count  = 0
}
(dbx) pp *this
*this = {
     rep       = 0x205558
     allocator = class octave_allocator /* STATIC CLASS */
}

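Both the argument a and the octave_value under construction look sane 
here. So apparently the Cell destructor (frame 3, this = 0xffbebbb4) 
is entered on an object whose rep is garbage. I have no idea why.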

To make this easier to debug, I replaced the member initializer list 
in octave_value::octave_value with explicit assignments in the 
constructor body. This is not strictly equivalent (the Cell is now a 
named local rather than a temporary), but it should behave the same.

So, in ov.cc, I turned

octave_value::octave_value (const ArrayN<octave_value>& a, bool is_csl)
   : rep (is_csl
	 ? dynamic_cast<octave_base_value *> (new octave_cs_list (Cell (a)))
	 : dynamic_cast<octave_base_value *> (new octave_cell (Cell (a))))
{
}

into

octave_value::octave_value (const ArrayN<octave_value>& a, bool is_csl)
{
   Cell c(a);
       
   if(is_csl)
   {
     octave_cs_list* oc;
    
     oc = new octave_cs_list (c);
     rep = dynamic_cast<octave_base_value *> (oc);
   }
   else
   {
     octave_cell* oc;
    
     oc = new octave_cell (c);
     rep = dynamic_cast<octave_base_value *> (oc);  
   }
}

And guess what? The new code runs fine without crashing.
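
To test the compiler in isolation, the pattern could perhaps be 
reduced to something like the sketch below. All names in it (Rep, 
Handle, Base, CellLike, ListLike, Value, leftover_handles) are made up 
for illustration and are not Octave classes. The only thing carried 
over from ov.cc is the shape of the constructor: a temporary created 
inside each arm of ?: in a member initializer list. Whether this 
actually triggers the problem with Sun Studio is untested.

```cpp
#include <cassert>

// Hypothetical stand-ins, not Octave code: a reference-counted handle
// that counts how many handle objects are currently alive.
struct Rep
{
  int count;
  Rep () : count (1) { }
};

static int live_handles = 0;  // constructions minus destructions

class Handle
{
public:
  Handle () : rep (new Rep) { ++live_handles; }
  Handle (const Handle& h) : rep (h.rep) { ++rep->count; ++live_handles; }
  ~Handle ()
  {
    assert (rep != 0);        // a destructor run on garbage may trip here
    --live_handles;
    if (--rep->count <= 0)    // same test as line 61 of Array.cc
      delete rep;
  }
private:
  Rep *rep;
  Handle& operator = (const Handle&);  // not needed for this test
};

struct Base { virtual ~Base () { } };

struct CellLike : Base
{
  Handle h;
  CellLike (const Handle& x) : h (x) { }
};

struct ListLike : Base
{
  Handle h;
  ListLike (const Handle& x) : h (x) { }
};

class Value
{
public:
  // Same shape as the original constructor: only the temporary in the
  // selected arm of ?: may be constructed, and only that one destroyed.
  Value (const Handle& a, bool is_list)
    : rep (is_list
           ? dynamic_cast<Base *> (new ListLike (Handle (a)))
           : dynamic_cast<Base *> (new CellLike (Handle (a))))
  { }
  ~Value () { delete rep; }
private:
  Base *rep;
  Value (const Value&);                // not needed for this test
  Value& operator = (const Value&);
};

// Builds and tears down one Value, then reports how many Handle
// objects are still alive.  A correct compiler should leave none.
int leftover_handles (bool is_list)
{
  {
    Handle a;
    Value v (a, is_list);
  }
  return live_handles;
}
```

Calling leftover_handles (false) and leftover_handles (true) from a 
small main should give 0 in both cases. A crash or a nonzero count 
there would suggest that the compiler destroys a temporary from the 
arm of ?: that was never evaluated, rather than a bug in Octave.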

Do you have any idea what could be going on here? I just hope we're 
not chasing a compiler bug...

Thanks,

Moritz

