suspected csvread bug

Julian Briggs j.briggs at phonecoop.coop
Fri Apr 11 04:39:09 CDT 2008



David Bateman wrote:
> Julian Briggs wrote:
>> Dear Maintainer(s) of Octave package io,
>>
>> I find csvread mishandles commas embedded in text data, such as headings.
>> This occurs even when I skip the columns/rows containing such headings.
>> Presumably the problem is in dlmread.
>>
>> Here is a demonstration of the issue.
>> Reading the file "csvread_demo2.csv" with this content (saved as CSV from an
>> Excel spreadsheet):
>>
>> h11,h12,h13,h14
>> h21,1,2,3
>> "h31,c",4,5,6
>> h41,7,8,9
>> h51,10,11,12
>>
>> thus:
>>
>> path_sup     = strcat( Templates, "csvread_demo2.csv" ) 
>> disp("\nMishandles embedded comma in matrix row 2, col 1")
>> disp("Reading with: csvread( path_sup, 1, 1)")
>> sup = csvread( path_sup, 1, 1);
>> disp("size:"), disp(size(sup))
>> disp("sup:"), disp(sup);
>>
>> emits:
>>
>> Mishandles embedded comma in matrix row 2, col 1
>> Reading with: csvread( path_sup, 1, 1)
>> size:
>>    4   4
>> sup:
>>     1    2    3    0
>>     0    4    5    6
>>     7    8    9    0
>>    10   11   12    0
>>   
>>> Exit code: 0
>>>     
>> In the above, csvread appears to have read "h31,c" as 2 elements.
>>
>>
>> My details: 
>> pkg list
>> Package Name  | Version | Installation directory
>> --------------+---------+-----------------------
>>           io *|   1.0.5 | C:\ProgramFiles\Octave\share\octave\packages\io-1.0.5
>> version
>> ans = 3.0.0
>> Running on Windows XP (I'd prefer Ubuntu Linux).
>>
>> I am using Octave in a university research project to apply (economics)
>> input-output analysis to carbon footprinting.  I am keen to use Octave so a
>> timely fix would be much appreciated.
>>
>> Comments, workarounds and fixes welcome.
>>
>> Thanks
>>
>> Julian
>>   
> Hey, it appears that Matlab can't read this file at all. With
> Matlab2007b I get
> 
>  x = csvread('test.csv')
> ??? Error using ==> textscan
> Mismatch between file and format string.
> Trouble reading number from file (row 1, field 1) ==> h11,h
> 
> Error in ==> csvread at 52
>     m=dlmread(filename, ',', r, c);
> 
> With Octave 3.0 + octave-forge or Octave 3.1.x I get
> 
>  x = csvread("test.csv")
> x =
> 
>     0    0    0    0    0
>     0    1    2    3    0
>     0    0    4    5    6
>     0    7    8    9    0
>     0   10   11   12    0
> 
> Yes it is ignoring the quotes in reading the comma, though I don't think
> this is a reasonable file format to expect csvread to accept.
> 
> D.
> 
> 
> 
Dear David,

Thanks for your prompt response.

A more useful comparison for me would be to test whether Matlab can 
correctly read the above test file, skipping the text header 
rows/columns with:
csvread("test.csv", 1, 1);
(I do not have access to Matlab just now so cannot test this myself.)
Would you be willing to test this?

(Also Matlab provides the functionality we need in xlsread:
http://www.mathworks.com/access/helpdesk/help/techdoc/matlab.html
which (if I understand the docs correctly) can skip text header 
rows/columns either detecting non-numeric rows/columns or by user 
specified range.)
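
To make the diagnosis concrete outside Octave, here is a minimal sketch in
Python (not Octave code, and not how dlmread is actually implemented). It
contrasts splitting on every comma, which is what the output above suggests is
happening, with quote-aware parsing. The function names and the
skip_rows/skip_cols parameters are my own, chosen only for illustration.

```python
import csv
import io

# Same content as csvread_demo2.csv from the report above.
DEMO = '''h11,h12,h13,h14
h21,1,2,3
"h31,c",4,5,6
h41,7,8,9
h51,10,11,12
'''

def naive_rows(text):
    # Split on every comma, ignoring quotes (the behaviour suspected above).
    return [line.split(",") for line in text.splitlines()]

def quoted_rows(text):
    # Quote-aware parsing: a comma inside "..." stays within one field.
    return list(csv.reader(io.StringIO(text)))

def numeric_block(rows, skip_rows=1, skip_cols=1):
    # Hypothetical helper: drop header rows/columns, convert the rest to floats.
    return [[float(x) for x in row[skip_cols:]] for row in rows[skip_rows:]]

naive = naive_rows(DEMO)
quoted = quoted_rows(DEMO)

# Field counts for the row containing "h31,c" under each approach.
print(len(naive[2]), len(quoted[2]))

# Numeric data after skipping one header row and one header column.
print(numeric_block(quoted))
```

With quote-aware parsing the "h31,c" row keeps the same number of fields as its
neighbours, so skipping the first row and column leaves a clean numeric block.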

I'm keen to persuade my colleagues that Octave is a viable alternative 
to Matlab for our project and a resolution of this issue would help.

Thanks

Julian
-- 
Julian Briggs
220 Stannington View Road, Sheffield S10 1ST
p: 0114-266-3500
m: 07946-33-88-90 mob
e: j.briggs at phonecoop.coop
w: homepages.phonecoop.coop/julianbriggs

