suspected csvread bug
Julian Briggs
j.briggs at phonecoop.coop
Fri Apr 11 04:39:09 CDT 2008
David Bateman wrote:
> Julian Briggs wrote:
>> Dear Maintainer(s) of Octave package io,
>>
>> I find csvread mishandles commas embedded in text data, such as headings.
>> This occurs even when I skip the columns/rows containing such headings.
>> Presumably the problem is in dlmread.
>>
>> Here is a demonstration of the issue.
>> Reading the file "csvread_demo2.csv" with this content (saved as CSV from an
>> Excel spreadsheet):
>>
>> h11,h12,h13,h14
>> h21,1,2,3
>> "h31,c",4,5,6
>> h41,7,8,9
>> h51,10,11,12
>>
>> thus:
>>
>> path_sup = strcat( Templates, "csvread_demo2.csv" )
>> disp("\nMishandles ebedded comma in matrix row 2, col 1)")
>> disp("Reading with: csvread( path_sup, 1, 1)")
>> sup = csvread( path_sup, 1, 1);
>> disp("size:"), disp(size(sup))
>> disp("sup:"), disp(sup);
>>
>> emits:
>>
>> Mishandles ebedded comma in matrix row 2, col 1
>> Reading with: csvread( path_sup, 1, 1)
>> size:
>> 4 4
>> sup:
>> 1 2 3 0
>> 0 4 5 6
>> 7 8 9 0
>> 10 11 12 0
>>
>>> Exit code: 0
>>>
>> In the above, csvread appears to have read "h31,c" as two elements.
>>
>>
>> My details:
>> pkg list
>> Package Name | Version | Installation directory
>> --------------+---------+-----------------------
>> io *| 1.0.5 | C:\ProgramFiles\Octave\share\octave\packages\io-1.0.5
>> version
>> ans = 3.0.0
>> Running on Windows XP (I'd prefer Ubuntu Linux).
>>
>> I am using Octave in a university research project to apply (economics)
>> input-output analysis to carbon footprinting. I am keen to use Octave so a
>> timely fix would be much appreciated.
>>
>> Comments, workarounds and fixes welcome.
>>
>> Thanks
>>
>> Julian
>>
> Hey, it appears that Matlab can't read this file at all. With
> Matlab2007b I get
>
> x = csvread('test.csv')
> ??? Error using ==> textscan
> Mismatch between file and format string.
> Trouble reading number from file (row 1, field 1) ==> h11,h
>
> Error in ==> csvread at 52
> m=dlmread(filename, ',', r, c);
>
> With Octave 3.0 + octave-forge or Octave 3.1.x I get
>
> x = csvread("test.csv")
> x =
>
> 0 0 0 0 0
> 0 1 2 3 0
> 0 0 4 5 6
> 0 7 8 9 0
> 0 10 11 12 0
>
> Yes, it is ignoring the quotes when it reads the comma, though I don't think
> this is a reasonable file format to expect csvread to accept.
>
> D.
>
>
>
Dear David,
Thanks for your prompt response.
A more useful comparison for me would be to test whether Matlab can
correctly read the above test file, skipping the text header
rows/columns with:
csvread("test.csv", 1, 1);
(I do not have access to Matlab just now so cannot test this myself.)
Would you be willing to test this?
(Also, Matlab provides the functionality we need in xlsread:
http://www.mathworks.com/access/helpdesk/help/techdoc/matlab.html
which, if I understand the docs correctly, can skip text header
rows/columns either by detecting non-numeric rows/columns or by a
user-specified range.)
I'm keen to persuade my colleagues that Octave is a viable alternative
to Matlab for our project and a resolution of this issue would help.
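As a possible interim workaround, here is a minimal sketch of reading the
file line by line and blanking out quoted fields before splitting on commas.
The function name csvread_quoted is hypothetical (it is not part of the io
package), and the sketch assumes an Octave version that provides strsplit
with the "CollapseDelimiters" option, which is newer than the 3.0.0 reported
above. It also assumes every row has the same number of fields.

```octave
% Hypothetical helper, not part of the io package: read the numeric part of
% a CSV file whose text fields may contain commas inside double quotes.
% r0 and c0 are zero-based row/column offsets, as in csvread.
function m = csvread_quoted (filename, r0, c0)
  fid = fopen (filename, "r");
  if (fid < 0)
    error ("csvread_quoted: cannot open %s", filename);
  end
  rows = {};
  line = fgetl (fid);
  while (ischar (line))
    % Drop any trailing carriage return left by Windows line endings.
    line = strtrim (line);
    % Replace each quoted field with an empty field, so that a comma
    % inside the quotes no longer splits the row.
    line = regexprep (line, '"[^"]*"', '');
    % Keep empty fields; non-numeric and empty fields become NaN.
    fields = strsplit (line, ",", "CollapseDelimiters", false);
    rows{end+1} = str2double (fields);
    line = fgetl (fid);
  end
  fclose (fid);
  m = vertcat (rows{:});          % assumes equal field counts per row
  m = m(r0+1:end, c0+1:end);      % skip the header rows and columns
end
```

Called as csvread_quoted("test.csv", 1, 1) on the demo file above, this is
intended to return only the numeric block, with the "h31,c" heading counted
as a single skipped field.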
Thanks
Julian
--
Julian Briggs
220 Stannington View Road, Sheffield S10 1ST
p: 0114-266-3500
m: 07946-33-88-90 mob
e: j.briggs at phonecoop.coop
w: homepages.phonecoop.coop/julianbriggs