How to extract column data from a csv file in vectors (2024)

dpb on 18 Sep 2014

  • Link

    Direct link to this answer

    https://physical-modeling.mathworks.com/matlabcentral/answers/155299-how-to-extract-column-data-from-a-csv-file-in-vectors#answer_152128

  • Link

    Direct link to this answer

    https://physical-modeling.mathworks.com/matlabcentral/answers/155299-how-to-extract-column-data-from-a-csv-file-in-vectors#answer_152128

Edited: dpb on 18 Sep 2014

Open in MATLAB Online

Don't say which columns are of interest, but you'll have much more success if you use format string for the types of data by column and match the number of fields to the number of columns in the file...there are a number of empty columns at the right it appears, but ignoring them in the format string causes the extra blank lines you've gotten.

>> fmt=['%s %s %f %f %s' repmat('%f',1,18)];

>> fid=fopen('extrac1985.csv');

>> c=textscan(fid,fmt,'delimiter',',','headerlines',1,'collectoutput',1);

>> whos c

Name Size Bytes Class Attributes

c 1x4 18192 cell

>> fid=fclose(fid);

textscan puts the sets of text and numeric together so the first column cell array is the first two text columns then the numeric id, the text and finally the remaining numeric array.

To select columns in the end, either--

a) read whole as above and then just set unwanted columns to [], or

b) modify the format string and skip unwanted columns by use of the '%*s' or '%*f' as appropriate in the format string to not return unwanted columns.

doc textscan

for more detail; look at the link to 'format options' for the gory details.

ADDENDUM

OK, to skip some columns I had a few minutes...get rid of the last number of empty columns and a few that are identical thruout--

>> fmt=['%s %*s %f %f %*s' ...

repmat('%*f',1,3) repmat('%f',1,10) repmat('%*f',1,5)];

>> c=textscan(fid,fmt,'delimiter',',','headerlines',1,'collectoutput',1);

>> whos c

Name Size Bytes Class Attributes

c 1x2 8600 cell

>> c{1}(1:10)

ans =

'01/01/1985 0:00'

'01/01/1985 1:00'

'01/01/1985 2:00'

'01/01/1985 3:00'

'01/01/1985 4:00'

'01/01/1985 5:00'

'01/01/1985 6:00'

'01/01/1985 7:00'

'01/01/1985 8:00'

'01/01/1985 9:00'

>> c{2}(1:10,:)

ans =

Columns 1 through 9

892702 1 360 9 NaN 17 NaN 9 9

892702 1 360 9 NaN 16 NaN 9 9

892702 1 360 9 NaN 14 NaN 9 9

892702 1 350 9 NaN 15 NaN 9 9

892702 1 360 10 NaN 15 NaN 9 9

892702 1 360 10 NaN 16 NaN 9 9

892702 1 10 9 NaN 16 NaN 9 9

892702 1 360 10 NaN 15 NaN 9 9

892702 1 360 9 NaN 15 NaN 9 9

892702 1 360 9 NaN 16 NaN 9 9

Columns 10 through 12

0 9 0

0 9 0

0 9 0

0 9 0

0 9 0

0 9 0

0 9 0

0 9 0

0 9 0

0 9 0

>>

2 Comments

Show NoneHide None

Katerina F on 19 Sep 2014

Direct link to this comment

https://physical-modeling.mathworks.com/matlabcentral/answers/155299-how-to-extract-column-data-from-a-csv-file-in-vectors#comment_238099

  • Link

    Direct link to this comment

    https://physical-modeling.mathworks.com/matlabcentral/answers/155299-how-to-extract-column-data-from-a-csv-file-in-vectors#comment_238099

Hi, I am interested in keeping the columns A D F H J N. (I do not understand how the selection with the repmat works in this case, that is what you are doing with the line with the repmat, i.e. for example what the 1 and the 5 do in repmat('%*f',1,5)) etc. thanks,k

Note that I do not know the lenght of the data in the files (this file is just an example).

dpb on 19 Sep 2014

Direct link to this comment

https://physical-modeling.mathworks.com/matlabcentral/answers/155299-how-to-extract-column-data-from-a-csv-file-in-vectors#comment_238134

  • Link

    Direct link to this comment

    https://physical-modeling.mathworks.com/matlabcentral/answers/155299-how-to-extract-column-data-from-a-csv-file-in-vectors#comment_238134

Edited: dpb on 19 Sep 2014

Open in MATLAB Online

The repmat is simply replacing the explicit typing of N copies of the string with a counted number..."1,5" in this case is 1-row, 5 columns. See

doc repmat % for details

and/or simply try it at the command line and see what the result is.

...I do not know the length of the data in the files...

You mean rows or columns by "length" in this context? Number of rows is immaterial, the number of columns is more significant as the format string used controls the arrangement in the return cell array--having that mismatch caused all the extra blanks in your sample case you posted.

Mohammad's suggestion with the new table data structure may be convenient if you're using a recent-enough release (R2013+ I think; I am at R2012b here so don't have the full facility to see what it does with all the trailing empty columns).

It will, as far as I know, read the full array but it would be relatively easy as noted to select the columns wanted from that. OTOH, if you know the specific columns wanted, using the '%*' form of the format string for the unwanted columns will remove them from consideration at the git-go. You do need to know the actual number of columns total, however. If worst comes to worst, you can always read in the first line and count commas to ascertain that value.

Sign in to comment.

How to extract column data from a csv file in vectors (2024)

References

Top Articles
Latest Posts
Article information

Author: Catherine Tremblay

Last Updated:

Views: 6298

Rating: 4.7 / 5 (47 voted)

Reviews: 94% of readers found this page helpful

Author information

Name: Catherine Tremblay

Birthday: 1999-09-23

Address: Suite 461 73643 Sherril Loaf, Dickinsonland, AZ 47941-2379

Phone: +2678139151039

Job: International Administration Supervisor

Hobby: Dowsing, Snowboarding, Rowing, Beekeeping, Calligraphy, Shooting, Air sports

Introduction: My name is Catherine Tremblay, I am a precious, perfect, tasty, enthusiastic, inexpensive, vast, kind person who loves writing and wants to share my knowledge and understanding with you.