- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Comma-separated-values are frequently used in my work, as for example Excel (and other modern languages) are very handy at reading them. In Fortran it is not so easy. I try list-directed input, which works "most of the time" but not always. There is trouble when a text value in the input file contains blanks, for example
value-a, value-bcd, value ef, value-ghi
then reading this with
CHARACTER(20) :: str1, str2, str3, str4
READ (5, *) str1, str2, str3, str4
results in
str1 = 'value-a ' [padded with blanks to the declared length of 20]
str2 = 'value-bcd '
str3 = 'value '
str4 = 'ef '
str5 = 'value-ghi'
The result with str3 fails the intent, because the hyphen was accidentally a blank, and blanks are used as value separators for list-directed input (in addition to commas). So, I need an edit descriptor or other formatting method that forces commas to be the ONLY separator. This is easy--indeed automatic--in Excel. My grief is caused because I am handed these files by others who work only with Excel, where they can be careless about inserting blanks in the middle of a string value, and they don't understand the significance of this in Fortran.
Is there a way to reliably read csv files in Fortran that I don't know about?
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I read the whole CSV line as one large character string and then use INDEX function to find the first comma, lets call it position IP. The fist item is the sub-string from position 1 to ip-1. This is in a loop so we then search for the next comma etc to give sub-string last_IP+1:IP-1. You need some extra code if the sub-string has quotes as it is an excel string that may have commas in it that you wish to ignore....
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
It's also possible to tell Excel to enclose string values in quotes, but I gather you don't have control over that. app4619's approach is what I use, but you can be done in by a string that contains a comma. If you can convince the users to give you tab-delimited files, that can work better.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Read the line in, then parse the commas.
module parse integer, parameter :: MAX_LINE = 256 ! you determine the size of line character(MAX_LINE) :: line contains function getArg(n) result(arg) implicit none character(MAX_LINE) :: arg integer :: n,i,j,count j = 0 do count=1,n i = j + 1 j = INDEX(line(i:),',') if(j == 0) exit j = j + i - 1 end do if(j == 0) then if(count == n) then arg = line(i:) else arg = ' ' endif else arg = line(i:j-1) endif end function getArg end module parse program lineParse use parse implicit none ! Variables CHARACTER(20) :: str1, str2, str3, str4, str5 ! Body of lineParse line = 'value-a, value-bcd, value ef, value-ghi' ! simulate read of line str1 = getArg(1) str2 = getArg(2) str3 = getArg(3) str4 = getArg(4) str5 = getArg(5) print *, str1, str2, str3, str4, str5 end program lineParse
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
It has always been a challenge for me as to how flexible data input should be.
Once a , delimiter is allowed,
- should trailing spaces be allowed, such as 11, 22 33 14, 5 (list-directed editing) You can get unexpected values with 22 33 44.
- should " string with enclosed spaces or , be allowed" as a single string
- should identifying integer and real fields be available.
- is 1e4 a real or e4 a real
- should simple math be included, as 11.4*23-3
All this can be done, expanding on Jim's approach. I find that mixing spaces and , as a delimiter is usually expected, while "string" is a useful extension.
Simple math looks good and can document conversion factors, but with Excel these are no longer necessary.
A simple test for integer or real is to read the field, with an iostat= although back with F77 the standard did not require a trapped response, which I learnt the hard way when transferring code to IBM Fortran 77. It was interesting to see all the loopholes in the F77 standard.
eg line = 'value-a, value-bcd, value ef,, 1.23, e5, 1e4, "with , inside", 12, 13, 1.2.3, 1 2 3, ' ! simulate read of line
1.2.3 will fail but 1 2 3 reads ok as 123 but is probably not what was meant.
However for my own uses, as long as the data layout is clearly defined, fixed layouts work well, as .prn exports from Excel are quite functional.
John
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Actually, a simpler change of enclosing value ef in quotes as "value ef" works without any change to your code.
CHARACTER(20) :: str(4) character(256) :: line integer i line = 'value-a, value-bcd, value ef, value-ghi' READ (line, *) str write (*,*) (trim(str(i)),'| ',i=1,4) line = 'value-a, value-bcd, "value ef", value-ghi' READ (line, *) str write (*,*) (trim(str(i)),'| ',i=1,4) end
I'm not sure about the suggestion of switching to tab delimited files is a good idea, as formatted reads have significant problems and I don't know about their general suitability for list-directed reads.
John
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I usually use the semicolon ; as a separator for csv files in Excel, mainly to avoid the problems caused by the different conventions for decimal point/comma in localized versions of Windows and Excel, but also for commas in text strings. Semicolons appear much less frequently than commas or points in text strings.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Semicolon is a good choice, but it may be difficult in getting external data sources to go to the "effort" of specifying ; as the field separator. This is likely the situation that dboggs is facing.
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
nvaneck,
Excellent! Nice generalize solution, at least for the ASCII character set.
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks to everyone for the useful suggestions and discussion! A couple of comments:
I am somewhat familiar with parsing techniques and have used them before. But in this case it is a one-off program with short-term life expectancy, so I am highly motivated to stick to simple standard methods without detailed development. The sample code fragments provided are helpful however.
I can easily get the data files tab delimited, but how would that help? How can Fortran read tab delimited data (short of writing my own parsing routine, of course, which I have done before)? If it is read using a simple list-directed method (and I'm not sure that it can), wouldn't this be subject to the same embedded-space limitation as reading comma-separated data?
It seems to me that, as data transfer using this format (comma delimited or tab delimited) is becoming more common, it would behoove the Fortran developers to implement an intrinsic technique to handle this easily. Perhaps a variation of list directed, either input or output, where the desired separators can simply be specified in a list. In my case, for example, the input problems would be solved by specifying a comma as the one and only separator.
Call this a feature request for the next major version.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
>>I can easily get the data files tab delimited
It would help when the export from excel (save as) used commas: 1,234.56
You will have to make a parser or find one on the web that suits your purpose.
Jim Dempsey

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page