Solved: Re: Common terms such as "unix" and "linux" should not be predefined macros

ur · ‎11-27-2022

On multiple occasions the predefined macros "linux" and "unix" have caused porting problems.

Such common terms without an underscore prefix and/or suffix have caused inadvertent substitution on multiple occasions. I was hoping ifx in particular would not carry this forward, but apparently it does, including "unix", which does not appear in either the ifort or ifx man-pages as a predefined macro.

At least in ifx going forward I think these should be removed. If not removed "unix" should be documented at a minimum.

Usually, it causes compilation failures, and new users in particular are confused. Turning off macro substitution, adding #undef, or -U can all work around it, but I have seen it cause issues multiple times, particularly when build utilities/IDEs run all files through the preprocessor regardless of file suffix.

Perhaps it would be useful to produce a warning to switch to __linux and __unix when one of these names appears in a preprocessor directive, but such common terms should never be predefined, particularly in the case of ifx, which gives an opportunity to quit carrying this forward.

Steve_Lionel · ‎11-27-2022

These are artifacts of what some compilers did in the past and still might be used in some codes. I agree that they can pose problems if running non-Fortran code through the preprocessor.

ur · ‎11-27-2022

The problems were with Fortran code; although with preprocessor directives in it. Things like

subroutine(a,b,unix)

If they were only defined by default when macro expansion is off they would be far less troublesome. I have not seen and do not see much problem with "#ifdef unix" but expansion of the code is the real problem.

In the past (and now as well) preprocessors for Fortran often did not do macro expansion; including a good number of "fpp" flavors. It is little problem with those processors.
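As a rough illustration of why expansion, rather than "#ifdef unix", is the troublesome part, here is a minimal Python sketch of a cpp-like identifier substitution. It is not how fpp or any Intel tool is implemented; the macro table, the value 1, and the function names are all assumptions made for the example.

```python
import re

# Hypothetical table of predefined macros. The value "1" follows the usual
# cpp convention for -Dname and is an assumption, not taken from ifx.
PREDEFINED = {"unix": "1", "linux": "1"}

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def expand(line, macros):
    """Replace every whole identifier that names a macro (case-sensitive)."""
    return IDENT.sub(lambda m: macros.get(m.group(0), m.group(0)), line)


def preprocess(source, macros, expand_macros=True):
    """Toy preprocessor: drops '#' lines, optionally expands the rest."""
    out = []
    for line in source.splitlines():
        if line.lstrip().startswith("#"):
            continue  # directives are dropped in this sketch, not evaluated
        out.append(expand(line, macros) if expand_macros else line)
    return "\n".join(out)


source = "subroutine sub(a,b,unix)\ninteger :: unix\nend subroutine sub"
print(preprocess(source, PREDEFINED))                       # dummy argument is replaced
print(preprocess(source, PREDEFINED, expand_macros=False))  # source passes through unchanged
```

With expansion on, the dummy argument named unix is rewritten and the result no longer compiles; a decorated name such as __unix is a different identifier and is left alone.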

Steve_Lionel · ‎11-27-2022

My experience over 40+ years is that preprocessors always did macro expansion, but sure, some might not. As it happens, the Fortran standards committee is considering standardizing preprocessing in a future (post-2023) revision. I've added a note to the discussion that any predefined symbols must start with at least one underscore.

ur · ‎11-27-2022

Yes, I remember when you were just a kid at Digital. I assure you many Fortran preprocessors did not, as I wrote or contributed to several. Note that although the open-source implementation of the last one proposed by the Fortran committee included macro expansion the proposal did not.

The underscore is a good recommendation, and an (incomplete) listing of preprocessor variables shows that is common practice. See

https://fortranwiki.org/fortran/show/Predefined+preprocessor+macros

There was an extended comparison done of the current preprocessors, and they all vary in some not-so-subtle ways that almost all revolve around macro expansion. I thought it was in the Fortran Discourse forum or the fpm discussions, but I do not immediately see it.

Supporting macro expansion is the biggest decision in what the preprocessor can be used for, but it causes almost all the issues and complexities. Do you just expand unconditionally, or provide automatic line continuation and break at 72 and 132? Do you expand in comments or not? Some people want the expansion, some do not. Comments are frequently used in automatically generated documentation by utilities like ford(1) and doxygen(1), so it matters more than just the obvious issues. Should the expansion be case-sensitive when Fortran is not? A frequent complaint about cpp(1) by Fortran programmers is that they WANT expansion in strings, which cpp (by far the most common model or tool for Fortran preprocessing) does not do. On and on ...

The second biggest issue is, perhaps surprisingly, comments. Then come line numbers for the debugger, although that is standardized enough now to not be much of an issue. Fortran lacks block comments, so /* */ is often supported like in K&R C, which also presents issues; and since there is no associated tool for extracting comment blocks for auto-generating documentation, the various auxiliary tools such as ford(1) and doxygen(1) come up with incompatible comment styles to denote comments intended for external documentation.
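The expansion choices listed above (in comments or not, in strings or not, case-sensitive or not) can be pictured as switches on a toy expander. The Python sketch below is only an illustration under simplifying assumptions: free-form source, no continuation lines, and hypothetical option names (in_strings, in_comments, ignore_case) that do not correspond to any existing fpp flag.

```python
import re

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def split_segments(line):
    """Split a free-form Fortran line into (kind, text) pieces.

    kind is 'code', 'string', or 'comment'. Simplified on purpose:
    no continuation lines and no fixed-form column rules.
    """
    segments, buf, i = [], "", 0
    while i < len(line):
        ch = line[i]
        if ch in "'\"":                      # start of a character literal
            end = line.find(ch, i + 1)
            end = len(line) - 1 if end == -1 else end
            if buf:
                segments.append(("code", buf))
                buf = ""
            segments.append(("string", line[i:end + 1]))
            i = end + 1
        elif ch == "!":                      # rest of the line is a comment
            if buf:
                segments.append(("code", buf))
                buf = ""
            segments.append(("comment", line[i:]))
            break
        else:
            buf += ch
            i += 1
    if buf:
        segments.append(("code", buf))
    return segments


def expand(line, macros, in_strings=False, in_comments=False, ignore_case=False):
    """Expand macros in code, and optionally in strings and comments."""
    table = {k.lower(): v for k, v in macros.items()} if ignore_case else macros

    def repl(match):
        name = match.group(0)
        return table.get(name.lower() if ignore_case else name, name)

    out = []
    for kind, text in split_segments(line):
        skip = (kind == "string" and not in_strings) or (
            kind == "comment" and not in_comments)
        out.append(text if skip else IDENT.sub(repl, text))
    return "".join(out)


line = 'print *, "built by USER", VERSION  ! VERSION is set by -D'
macros = {"VERSION": "42", "USER": "me"}
print(expand(line, macros))                                    # cpp-like: code only
print(expand(line, macros, in_strings=True, in_comments=True)) # expand everywhere
```

Each switch changes what ends up in the compiled source or in the comments that documentation tools read, which is why the choice is hard to make once for everyone.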

It would be interesting to see what percentage of Fortran users use macro expansion versus the rest of cpp-like functionality; as it complicates implementation considerably to allow for it.

That being said I strongly support a standardized preprocessor. Long overdue. I personally try to write code that does not require it, sometimes resorting to tricks other than preprocessors but just when I think I do not need it anymore a need arises. My major fear is it would become too popular and Fortran codes would start looking like C files with preprocessor directives obfuscating everything.

Good luck. I know I find a standardized preprocessor very useful (I use my own). Even a simple one standardized as part of the Fortran specifications would be terrific.

Steve_Lionel · ‎11-28-2022

Fortran once had a standardized preprocessor, known as COCO. It was an optional part of the standard, didn't look anything like cpp, no vendor implemented it that I know of, and was roundly ignored. It got dropped from a subsequent revision. Our goal this time is to standardize existing practice in a Fortran-friendly way. My specific goal is that most users can just use the standard form without changing their code.

ur · ‎11-28-2022

http://www.daniellnagle.com/coco.html

Familiar with it. Used it. Nothing wrong with it, and it has an open-source implementation in Fortran, which you would have thought would have made it very popular in the Fortran community. I think you are on the right track, as the only thing wrong with COCO was that it was not cpp-like. Heck, m4 has been around "forever" and is far more powerful, and I have seen it used only a handful of times on Fortran.

I really dislike several things about cpp: everything being defined as strings and only being evaluated in a # directive; trouble (at least in past versions) with // and /*; and case sensitivity, which caused problems when Fortran was run through various tools and "beautifiers" that changed case and messed up macro expansions, and so on. But it and similar fpp tools are the de facto standard, and it sounds like you are taking that into consideration.

Is the proposal far enough along that a prototype would be useful? I know Fortran templating is in process, but a looping capability with array variables, handy for generating generic routines that vary only in type characteristics, would be a nice feature that might encourage switching, along with being able to expand macros inside quoted strings.

Ron_Green · ‎11-28-2022

It does surprise me to see these without underscores. I've looked at dryrun output for years and these never caught my eye. And these two undecorated macros are only seen on Linux. Both macOS and Windows have OS macros defined, but with appropriate decoration. This must be some legacy thing, especially the "unix" macro, which I would guess goes back to DEC days. I'll bring this up at staff and see if anyone can remember what these are used for, if at all. If they're not being used I'll see if we can remove them in a future update.

ur · ‎11-28-2022

That would be great. "i386" is in there on some platforms too, I believe. Someone else complained about that one to me; but that is far less likely to be in a recent code in particular, and less commonly occurring; but sure would drive someone crazy with an INTEGER value of that name if it ever happened. As Lionel suggested, everything predefined should have a "_" or "__" prefix.

Since "unix" is not even documented it would seem like it could go. Although it puzzles me that at least in the ifort(1) man-page __FILE__, __DATE__, __TIME__, and __LINE__ do not appear, which are my favorite ifort macros. Definitely keep those. Although I guess those are not really normal macros, as they change in the file as it is processed as appropriate.

Those would be really nice to be standardized in any future standard preprocessor.

Steve_Lionel · ‎11-28-2022

See also the Intel Communities thread "Why is the word "unix" a predefined pre-processor macro in ifort 2021.7".

ur · ‎11-28-2022

Missed that. Confirmation though.

jimdempseyatthecove · ‎11-29-2022

While you are at it ("standard" fpp), have a compiler option and an fpp option that emit all the predefined macros with their values and/or descriptions (e.g. __LINE__ should report "Text value of source line number" or something like this).

Perhaps -verbose could be considered (similar to OpenMP environment variable).

Jim Dempsey

ur · ‎11-29-2022

I have found it useful to actually have a directive that outputs the macro definitions as comments, so a "#show" prints a table of all the macros, and a "#show namea nameb name*" does the same with just the listed names, but allows simple globbing. Simple and sometimes useful for distinguishing between output files when actually retaining the intermediate files. All the output is written to the output file with lines starting with ! (not standard, but even a lot of pre-f90 compilers allowed ! as a comment) so the output is still valid Fortran.
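A directive of this kind is simple to prototype. The Python sketch below is one possible reading of the description, not the implementation referred to in this post: the table layout, the function names, and the use of shell-style matching via fnmatch are all assumptions.

```python
from fnmatch import fnmatchcase


def show(macros, patterns=()):
    """Return '!'-prefixed lines listing macros; no patterns means all."""
    names = sorted(
        name for name in macros
        if not patterns or any(fnmatchcase(name, p) for p in patterns)
    )
    width = max((len(n) for n in names), default=0)
    return [f"! {name.ljust(width)} = {macros[name]}" for name in names]


def handle_directive(line, macros):
    """Handle a hypothetical '#show [pattern ...]' line; None if not #show."""
    words = line.split()
    if not words or words[0] != "#show":
        return None
    return show(macros, words[1:])


macros = {"namea": "1", "nameb": "2", "name_long": "3", "other": "4"}
for out in handle_directive("#show namea name_*", macros):
    print(out)  # every emitted line starts with '!', so it is a Fortran comment
```

Because each emitted line is a comment, the table can be written straight into the intermediate file without affecting compilation.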

An impressive feature of COCO was that it could be reversible. It can keep all the directives encountered as comments so you can recover the unprocessed original file from the post-processed file. On the one hand I thought that was a really clever feature, but in practice (I used COCO for a period of time; but found if I wanted to share files with a large audience that others just wanted cpp directives, as COCO was never supported by the various compilers) I really did not use it, as I generally used the output as an intermediate file that was removed after compilation.

Assuming cpp/fpp is the model, other supplemental directives not found in those are

#import VARNAME

which imports an environment variable as if it had been defined with -DVARNAME=VALUE

#OUTPUT filename [--append]

which makes it easy to have a single file that outputs Fortran, C, markdown ... sections, but would complicate a preprocessor being "inline" in the compiler, which I hope is an expected feature of a standard preprocessor, since that eliminates having to generate (or at least retain) intermediate files.

Another is being able to define reusable blocks of plain text and reuse them or loop over them, applying an optional filter that can convert them all to comments, convert them all to a character variable definition, or convert them to WRITE statements. That makes maintaining comments and help text a lot easier, for example, as you can just type it as plain text.

I would be content with just cpp-like functionality, but those are features I use a lot that cpp(1) does not provide, except that block comments are typically (not in all fpp flavors) supported. I think a #import would be useful and simple, though. Perhaps a group consensus would be that it is problematic (making sure it is not inadvertently the wrong value, ...), so I am not sure even that would make it into a first-generation standard utility.
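To make the #import idea concrete, here is a minimal Python sketch of how such a directive might behave. It is an assumption-laden illustration, not the behaviour of any existing preprocessor: the function name, the environ parameter, and the choice to fail on an unset variable are all hypothetical.

```python
import os


def handle_import(line, macros, environ=os.environ):
    """Process a hypothetical '#import NAME' line.

    Returns True if the line was an #import directive. An unset variable
    raises KeyError rather than silently defining an empty macro, because
    a wrong value slipping in unnoticed is the main risk with this feature.
    """
    words = line.split()
    if len(words) != 2 or words[0] != "#import":
        return False
    name = words[1]
    macros[name] = environ[name]  # same effect as -DNAME=VALUE
    return True


macros = {}
# A dict stands in for the real environment so the example is reproducible.
handle_import("#import BUILD_TAG", macros, environ={"BUILD_TAG": "nightly"})
print(macros)
```

Failing loudly on a missing variable is one way to address the concern about picking up an unintended value; a real design might prefer a default or a warning instead.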

I suspect there will be interesting discussions about what file suffixes should be processed (.f90 versus .F90, or new ones like .ff and .FF, or none so that you have to indicate it with command switches, ...). Fortran always distances itself from those kinds of system-dependent issues, but I would like to see something emerge where I can say "f90 a.f b.F90 c.f90 d.F" and have it work.

Steve_Lionel · ‎11-29-2022

Thanks - I have recorded these comments for use in the committee's discussions. As for file suffixes, etc., I think at best it could be a Note to the effect of "If the processor supports source file name suffixes that can be represented as upper or lower case, it is recommended that, in the absence of processor options specifying otherwise, source files with suffixes beginning with the capital letter F be preprocessed by default, while others not be preprocessed by default."
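Read literally, the suggested Note amounts to a one-line rule. The Python sketch below shows that rule under the assumption that "suffix" means the text after the last dot in the file name; the function name is hypothetical and processor options that override the default are not modelled.

```python
import os


def preprocess_by_default(filename):
    """True if the suffix begins with a capital 'F' (e.g. .F, .F90, .FF)."""
    suffix = os.path.splitext(filename)[1]  # '.F90' for 'b.F90', '' if none
    return suffix.startswith(".F")


for name in ["a.f", "b.F90", "c.f90", "d.F"]:
    print(name, preprocess_by_default(name))
```

Under this rule a command line such as "f90 a.f b.F90 c.f90 d.F" would preprocess only b.F90 and d.F.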