What are the historical origins of adding trailing underscores to Fortran symbols?

DataScientist · ‎04-03-2020

Why do Fortran compilers, by convention, add trailing underscores to Fortran symbols? What is the historical context? Which compilers follow this convention? Is it a reliable, lasting convention to use in software distributions, or is the iso_c_binding module the preferred approach?

https://docs.oracle.com/cd/E19957-01/805-4940/z400091044a7/index.html

Steve_Lionel · ‎04-03-2020

This is a UNIX convention and not consistent. It started with g77. Don't rely on it - use BIND(C). ISO_C_BINDING is just a bunch of declarations.

mecej4 · ‎04-03-2020

See section 4.1 of the manual for the Feldman and Weinberger f77 compiler for Unix, available in PDF form at https://wolfram.schneider.org/bsd/7thEdManVol2/f77/f77.pdf .
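
For readers who have not seen the convention from the C side, here is a minimal sketch. It assumes the f77-style rules (lowercase name, one trailing underscore, arguments passed by reference) and that Fortran INTEGER matches C int, neither of which is guaranteed on a given compiler. FC_NAME, FC_NO_UNDERSCORE, addone and add_one are hypothetical names chosen for illustration; the addone_ body is a C stand-in for what the Fortran compiler would emit.

```c
/* FC_NO_UNDERSCORE is a hypothetical build-time switch, not a standard macro.
 * Define it for a compiler or option set that does not append the underscore. */
#ifdef FC_NO_UNDERSCORE
#define FC_NAME(lower) lower
#else
#define FC_NAME(lower) lower##_
#endif

/* Stand-in, written in C, for the object code of
 *     SUBROUTINE ADDONE(N)
 *     INTEGER N
 *     N = N + 1
 *     END
 * under the assumed convention: lowercase name, trailing underscore,
 * argument passed by reference. */
void FC_NAME(addone)(int *n)
{
    *n += 1;
}

/* C-side wrapper that hides the decorated name and the by-reference call. */
int add_one(int n)
{
    FC_NAME(addone)(&n);
    return n;
}
```

Calling add_one(41) goes through the decorated symbol and yields 42. The macro is exactly the kind of per-compiler configuration that BIND(C) is meant to make unnecessary.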

DataScientist · ‎04-03-2020

Both comments are very helpful and informative. Thank you!

DataScientist · ‎04-04-2020

So, here is a problem I encountered upon adding bind(c) to the subroutines.

1. With this attribute, all procedure arguments must also become bind(c); otherwise, ifort does not compile the code.

2. Even worse, gfortran does not allow an assumed-length character dummy argument such as character(*), intent(in) :: string at all:

Error: Character argument 'string' at (1) must be length 1 because procedure 'foo' is BIND(C)

Intel, however, does not complain at all.

So which one is right, ifort or gfortran? If this is a bug in Intel, then this whole business of adding bind(C) seems to create more problems than the simple problem of dealing with a trailing underscore in the subroutine's name (foo_).

This is a bit disappointing, as it implies that a standardized, portable procedure name can only be achieved if the developer and user both agree to make all of their procedures C-interoperable (to some extent, if not fully). This can have cascading effects on many other interfaces in the code, all of them requiring the bind(C) attribute and compliance with its rules (unless the developer or user is willing to sacrifice some performance in exchange for a wrapper that breaks this chain of bind(C) requirements on the procedures).

mecej4 · ‎04-04-2020

There is an example of passing a string to a C routine in the Fortran 2008 standard, Note 15.22. Here is the code, ready to build and run, with Ifort or GNU compilers. The Fortran program:

Program PassString2C
   use, intrinsic :: iso_c_binding, only: c_char, c_null_char
   interface
      subroutine copy(in, out) bind(c, name = 'copy')
         import c_char
         character(kind=c_char), dimension(*) :: in, out
      end subroutine copy
   end interface
   character(len=10, kind=c_char) :: digit_string = &
      c_char_'123456789' // c_null_char
   character(kind=c_char) :: digit_arr(10)
   call copy(digit_string, digit_arr)
   print '(1x, 9a1)', digit_arr(1:9)
end Program

The C routine:

void copy(char in[], char out[])
{
    char *p = in, *q = out;
    while (*p)
        *q++ = *p++;
    *q = '\0';
}

Note that the C routine relies on null termination, and that the Fortran caller provides that termination. The caller has allocated the character variables with sufficient lengths to accommodate the null terminators.
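
If the caller forgets the terminator or under-sizes the output array, copy() will run past the end of a buffer. As a hedged sketch, a bounded variant could take the output capacity explicitly. copy_bounded and its cap parameter are hypothetical and not part of the example from the standard.

```c
#include <stddef.h>

/* Hypothetical bounded variant of copy(): writes at most cap - 1 characters
 * from in to out, always null-terminates out when cap > 0, and returns the
 * number of characters copied (excluding the terminator). */
size_t copy_bounded(const char in[], char out[], size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return 0;                 /* no room even for the terminator */

    while (n + 1 < cap && in[n] != '\0') {
        out[n] = in[n];
        n++;
    }
    out[n] = '\0';
    return n;
}
```

On the Fortran side the extra argument would presumably be declared as integer(c_size_t), value in the interface block, so that it is passed by value as C expects.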

Steve_Lionel · ‎04-04-2020

In Fortran 2003 and 2008, CHARACTER(*) was not "interoperable". Fortran 2018 added the ability to have a CHARACTER(*) argument in an interoperable (BIND(C)) routine, but the other end has to supply or receive what Fortran calls a "C descriptor" for it. Intel Fortran has supported this since version 16; I think gfortran added it only very recently.

The catch is that if you are not aware of the need for a C descriptor on the C side, you'll not understand the behavior you get.

Here's an example I have around that shows use of C descriptors for character variables, with the added twist of it being deferred-length allocatable, but you'll get the idea.

#include "ISO_Fortran_binding.h"
#include <string.h>
#include <stdio.h>

extern "C" void greetings(CFI_cdesc_t * descr);

int main()
{
	int status;
	CFI_CDESC_T(0) cdesc;

	// Create our own local descriptor for an allocatable string
	status = CFI_establish((CFI_cdesc_t *)&cdesc, NULL,
                        CFI_attribute_allocatable,
 		                CFI_type_char, 1, 0, NULL);
	//Allocate the string to length 7
	status = CFI_allocate((CFI_cdesc_t *)&cdesc, NULL, NULL, 7);
	// Copy in 'Hello, '
	memcpy(cdesc.base_addr, "Hello, ", 7);
	// Call Fortran to append to the string and print it
	greetings((CFI_cdesc_t *)&cdesc);
	printf("Length is now %zu\n", cdesc.elem_len);
	status = CFI_deallocate((CFI_cdesc_t *)&cdesc);
}

subroutine greetings (string) bind(C)
    implicit none
    character(:), allocatable :: string
    
    string = string // 'Zurich!'
    print *, string
end subroutine greetings
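
One consequence worth spelling out: the character data behind a descriptor carries a length (elem_len) but no null terminator, so it cannot be handed directly to C string functions. The following is a minimal sketch of the conversion, with plain arguments standing in for the descriptor's base_addr and elem_len so that it compiles without ISO_Fortran_binding.h. fortran_to_cstring is a hypothetical helper name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: base and len stand in for the base_addr and elem_len
 * members of a C descriptor for a scalar character variable. Returns a newly
 * allocated, null-terminated copy that the caller must free(), or NULL if
 * the allocation fails. */
char *fortran_to_cstring(const void *base, size_t len)
{
    char *s = malloc(len + 1);

    if (s == NULL)
        return NULL;
    if (len > 0)
        memcpy(s, base, len);     /* the Fortran data has no terminator */
    s[len] = '\0';                /* add the one C expects */
    return s;
}
```

With a real descriptor the call would look something like fortran_to_cstring(cdesc.base_addr, cdesc.elem_len), made after greetings() returns.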

DataScientist · ‎04-04-2020

Steve, mecej4, these are great solutions. Thank you. Regarding my other point, suppose the code is not meant to be mixed with C in any way. For example, we would like to build a DLL to be used from Fortran, but by code compiled with compilers other than the one used to build the DLL. Since module files are compiled differently by different compilers, the DLL symbols will all be mangled according to the rules of the specific compiler version used to build the DLL (and will hence be unusable by code compiled with any other compiler or compiler version). If we add the bind(C) attribute, then the F2018 C-interoperability requirements are imposed. For example, I expect coarray dummy arguments to be illegal with the bind(C) attribute, even though we have no intention of using this code outside Fortran.

So, is there a way to bypass this limitation of bind(C), that is, to generate a global name without imposing the C-interoperability rules?

IanH · ‎04-04-2020

Different compilers on the same platform differ in how they implement language features. The differences are such that the implementations are often incompatible. Differences in name mangling when generating symbols for the linker are just one (relatively superficial) aspect; there are many more.

You might be able to get away with mixing object code from different compilers if you are really restrictive in what gets passed from the code from the primary compiler to the secondary, and perhaps really restrict what you do inside the code from the secondary. Those restrictions will be more severe than restrictions on BIND(C) procedure interfaces (more severe, because some implementations need to change the way they pass arguments for BIND(C) procedures in order to be consistent with the BIND(C) rules).

Assuming that the calling convention (symbol naming, which arguments go into which registers, who manages the stack) is fundamentally compatible, consider that there will still be differences in I/O implementation, memory management, array descriptors, polymorphic and parameterised type implementation, coarray implementation, error handling...

These problems can emerge even when using the same compiler, but with different compile options.

Steve_Lionel · ‎04-04-2020

When you say BIND(C), the compiler MUST downcase the name and then apply whatever "decoration" the "companion C processor" would use. For example, on 32-bit Windows that means a leading underscore, but on 64-bit Windows, no underscore. You can use the NAME= specifier on BIND(C) to give the exact spelling you want, but the decoration still happens if C would add it.
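
As a rough illustration of that rule, the sketch below models how the linker symbol is derived. It is not any compiler's actual code: binding_symbol is a hypothetical function, the platform's C decoration is reduced to a single "leading underscore" flag, and a NAME= value is assumed to have no surrounding blanks.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of BIND(C) naming.
 *   proc_name: the Fortran procedure name as written in the source
 *   name_spec: the NAME= string, or NULL if NAME= was not given
 *   c_adds_leading_underscore: nonzero if the companion C compiler decorates
 *                              names that way (e.g. 32-bit Windows)
 * Writes the symbol into out (capacity cap). Returns 0 on success, or -1 if
 * out is too small. */
int binding_symbol(const char *proc_name, const char *name_spec,
                   int c_adds_leading_underscore, char *out, size_t cap)
{
    const char *label = name_spec ? name_spec : proc_name;
    size_t len = strlen(label);
    size_t pre = c_adds_leading_underscore ? 1 : 0;

    if (pre + len + 1 > cap)
        return -1;
    if (pre)
        out[0] = '_';
    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)label[i];
        /* NAME= is used exactly as spelled; otherwise the name is downcased. */
        out[pre + i] = name_spec ? (char)ch : (char)tolower(ch);
    }
    out[pre + len] = '\0';
    return 0;
}
```

Under this model, a procedure written as Greetings with plain BIND(C) becomes greetings, while the same procedure with NAME='Greet' on a platform where C adds a leading underscore becomes _Greet.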

Generally you can't mix code from different Fortran compilers in an application, though if you isolate them to DLLs it may work. However, calling from one Fortran to another is not likely to be successful except in trivial cases.

The whole point of the C interop features was to eliminate the need for compiler-specific naming and argument passing when working with other languages.

FortranFan · ‎04-04-2020

A. King wrote:
so, here is a problem encountered upon adding bind(c) to the subroutines.
1. with this attribute, all procedure arguments must also become bind(c), otherwise, ifort does not compile.
..
So which one is right? ifort or gfortran? ..

Re: "all procedure arguments must also become bind(c)," that's not necessarily the case: type(c_funptr) option can be of help in such situations.

Also, gfortran is increasingly falling behind in its support for the current Fortran standard, especially the enhanced interoperability with C, though there has been some advancement starting with version 9. However, one has to contend with a growing backlog of pending bug fixes in gfortran.
