Solved: Inter-Module References - Page 2

L__Richard_L_ · ‎03-12-2014

I have modules that reference each other. Is there any way to do this; e.g., pre-compiled modules? My desired code sequence would have the following pattern:

module M1 source. This uses module M2, references subroutine S2, and contains array A1. All objects are public.
module M2 source. This uses module M1, contains subroutine S2, and references array A1. All objects are public.
main program P source. This uses modules M1 and M2 and references subroutine S2 and array A1.

I suspect this will not compile (if I do it in one step) because of the circular references; so, I did not try it. How about the following approach?

Compile M1 source as a separate step.
Compile M2 source as a separate step.
Compile P source with a command that references the pre-compiled forms of M1 and M2.

[[I apologize for earlier not submitting this as a separate thread.]]

Paul_Curtis · ‎03-12-2014

Can't have circular references, and you can't get around this situation by gaming the compilation order. Suggestion: create a module G.f90 (ie, "globals") which contains data type definitions, defined parameters, array A1, and subroutine S2. Then: Main uses (G, M1, M2), and M1 and M2 both use G.
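
A minimal sketch of that layout, assuming everything sits in one source file in dependency order. Only G, M1, M2, P, A1 and S2 come from the posts above; the size n, the routines fill and total, and the body of S2 are hypothetical placeholders.

[fortran]
! Shared data and the shared routine live in G, which USEs nothing.
module G
  implicit none
  integer, parameter :: n = 4        ! assumed array size
  real :: A1(n) = 0.0                ! the shared array
contains
  subroutine S2(scale)               ! body is a placeholder
    real, intent(in) :: scale
    A1 = A1 * scale
  end subroutine S2
end module G

! M1 and M2 both depend on G, but not on each other.
module M1
  use G
  implicit none
contains
  subroutine fill()                  ! hypothetical routine
    integer :: i
    A1 = [(real(i), i = 1, n)]
    call S2(2.0)
  end subroutine fill
end module M1

module M2
  use G
  implicit none
contains
  function total() result(t)         ! hypothetical routine
    real :: t
    t = sum(A1)
  end function total
end module M2

program P
  use G
  use M1
  use M2
  implicit none
  call fill()
  call S2(0.5)
  print *, total()
end program P
[/fortran]

The dependency graph is now a tree (G first, then M1 and M2 in either order, then P), so any build order that compiles G first should work.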

FortranFan · ‎03-17-2014

Jim Dempsey wrote:

The point of my post was to illustrate how to separate the functions, and not how one would code the specific sample code where Bfoo calls Afoo. I left it for an exercise of the reader to write into Afoo.f90 an additional subroutine Afee, and write into Bfoo.f90 an additional subroutine Bfee that makes the calls the other way (thus completing the other half of the circle).

Has L. Richard L. provided more information privately than is being discussed here? His original post and his subsequent comments don't exactly convey such a complex dependency (in fact, OP only mentioned one procedure S2). In the absence of him explaining his exact situation, it may not be fruitful to suggest a module organization and in fact, it can mislead some of the readers. As app4619 says in his last comment, Richard may be able to rearrange procedures in modules - M1, M2, M3, etc. instead of creating one big module.

Jim Dempsey wrote:

If one were to rely on host association, then all the code, excepting the PROGRAM, would reside in a single module.

No, not at all - host association does NOT imply one big module by any means. One can still have a set of modules that combine procedures together based on some classification/criterion and which USE sets of DATA modules as required. As app4619 mentioned, no rocket science here - just common sense to organize data and methods.

Jim Dempsey wrote:

Interface modules are the only way you get access to libraries (e.g. Win32, MKL, ...), and object-only code distribution. There is nothing inherently wrong with interfaces.

Note, the source code in the library should also use the same interface module. Thus assuring interface consistency.

Yes, for external libraries that do not or cannot provide modules for USE in consuming procedures, the choice is between an INTERFACE block, which makes the interface explicit, and an implicit interface - the former is clearly better. However, when one is writing one's own code, having to maintain two interfaces - one in the INTERFACE block and another via the procedure itself - is a recipe for trouble, as explained by Ian. It requires more discipline, and my colleagues and I have been burnt by it.

So if Richard has highly complex dependencies with his procedures, then he has to decide whether it is easier to maintain one big module or use interface blocks and ensure they remain consistent with the procedure declarations.

In the meantime, back to "chomping at the bit" for SUBMODULEs! :-)

Paul_Curtis · ‎03-17-2014

It is frequently the case, at least for my codes, that the overall size and complexity (>100 modules, each with many subs) makes circularities almost impossible to avoid. Although as noted one can't game the dependency checker, I should point out that for Windows programs, circularity issues can be bypassed by using a (custom) windows message (ie, WM_YOURSUBCALL), whereby SendMessage()s with this flag are passed via the opsys into a proc function in some other module and processed there (ie, where there is no circularity) to call the routine. Obviously this technique is pretty stinky and nowhere near standard fortran, but it is undetectable by the compiler and works very well to avoid knotty circularities.

Steven_L_Intel1 · ‎03-17-2014

Paul, my take is that if you have circularities, you haven't partitioned your functionality correctly. Maybe some set of functions needs to be placed in its own module. The hack you propose is like sawing without a blade guard (or eye protection) - you're going to get hurt eventually. Try working with the language rather than subverting it.

jimdempseyatthecove · ‎03-18-2014

Maintaining two interfaces is trivial compared to maintaining all the calls to said interface when the interface changes. Solving the circular reference through the use of interfaces is often less error prone than sawing your code with or without a guard.

Jim Dempsey
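
A minimal sketch of the interface-module approach, under the assumption that the procedures are external (not module procedures). The names Afoo, Afee, Bfoo and Bfee follow the earlier description in this thread; the module name AB_interfaces and all procedure bodies are hypothetical.

[fortran]
! Interface-only module: no executable code, and it USEs nothing,
! so it can always be compiled first.
MODULE AB_interfaces               ! hypothetical name
   IMPLICIT NONE
   INTERFACE
      SUBROUTINE Afoo(x)
         REAL, INTENT(INOUT) :: x
      END SUBROUTINE Afoo
      SUBROUTINE Afee(x)
         REAL, INTENT(INOUT) :: x
      END SUBROUTINE Afee
      SUBROUTINE Bfoo(x)
         REAL, INTENT(INOUT) :: x
      END SUBROUTINE Bfoo
      SUBROUTINE Bfee(x)
         REAL, INTENT(INOUT) :: x
      END SUBROUTINE Bfee
   END INTERFACE
END MODULE AB_interfaces

! Group A (would live in Afoo.f90)
SUBROUTINE Afoo(x)
   IMPLICIT NONE
   REAL, INTENT(INOUT) :: x
   x = x + 1.0
END SUBROUTINE Afoo

SUBROUTINE Afee(x)
   USE AB_interfaces, ONLY : Bfee  ! ONLY keeps Afee's own interface out of scope
   IMPLICIT NONE
   REAL, INTENT(INOUT) :: x
   CALL Bfee(x)                    ! A calls into B
END SUBROUTINE Afee

! Group B (would live in Bfoo.f90)
SUBROUTINE Bfee(x)
   IMPLICIT NONE
   REAL, INTENT(INOUT) :: x
   x = 2.0 * x
END SUBROUTINE Bfee

SUBROUTINE Bfoo(x)
   USE AB_interfaces, ONLY : Afoo  ! ONLY keeps Bfoo's own interface out of scope
   IMPLICIT NONE
   REAL, INTENT(INOUT) :: x
   CALL Afoo(x)                    ! B calls into A
END SUBROUTINE Bfoo

PROGRAM main
   USE AB_interfaces
   IMPLICIT NONE
   REAL :: x
   x = 1.0
   CALL Bfoo(x)
   CALL Afee(x)
   PRINT *, x
END PROGRAM main
[/fortran]

The cost is the one debated in this thread: each interface body duplicates a procedure header, and nothing in the language forces the two to stay in agreement.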

qolin · ‎03-19-2014

Maintaining two interfaces would be enormously easier if the compiler were allowed to check that the interface definition matched the implementation. For some crazy reason the standard disallows this.

Compare with C++, where function prototypes have to appear in a separate .h file, and that file must be included in the source of the function itself. ...I'm not a fan of C++, but this is one area where it works better.

Steven_L_Intel1 · ‎03-19-2014

What the standard disallows is "interface to self", where in a routine you USE a module that declares the interface to the current routine. This would create semantic complications so it has been rejected by the standards committee. The preferable approach is to NOT have two interfaces but to put everything in modules, though I understand that this can be difficult if you are updating old code.

Are you sure about the C++ method? In all the code I have seen, the prototypes are in file scope (which Fortran doesn't have), but not in the functions themselves. But include files are not the same as modules.

qolin · ‎03-19-2014

OK Yes I mean they have to be in the file.

Indeed include files are not the same as modules. If you try declaring the interfaces in include files, it won't help you in Fortran, because Fortran doesn't have a concept of file scope like C++ does.

So the upshot is, as I say, <rant> there is NO WAY the compiler can check the interface definition against the implementation, and remain standard-conforming. </rant> Please correct me if I'm wrong...

Steven_L_Intel1 · ‎03-19-2014

Well, yes, you are wrong. Sort of. If you saw an explicit interface in one compilation you could compare it to the interface that the actual routine ought to have, if you've seen that routine already. We do some of this with the generated interface checking feature, but not to the full extent possible. Ideally you'd not want to require changing the old source, so some sort of automated method would be best.

For example, if you have:

program test
interface
  subroutine sub (x)
  real x
  end subroutine sub
end interface
call sub(3.0)
end
subroutine sub(y)
integer y
end

and compile this with /warn:interface, you get:

t.f90(3): error #8000: There is a conflict between local interface block and external interface block.
subroutine sub (x)
------------------^
t.f90(7): error #6633: The type of the actual argument differs from the type of the dummy argument. [3.0]
call sub(3.0)
---------^

Note that we complained about the disagreement between the explicit interface and the actual procedure. This works even if the subroutine is in a separate source - as long as it was compiled first. Unfortunately, if you put the interface in a module and USE the module, we don't do the check. I have a feature request DPD200022699 filed on this.
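
For contrast, a minimal sketch of the "put everything in modules" approach mentioned earlier in this thread, applied to the same routine. The module name sub_mod and the body of sub are hypothetical. Because sub is a module procedure, its only interface is the one generated from its definition, so there is no second copy to drift out of step.

[fortran]
module sub_mod                 ! hypothetical module name
  implicit none
contains
  subroutine sub(y)
    integer, intent(in) :: y
    print *, 'y =', y
  end subroutine sub
end module sub_mod

program test
  use sub_mod, only: sub
  implicit none
  ! The explicit interface comes from the module, so a mismatched
  ! call such as "call sub(3.0)" is rejected at compile time
  ! without any extra compiler option.
  call sub(3)
end program test
[/fortran]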

FortranFan · ‎03-19-2014

Steve Lionel (Intel) wrote:

What the standard disallows is "interface to self", where in a routine you USE a module that declares the interface to the current routine. This would create semantic complications so it has been rejected by the standards committee. The preferable approach is to NOT have two interfaces but to put everything in modules, though I understand that this can be difficult if you are updating old code.

Are you sure about the C++ method? In all the code I have seen, the prototypes are in file scope (which Fortran doesn't have), but not in the functions themselves. But include files are not the same as modules.

If one wishes to conform to Fortran standards, any discussion on interfaces would eventually lead to SUBMODULEs. I guess that is the standards committee's solution to the problem. However what appears missing is a practical consideration of just how monumental an effort it is to implement SUBMODULEs. And even when first implemented, who knows how long it will take to work all the bugs out so that Fortran coders can reliably make use of them in production code?!

So I wish the standards committee and the compiler vendors would also think of an alternate way to help the situation. Sure it may end up being duplicate once the SUBMODULEs get implemented, but then the language does have some duplicate stuff already e.g., EXTERNAL and PROCEDURE(..) statements.

I've no compiler or programming language development experience, but I sometimes wonder if we can get beyond the current "failure of imagination" and get something along the lines below built into the standard and supported in Fortran compilers:

[fortran]
MODULE My_Interfaces
   INTERFACE
      SUBROUTINE foo(x)
         REAL, INTENT(INOUT) :: x
      END SUBROUTINE foo
   END INTERFACE
END MODULE My_Interfaces

SUBROUTINE bar(x)
   USE My_Interfaces, ONLY : foo
   PROCEDURE(foo) :: bar !.. This line, currently illegal, could inform the compiler
                         !   that the current procedure should have same interface as
                         !   foo
   REAL, INTENT(INOUT) :: x
   x = x + 1.0
END SUBROUTINE bar

PROGRAM main
   USE My_Interfaces, ONLY : foo
   PROCEDURE(foo) :: bar
   REAL :: x
   x = 0.0
   CALL bar(x)
   STOP
END PROGRAM main
[/fortran]

Consider line 11 in above code - I think some approach along these lines can help

resolve the issue of missing "reference to self" aspect in Fortran
avoid the need to maintain TWO interfaces because any inconsistency between the INTERFACE block and the procedure declaration can automatically trigger a compiler error
be built into existing code (it only requires inserting a couple of extra lines)
serve as a standard Fortran equivalent to Intel Fortran's /warn:interfaces compiler feature

Is the Intel Fortran team or the Fortran standards committee thinking along these lines?

Steven_L_Intel1 · ‎03-19-2014

The standards committee has taken this up several times and it has been shot down each time. I don't recall what all the objections were. I think that vendors could attack this in another way that didn't require modifying source - in my view, if you're willing to modify source you're probably willing to put routines in a module. Submodules are to solve compilation cascade issues, which are crippling to developers of large applications. But submodules don't really address the problem you're trying to solve.

FortranFan · ‎03-19-2014

Steve Lionel (Intel) wrote:

... in my view, if you're willing to modify source you're probably willing to put routines in a module. ...

Steve, not necessarily. As you can see from the comments in this thread, coders can also be hesitant to put routines in modules because a) they may end up with massive files which they do NOT like and/or b) significant work is required to resolve circular dependencies that arise from module implementation and they are unwilling to expend that effort. The solution I suggest above can help in such scenarios.

Steve Lionel (Intel) wrote:

... Submodules are to solve compilation cascade issues, which are crippling to developers of large applications. But submodules don't really address the problem you're trying to solve

Steve,

That's not the message I get from the Fortran 2008 standard or from "Modern Fortran Explained" by Metcalf et al. These sources clearly indicate that in addition to solving compilation cascades, SUBMODULEs help with separating procedure interfaces from their implementation. And this is exactly the problem we're trying to solve here. So I disagree - as Ian explains in one of the quotes above, SUBMODULEs would be of great help. Consequently our interest in your "special sauce, EC" product!!!

But unfortunately no compiler I can get my hands on supports SUBMODULEs yet :-(

IanH · ‎03-19-2014

Submodules help indirectly, by reducing the number of valid reasons for using external procedures in new code (what's valid is subject to judgement, but my list of valid reasons is already pretty small) because interface and definition can be robustly separated. But if you still decide that you need to use external procedures, then they don't help.

But... personally... I'd rather see implementations put effort into their "this is what you told me the interface was" and "this is what the interface actually was" consistency checking, rather than forcing that check through some sort of language feature. The consistency checking is a more general solution, equally applicable to existing code as it is to new code.

I think use of external procedures should be very much actively discouraged, so I'm not keen on language features that may suggest their indiscriminate use is legitimate.

(My understanding of C++'s requirements is different from Qolin's... the definition of a function also declares a prototype (apologies if my C++ terminology is sloppy - it has been a while) and there is no requirement that the prototype be declared before the function definition. There is a requirement that the function be declared prior to use, but it is quite possible that a particular prototype declaration and the function definition never end up in the same file scope. Anyway, mismatches between prototype and definition have to be permitted - consider function overloading (but you'll typically get a link time error).)

L__Richard_L_ · ‎03-19-2014

I appreciate this very valuable discussion of a vital topic. Right now, I have bought time by packing all the sensitive references into a very large module. This avoids the circularity issue for now, but this program is young and will grow much bigger in a few months, when I will no doubt follow the most likely suggestions. For now, I want to step back and offer more general thoughts about inter-module references.

The compiler has no business resolving inter-module references. The linker should do this, based on the principle that each actor should do what best fits its capabilities. The compiler knows declarations and use-statements. It should resolve the intra-module references and send all other relevant info to the linker. The linker should sort through this info and resolve the references even if this requires multiple passes. Any one of us could easily write the logic for this.

IanH · ‎03-20-2014

Modules provide information to the compiler for when it compiles later program units. The compiler uses the information, communicated to it by the previously compiled module, to be able to do things like apply the appropriate operations to a variable, emit the correct object code for a particular call or to know the relative layout of a variable in memory. By the time the linker is involved, the concept of a module is irrelevant, bar perhaps some vestiges in the naming convention used for the symbols that the linker plays with.

Consider your original example. When compiling module M2 the compiler sees some operations on a thing named A1. The fact that this thing is an array, of a certain type/kind and with certain attributes, is all information that you tell the compiler in your module M1. If the compiler hasn't seen your module M1, how is it going to know what to spit out for your module M2 for any operations on A1?

Requiring the compiler to be able to revisit compilation of earlier modules when later modules are compiled seems completely impracticable to implement, plus it would open an absolute minefield of circular dependency issues... array A1 in module M1 has the kind of array A2 in module M2 which has the kind of array A1 in module M1 which has...

L__Richard_L_ · ‎03-24-2014

[to IanH]

I can stand back from the details of the current state and view references in a general way. For example, consider the job of compiling the procedures of a procedure. Procedure P has subprocedures P1 and P2 (in that sequence). P1 calls P2. How does P1 get compiled?

The compiler, when it first works on P1 sets up what it knows about the call to P2 but has to save the reference to P2 for later resolution. After the compiler's first pass at compiling P2, it can go back to P1 and fill in the missing info on the call to P2. This filling-in job could also have been done by a separate actor that was given just the info needed to perform the resolution. The absolute minimum of info needed from P1 would be the name P2 and the location at which to store the address of P2. There would be no difference in the outcome of either the compiler or the separate actor doing this. In both situations a second pass is necessary and sufficient to do the job.

I have worked with linkers that had far more capability than that needed to do the job of resolving cross-module references. For example, the IBM mainframe "Linkage Editor" and its replacement "Binder". In fact, you or I could easily outline the logic for this.

FortranFan · ‎03-24-2014

L. Richard L.,

Are you confusing a Fortran MODULE with an archive library (like on Unix/Linux)? Your comments seem to indicate so. Fortran MODULEs have a specific meaning and use with a compiler, and that has nothing to do with linkers. MODULEs provide detailed information on "data" and "methods" to a compiler, and that information is irrelevant by the time the linker comes into the picture.

IanH · ‎03-24-2014

L. Richard L. wrote:
The compiler, when it first works on P1 sets up what it knows about the call to P2 but has to save the reference to P2 for later resolution. After the compiler's first pass at compiling P2, it can go back to P1 and fill in the missing info on the call to P2. This filling-in job could also have been done by a separate actor that was given just the info needed to perform the resolution. The absolute minimum of info needed from P1 would be the name P2 and the location at which to store the address of P2.

The "filling-in job" is more than just resolution. Calling a procedure in modern Fortran involves more than just pushing the memory location of any arguments onto the stack and then transferring execution to some memory location filled in by the linker. Whether P1 can even be successfully compiled, in many cases, depends on the details of P2.

Modules also communicate much more information than just procedure interfaces.

Consequently your separate actor practically looks pretty much like the middle and back-end of a complete Fortran compiler, everything bar the initial lexical and syntax analysis (which itself can't be completed without some ambiguity - consider statement functions) of the source code.

How do you deal with logical silliness such as "...the interface of procedure P1 is the same as the interface of procedure P2 which is the same as the interface of procedure P1 which is the same as...."?

You could probably come up with a series of constraints on source code and some multi-pass whole-of-source compiler that could sort all of this out... but for what benefit? Beyond splitting into multiple modules, from F90 on you can do "forward declaration" of the interface of a procedure using an interface block (which resolves issues with circular procedure references) and in F2008 you can break modules up into submodules (which resolves issues around the module scoped nature of PUBLIC and PRIVATE accessibility). Compiler support for the former has existed for some decades, widespread compiler support for the latter is still a work in progress - but it is a lot closer than some sort of revolutionary rewrite of the typical compile and link build infrastructure.
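
A minimal sketch of how F2008 submodules could express the original M1/M2 example, assuming a compiler with submodule support (which posts in this thread note was not yet generally available). M1, M2, A1, S2 and P come from the original question; S1, the submodule names M1_impl and M2_impl, the array size and all procedure bodies are hypothetical.

[fortran]
! The modules hold only data and interfaces, and neither USEs the other.
module M1
  implicit none
  real :: A1(3) = [1.0, 2.0, 3.0]    ! assumed size and values
  interface
    module subroutine S1()            ! hypothetical routine
    end subroutine S1
  end interface
end module M1

module M2
  implicit none
  interface
    module subroutine S2(factor)
      real, intent(in) :: factor
    end subroutine S2
  end interface
end module M2

! The implementations live in submodules, which may USE the other module.
submodule (M1) M1_impl
  use M2, only: S2
  implicit none
contains
  module subroutine S1()
    call S2(2.0)                      ! M1's code references S2
  end subroutine S1
end submodule M1_impl

submodule (M2) M2_impl
  use M1, only: A1
  implicit none
contains
  module subroutine S2(factor)
    real, intent(in) :: factor
    A1 = A1 * factor                  ! M2's code references A1
  end subroutine S2
end submodule M2_impl

program P
  use M1
  use M2
  implicit none
  call S1()
  call S2(0.5)
  print *, A1
end program P
[/fortran]

The cycle disappears because the mutual references are between submodules and modules, never between the two modules themselves: M1 and M2 compile first in either order, then the submodules and P.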

L__Richard_L_ · ‎03-24-2014

This thread has taken a very interesting turn. Experience has shown me that inter-module references can create a lot of work for me, work that the system could do. For instance, here is an excerpt from the Eclipse workbench (Java) documentation.

[regarding] "Max iterations when building with cycles" ...

"This preference allows you to deal with build orders that contain cycles.
Ideally, you should avoid cyclic references between projects.
Projects with cycles really logically belong to a single project,
and so they should be collapsed into a single project if possible.
However, if you absolutely must have cycles, it may take several iterations
of the build order to correctly build everything. Changing this preference
will alter the maximum number of times the workbench will attempt to
iterate over the build order before giving up."

I agree with avoiding cyclic references and will devote much attention to avoiding them. But, in the cases that admit to a feasible solution, a heavier duty builder (linker) could automate the task.

I also agree with some of the counter-arguments and most of the suggested alternatives. I do not agree with the assertion that my suggested logic is impossible.

Here is a practical situation, one that I have recently confronted. I began working on a very large existing program that had one gigantic module containing almost all the procedures and data. It was borderline non-maintainable. Every procedure could call any other and had access to all module-level data. I quickly broke the program into two modules based on common tasks and a "natural" hierarchy just to see what would happen. Of course, it did not compile because of build order problems. If I had created more modules, the problem would only have gotten worse. Therefore, I will have to use some of the (much appreciated) suggestions of this thread. It will require a lot of work, most of which seems straightforward. However, the resulting program might not have gained much maintainability. I know that a "natural", "maintainable" multi-module build order would be feasible for a multi-pass builder because the original program runs. Unfortunately, such a tool does not yet exist in the Fortran world.

Thanks again for the discussion and all the valuable suggestions.