Calling Fortran routines from Matlab, Python, R. (Memory related)

Simon_C · 11-24-2022

Everyone,

I have a question that relates to calling Fortran routines in a DLL from Matlab, Python and R. The question is not about how to do that (others are writing the necessary 'wrappers' for me) but rather about the use of memory, and what information the DLL retains between calls.

The Fortran executable program that I am adapting (quite big, there must be about 60 modules) normally runs from the command line in this way:

1. Read information about the type of problems it will be solving, from a file.

2. Use the above information to allocate various arrays and derived types throughout the program, and assign values of various integer scalars that will be used to set the sizes of local arrays in subroutines. This is done just once. (The allocation of arrays occurs throughout the program, and is done on the first calls of the relevant routines.)

3. Read the data for an individual problem from another data file, solve it, write out the results. Repeat these steps (read-solve-write) for as many problems as are present in the data file, which could be any number from 1 to several hundred.

4. Stop when there are no more problems to solve, and exit.

BTW: I do not deallocate arrays at the end, and I don't use pointers.

It isn't hard to write a subroutine so that the code can be run just by calling that routine: the information in (1) above can still be read from a file, input data for individual problems can be provided as arguments to the subroutine, and results can be returned in the same way. This is done one problem at a time.

The first thing I am not clear about, if this subroutine is called from one of the programming environments I've mentioned, is what happens to the allocated arrays and other SAVEd information 'within' the Fortran code *after* the first call of the subroutine. Can I assume, when calling the routine again to solve a second problem, that they all still exist? That is to say, that the setup steps in (1) and (2) above only need to be done on the first call from Matlab etc. in the same way as they would in an executable program?

I realise the above isn't a pure Fortran question, but I'm hoping the answer might be obvious to someone with a better grasp of computer science than I have.

There is a second, related query: supposing the user has been running my code within Matlab, etc., and he/she decides to solve a different type of problem from the one they have just run, such that the initial setup steps (1 and 2) need to be repeated. Am I right in thinking that all I have to do is ensure that the previously allocated arrays are deallocated first? I believe this would work (and I can certainly test it) within a Fortran program but, again, I'm not sure about these other environments. Unfortunately I don't use them myself, and need to rely on testers.
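To make the two questions concrete, here is a minimal sketch in C rather than Fortran (C being the layer the wrappers call through). It assumes the library stays loaded in the same host process between calls; if the host unloads the library or the session is restarted, everything starts from scratch. The names `solve`, `setup`, `work` and `get_setup_count` are hypothetical, and the static variables stand in for SAVEd Fortran data and allocated arrays. The "solver" only sums its input.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical library state. It lives in static storage, so it keeps its
   values between calls for as long as the library stays loaded. */
static double *work = NULL;   /* stands in for an allocated, SAVEd array */
static int work_size = 0;     /* size chosen during setup */
static int setup_count = 0;   /* how many times setup has run */

/* Hypothetical setup (steps 1 and 2): size the work array for a problem type. */
static int setup(int n) {
    assert(n > 0);
    free(work);               /* release the earlier allocation first; free(NULL) is a no-op */
    work = malloc((size_t)n * sizeof *work);
    if (work == NULL) {
        work_size = 0;
        return -1;
    }
    work_size = n;
    setup_count++;
    return 0;
}

/* Hypothetical single entry point (step 3): solve one problem of size n.
   Returns 0 on success, -1 if the allocation failed. */
int solve(int n, const double *input, double *result) {
    if (work == NULL || n != work_size) {   /* first call, or the problem type changed */
        if (setup(n) != 0) return -1;
    }
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        work[i] = input[i];
        sum += work[i];
    }
    *result = sum;
    return 0;
}

/* Lets a caller observe how often setup has been repeated. */
int get_setup_count(void) { return setup_count; }
```

In this sketch a second call with the same size reuses the existing array, and a call with a different size frees the old array before allocating the new one, which is the deallocate-then-repeat-setup sequence described above.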

Any assistance/advice will be greatly appreciated. Thank you.

Simon_C.

JohnNichols · 11-24-2022

Do it all in Fortran; the other programs will just slow down the process.

Simon_C · 11-25-2022

John - "all Fortran" would be my own preference, and the Fortran executables of my programs will be the most capable versions, but users like these other software platforms so I have to try and provide for them too.

JohnNichols · 11-25-2022

The best way to find out is to try a small sample and see what happens. This is common when reading from devices: you have to read and see what you get, as the manual is often seven iterations behind. Python and R are easy to get and do not have a steep learning curve; MATLAB, now that is a dog of another kind.

You will be hard pressed to find someone here who has done what you want done; most of these people are pure Fortran. Sorry.

FortranFan · 11-24-2022

@Simon_C ,

See the thread below from yore that shows a way to consume C++ classes in Fortran code:

https://community.intel.com/t5/Intel-Fortran-Compiler/Calling-C-cpp-objects-from-a-Fortran-subroutine/m-p/1110557/highlight/true#M129102

Note the same idea can apply in reverse and you can make use of it to have your customers (which may be you yourself) consume your Fortran DLLs from other programming languages and platforms such as MATLAB, R, Python, etc., basically anything that allows C language calls as their interoperability layer - note MATLAB, R, Python all do.

So then, what you can try is to refactor your Fortran program(s) as "class"(es) as per the object-oriented paradigm. That is, as derived types with suitable methods that operate on the "data" in the types, either as module procedures, or preferably type-bound procedures. You can then introduce "wrapper" procedures that carry the BIND(C) attribute, meaning they can interoperate with a C companion processor to Fortran. These "wrapper" procedures are what MATLAB, R, Python users will use. These "wrapper" procedures are of 3 categories: "create", "exercise"/employ, and "destroy", as shown in the above link. There may be multiple such procedures, particularly in the "exercise" category, for the various tasks that can be done by your Fortran program(s). You can then try out the code in the above link and see how it does these with a simple stack of integers. That simple stack could be a gazillion lines of Fortran code all encapsulated in Fortran derived type(s) instead of C++ and the idea would still apply.
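As a rough illustration of the three categories, here is a minimal sketch of the C-level interface that such wrappers might present, using a stack of integers like the linked example. This is not the code from that thread: the names `stack_create`, `stack_push`, `stack_pop` and `stack_destroy` are hypothetical, and the body is written in C only to keep the sketch self-contained. In the real thing each function would be a BIND(C) Fortran procedure operating on a derived type, with the caller holding an opaque handle.

```c
#include <assert.h>
#include <stdlib.h>

/* The "class" data. Callers never see this layout, only an opaque handle. */
typedef struct {
    int *items;
    int count;
    int capacity;
} int_stack;

/* "create": build one instance and hand back an opaque handle (NULL on failure). */
void *stack_create(int capacity) {
    if (capacity <= 0) return NULL;
    int_stack *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->items = malloc((size_t)capacity * sizeof *s->items);
    if (s->items == NULL) {
        free(s);
        return NULL;
    }
    s->count = 0;
    s->capacity = capacity;
    return s;
}

/* "exercise": push a value; returns 0 on success, -1 if the stack is full. */
int stack_push(void *handle, int value) {
    int_stack *s = handle;
    assert(s != NULL);
    if (s->count == s->capacity) return -1;
    s->items[s->count++] = value;
    return 0;
}

/* "exercise": pop the top value; returns 0 on success, -1 if the stack is empty. */
int stack_pop(void *handle, int *value) {
    int_stack *s = handle;
    assert(s != NULL);
    if (s->count == 0) return -1;
    *value = s->items[--s->count];
    return 0;
}

/* "destroy": release everything the instance owns. */
void stack_destroy(void *handle) {
    int_stack *s = handle;
    if (s == NULL) return;
    free(s->items);
    free(s);
}
```

Because all state hangs off the handle rather than off global data, a caller can create two stacks, use them independently, and destroy each one when finished with it.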

The callers can then have many instances of the Fortran "class(es)" on each platform, i.e., working on different problems simultaneously or even in parallel! Each instance of the Fortran "class(es)", i.e., derived types, may tap into resources such as powerful, heavy-duty databases or simple files on a given computer and set up the compute tasks to be performed; usually this will happen in the "create" category mentioned above. And during the "exercise" category procedure executions, it may report out data. If there is risk of the same resource getting tapped into at the same time, you have to take the suitable steps (say critical section) or synchronization to ensure there are no data locks.

jimdempseyatthecove · 11-25-2022

>> If there is risk of the same resource getting tapped into at the same time, you have to take the suitable steps (say critical section) or synchronization to ensure there are no data locks.

Yes, but a caution: critical sections within the Fortran DLL will not cooperate with critical sections (mutexes) within the calling application. IOW, use a critical section for a given resource in only one domain or the other. As to which (assuming both/all domains may need this functionality), you will have to think about this very carefully. IOW, necessary critical sections that are improperly constructed often go undetected during program testing, and only show up much later.

Consider:

Fortran DLL needing critical section for resource fubar
App A calling Fortran DLL that uses fubar critical section at point x in code as well as requiring fubar critical section at point y in its own code
Concurrent with App B requiring fubar critical section in its own code.

To further complicate matters, you may not have access to App B code.

Jim Dempsey

Simon_C · 11-25-2022

Thank you FortranFan, Jim,

I like the idea of separate wrappers for "create", "exercise"/employ, and "destroy", but... I am where I am, and single wrappers are being created. I don't have a sufficient understanding of what FortranFan has suggested to go down that route, even if I had the time (for refactoring especially). So, can the two questions that I asked be answered directly? If they don't make sense, or indicate some incorrect assumptions on my part, I hope someone will explain.

Simon_C