
Trouble porting legacy MPI app to MIC

Shereef_S_
Beginner

My adviser has a legacy application written in Fortran (mostly) with original dependencies on MKL, MKL BLACS (Intel MPI LP64), and ScaLAPACK. I modified the Makefile to target the associated MIC libraries, and I show those changes below:

Original Lib Dep: -lscalapack -lmkl_blacs_intelmpi_lp64 -lmkl -lpthread -lguide

Modified Lib Dep: -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lmkl_intel_lp64 -lmkl_core -lmkl_intel_thread -lm -fopenmp

The compilation was successful. However, when I run the program on the target MIC (4 cores for starters), it fails when making a call to 'pXsygvx', the eigenvalue solver in ScaLAPACK. Reading the documentation for this function, I checked the return value of the 'info' parameter. But the return value is nonsensical (e.g. -1589991488, -1877071040, etc.). A negative return value should mean that an argument had an illegal value, but these return values seem more like integer overflow to me (just a guess).

Assuming that the application is constructing the eigenvalue matrix correctly (which isn't too far-fetched since the single threaded version runs to completion), then my thought is that there is an incompatibility in the ScaLAPACK library. Alternatively, there's nothing wrong with the library, and the eigenvalue matrix is in fact constructed incorrectly (but then what is the meaning of the 'info' return values?).

Now, I'm no noob, but I am new to the Intel tool suite and libraries. What tools should I use to try and answer either of my questions (is it the library's fault, or is the matrix really busted)? Also, if you have any theories of your own, I would be very interested to hear them. Let me know if you require additional information.

Thanks in advance!

7 Replies
Gregg_S_Intel
Employee

It should be sufficient to compile with ifort -qopenmp -mkl=parallel, without needing to specify individual libraries.

Shereef_S_
Beginner

Thank you for the quick response. Those flags certainly simplify the makefile. I ran 'micnativeloadex -l' to see the dependencies, the most relevant of which are:

-- libmkl_intel_lp64, libmkl_intel_thread, libmkl_core

I'm assuming that the ScaLAPACK library is wrapped up in there somewhere.

However, I still have the same problem. The application segfaults when calling 'pXsygvx', and the 'info' return value is still nonsense.

Assuming that the use of the above compiler flags eliminates the possibility of discrepancies during link, then it is possible that the eigenvalue matrix really does contain an illegal value.

Any suggestions for debugging on a Xeon PHI?

Thanks in advance!

jimdempseyatthecove
Honored Contributor III

Can you insert some diagnostic code into your program just before the failing call, to call pXsygvx with test data where you know (have forced):

1) arrays to be aligned on 64-byte boundaries
2) array sizes that are a multiple of 16 elements for real(4) or 8 elements for real(8)

edit: 2) may need to be a multiple of 16 for real(4) or 8 for real(8) times the number of threads for your test

If the above works, and if the next call fails, what are the differences between the two calls?
If after inserting the test code the next call succeeds, it can be indicative of an alignment issue.
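
As a rough illustration of the sizing rule in 2), the arithmetic can be sketched in shell. The function name padded_length is hypothetical, and the 64-byte unit is taken from the alignment suggestion in 1); this only shows how a test array length could be rounded up, not how the application should allocate its arrays.

```shell
#!/bin/sh
# Round an element count up to the next multiple of one 64-byte block's
# worth of elements (16 for real(4), 8 for real(8)), optionally multiplied
# by the number of threads used in the test.
# padded_length is a hypothetical helper, not part of any Intel tool.
padded_length() {
    n=$1              # requested number of elements
    elem_bytes=$2     # 4 for real(4), 8 for real(8)
    threads=${3:-1}   # thread count, defaults to 1
    unit=$(( 64 / elem_bytes * threads ))
    echo $(( (n + unit - 1) / unit * unit ))
}
```

For example, `padded_length 1001 8` rounds a real(8) array of 1001 elements up to 1008, and `padded_length 1000 4 4` rounds a real(4) array for a 4-thread test up to 1024.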

Jim Dempsey

Shereef_S_
Beginner

Byte-alignment. That is an excellent thought.

As a quick check, I ran an example ScaLAPACK program for exactly the function in question (http://www.netlib.org/scalapack/examples/sample_pdsygvx_call.f). The example runs fine for 2 cores, but nothing else (not sure why, since it is calling BLACS_PINFO). Nonetheless, it demonstrates that the call works.

The code I'm working with is somewhat large, and it's not very straightforward to add a test function, but I'll see what I can do. I'm going to investigate the byte alignment and get back to you as soon as I can.

Thanks again!

Shereef_S_
Beginner

Well, I got it running.

It seems that I did not completely understand the distinction between the *.F and *.f extensions. Apparently, the *.F sources are meant to be passed through the preprocessor and output as *.f. And since this is legacy code with previously generated *.f files, all *.F sources needed to be preprocessed again. I did not pick up on the distinction between these file extensions, and merely assumed that one was meant for F90 and the other for F77. Yeah, wrong!

The big clue was in the mysterious 'info' return value. The reported values were such nonsense that it occurred to me they were never initialized, and perhaps the call was never really being made at all. Further inspection clued me in to the distinction between the two file types. So, I re-ran the preprocessor across all *.F files and recompiled. Voilà! It runs!
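
For anyone hitting the same symptom, here is a minimal sketch of that sanity check in shell. It is not code from the application in this thread. The function name decode_info is hypothetical, and the default of 34 arguments is an assumption based on the netlib PDSYGVX argument list. The decoding follows the netlib ScaLAPACK convention that INFO = -i flags scalar argument i and INFO = -(i*100+j) flags entry j of array argument i.

```shell
#!/bin/sh
# Classify a ScaLAPACK-style INFO value.
# decode_info is a hypothetical helper. nargs is the argument count of the
# routine being checked; the default of 34 is an assumption for PDSYGVX.
decode_info() {
    info=$1
    nargs=${2:-34}
    if [ "$info" -eq 0 ]; then
        echo "success"
    elif [ "$info" -gt 0 ]; then
        echo "computational failure (see routine documentation)"
    else
        abs=$(( 0 - info ))
        if [ "$abs" -le "$nargs" ]; then
            # INFO = -i: scalar argument i had an illegal value
            echo "scalar argument $abs illegal"
        else
            # INFO = -(i*100+j): entry j of array argument i was illegal
            arg=$(( abs / 100 ))
            entry=$(( abs % 100 ))
            if [ "$arg" -ge 1 ] && [ "$arg" -le "$nargs" ]; then
                echo "array argument $arg, entry $entry illegal"
            else
                echo "implausible (argument index $arg); INFO may be uninitialized"
            fi
        fi
    fi
}
```

Running `decode_info -1589991488` lands in the "implausible" branch, because the decoded argument index is far larger than any routine's argument count. That is consistent with INFO never having been written by the call.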

I appreciate all of your assistance. My apologies for not working through this more thoroughly.

Thanks again!

TimP
Honored Contributor III

If you are cross compiling from Windows, the ifort -fpp option is expected to accept fpp style macros. As you may have figured out, .F and .F90 files automatically set -fpp when compiling on Linux. One might expect that -Qopenmp might set -fpp, but you can't count on it.

Unless your program is dependent on some particular pre-processor, such as GNU tradcpp, you don't need to process .F to .f in a separate step. If you do need to do that on Windows, you must use separate folders for .F and .f, as some tools distinguish between them, but most don't.

Shereef_S_
Beginner

I'm running openSUSE and I built my MIC machine myself. I'm still fixing up my total configuration to be more user friendly, but this whole process has been a great learning experience.

I'm more of a C/C++ guy, and I'm picking up Fortran as I go along (since that's what my advisor uses). I presumed that the pre-processor was applied implicitly, but when I realized that the call was never being made, then I eventually concluded that the pre-processor wasn't being applied in the way I thought.

The original developer created a set of modules with compiler directives for varying library calls depending on whether you want LAPACK or ScaLAPACK. I didn't key into the fact that the module was a result of the pre-processor. There is a module 'diag_mod.F' with dependencies on 'diag_r.F', and the resulting module is 'diag_mod.f', with the contents of 'diag_r' "pasted" into it.

I've never used the pre-processor in this way (personally, I try to minimize compiler directives in my code unless there is a good reason for it), so it took me a while to understand what was really going on. I'll have to set aside some time and make some amendments to the Makefile to address this shortcoming.
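
Until the Makefile is fixed, a small shell check along these lines could flag generated files that are out of date. This is only a sketch under assumptions: stale_fortran_sources is a hypothetical helper, and it assumes each *.F source produces a *.f file of the same base name in the same directory (as with 'diag_mod.F' and 'diag_mod.f').

```shell
#!/bin/sh
# List every *.F source in a directory whose generated *.f counterpart is
# missing or older than the source, i.e. needs to be preprocessed again.
# stale_fortran_sources is a hypothetical helper; the one-to-one
# "name.F -> name.f" layout is an assumption about the project.
stale_fortran_sources() {
    dir=${1:-.}
    for src in "$dir"/*.F; do
        [ -e "$src" ] || continue      # the glob matched no *.F files
        out="${src%.F}.f"
        if [ ! -e "$out" ] || [ "$src" -nt "$out" ]; then
            echo "$src"
        fi
    done
}
```

Calling `stale_fortran_sources src` prints one path per line for each source that should be run through the preprocessor again. Note that it only compares each *.F file with its own output, so a change to an included file such as 'diag_r.F' would not be detected; a proper Makefile dependency would be needed for that.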

Thanks again for your assistance. I really appreciate you taking the time to help me.
