
schulzey · ‎09-16-2016

After upgrading to the 2017 ifort compiler for Windows I am having a problem with passing an array dimensioned as Array(n) to a subroutine that has it dimensioned as Array(n,1). It seems like the array indices are getting scrambled in the subroutine. The problem doesn't occur in the 2017 32-bit compiler or in any previous versions of the 64-bit compiler.

Has something changed that makes this practice now illegal or is it potentially a bug?

See attached example.

Peter

mecej4 · ‎09-16-2016

Your program has a bug -- passing default integer arguments to a subprogram that expects INTEGER*2 arguments. The behavior of a non-conforming program is unpredictable. It may change from compiler to compiler, and may be affected by the compiler options used. You should not expect that the (mis)behavior of the program will remain the same when a newer version of the compiler is used.

schulzey · ‎09-16-2016

But if I check the value of Rows and Cols in the subroutine they are correct at 12 and 1. Plus, why are the contents of Array2D correct in the subroutine but the contents of Array1D are not?

And why does the 32-bit compiler work correctly but not the 64-bit?

Peter

mecej4 · ‎09-16-2016

To answer your questions one has to examine the disassembly of the OBJ files produced by the compilers. I think that it is a waste of time to look at such output when it comes from a program with bugs. How buggy code gets translated is entirely up to the compiler writers.

mecej4 · ‎09-16-2016

Peter,

It turns out that there is an underlying optimizer bug in IFort 17 (32-bit and 64-bit targets) that is alive in your program. Unfortunately, the other issues with your test program distracted me from being alert to the possibility of an optimizer bug. Furthermore, optimizer bugs typically cannot be probed using a symbolic debugger.

I have created a simplified reproducer, and you can find the details at https://software.intel.com/en-us/forums/intel-visual-fortran-compiler-for-windows/topic/685354 .

A workaround for you to consider: avoid using any optimization for those portions of your source code that contain INTEGER*2 variables. I realize that this may not be welcome, but a question to consider would be whether INTEGER*2 variables could be avoided altogether in your code.

Steven_L_Intel1 · ‎09-16-2016

I have escalated this to the developers as issue DPD200414427. If the loop index is made INTEGER(4), I don't see the problem.

schulzey · ‎09-16-2016

Does this mean that although my practice of passing an INTEGER constant to a subroutine that expects it as INTEGER*2 is not recommended, it should still have worked OK if the optimiser bug wasn't present?

The reason I ask is because my application is quite large and it would take weeks (maybe months) to track down and eliminate all the INTEGER*2s completely. It is not just a matter of simply replacing them - there are other implications for my application that make it very tedious.

Peter

mecej4 · ‎09-16-2016

No, the two issues are not so closely related. Here is a small example to convince you that you have to bite the bullet and fix your code, unless you can establish that the only instances (of passing 4-byte or larger INTEGER constants instead of 2-byte integer arguments) in your entire code are of such a nature that they will work correctly on, say, all little-endian CPUs, and that you have no instances of passing 4-byte variables as actual arguments when 2-byte variables are expected.

Main program:

      program tsti2
      implicit none
      integer b
      call sub(5,b)
      write(*,10)' 5 X 2 = ',b
      call sub(-5,b)
      write(*,10)'-5 X 2 = ',b
   10 format(1x,A10,2x,I10)
      end

Subroutine:

      subroutine sub(a,b)
      implicit none
      integer*2 a,b
      b=a*2
      return
      end

The program output, with no optimization (/Od) will probably be quite different from the expected results of +10 and -10.
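For contrast, here is a minimal sketch of a conforming version of the same test, in which the actual and dummy arguments have the same kind. The program name tsti2_fixed and the use of int16 from iso_fortran_env are illustrative choices, not part of the original example. Making sub an internal procedure gives it an explicit interface, so the compiler can diagnose a kind mismatch at the call site.

```fortran
program tsti2_fixed
  use iso_fortran_env, only: int16
  implicit none
  integer(int16) :: b            ! same kind as the dummy argument in sub

  call sub(5_int16, b)           ! the literal carries an explicit 16-bit kind
  write(*,10) ' 5 X 2 = ', b
  call sub(-5_int16, b)
  write(*,10) '-5 X 2 = ', b
10 format(1x,A10,2x,I10)

contains

  ! Internal procedure: the interface is explicit, so argument kinds are checked.
  subroutine sub(a, b)
    integer(int16), intent(in)  :: a
    integer(int16), intent(out) :: b
    b = a * 2_int16
  end subroutine sub

end program tsti2_fixed
```

With matching kinds, the results no longer depend on byte order or on whatever happens to be in the upper bytes of b.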

Steven_L_Intel1 · ‎09-16-2016

It is nonstandard to pass a different kind. As an extension, we let you do it for a constant when there's an explicit interface. (If there isn't, we might complain if interface checking is on - I don't recall.) But as mecej4 says, that's not relevant to the bug. What does matter is using an INTEGER*2 as the loop and array index.

schulzey · ‎09-16-2016

I only ever pass an INTEGER constant to a subroutine that expects it as an INTEGER*2 when it is used as a flag (that affects the logic via IF statements, for example, in the subroutine) or as an array size value (as per Rows and Cols in my sample code). I never pass it in as a variable, especially one that returns a value to the calling routine as you have done with variable b in your example, mecej4. Until I find time to change all the INTEGER*2s in my application, will the compiler allow my current practice?

Regarding the optimiser bug, is it correct that I can avoid it by using /Od or by declaring local variables I and J in my sample code as INTEGER?

Peter

mecej4 · ‎09-17-2016

You should be OK as long as the constants passed fall in the range of 16 bit signed integers (-32768 to +32767). If you pass -32769, however, the subroutine will see it as +32767, and that would be a major error in any scientific/engineering application.
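The range remark can be checked with a small standard-conforming sketch. It uses transfer to reinterpret the four bytes of a default-size integer as two 16-bit integers, which on a little-endian CPU mimics what a routine expecting INTEGER*2 would read from the start of a 4-byte argument. The program name and the test values are illustrative assumptions.

```fortran
program low16_demo
  use iso_fortran_env, only: int16, int32
  implicit none
  integer(int32), parameter :: vals(3) = [5_int32, -5_int32, -32769_int32]
  integer(int32), parameter :: lo = -int(huge(1_int16), int32) - 1   ! -32768
  integer(int32), parameter :: hi =  int(huge(1_int16), int32)       ! +32767
  integer(int16) :: halves(2)
  logical :: in_range
  integer :: k

  do k = 1, size(vals)
    ! Reinterpret the 4 bytes as two 16-bit integers. On a little-endian
    ! CPU, halves(1) holds the low-order 16 bits of vals(k).
    halves = transfer(vals(k), halves)
    in_range = vals(k) >= lo .and. vals(k) <= hi
    print '(a,i8,a,i8,a,l2)', 'value ', vals(k), &
          '   first 16 bits read as ', halves(1), '   in range:', in_range
  end do
end program low16_demo
```

On a little-endian machine the first two values should come through unchanged, while -32769 should be read as +32767, as described above. On a big-endian machine halves(1) would hold the high-order bits instead, which is why the practice is not portable.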

Regarding the other problem, i.e., the optimization bug, no guarantees can be given, and you should probably run tests. Build two EXEs, one with /Od and the other with your normal optimization level, run them on the same input data set, and compare the output. Make sure that the runs exercise code paths containing DO loops (and, perhaps, implied-DO lists) whose control variables are of INTEGER*2 type.

Steven_L_Intel1 · ‎09-17-2016

At least in the test case I used, making the loop index INTEGER(4) not only avoided the bug but resulted in better code. There's no reason to use small integer variables unless you specifically need the restricted range, as these are not optimized as well.
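A minimal sketch of that workaround, assuming a routine shaped like the one described earlier in the thread (a 12-element array received as Array(Rows,Cols) with Rows = 12 and Cols = 1). The attached example is not reproduced here, so FillArray, nRows, nCols and the values stored are hypothetical. Only the loop indices I and J change to default INTEGER; the 16-bit size arguments stay as they are.

```fortran
program index_kind_workaround
  implicit none
  integer(2), parameter :: nRows = 12, nCols = 1   ! legacy 16-bit sizes
  real :: Array1D(12)

  ! A rank-1 actual argument is associated with the explicit-shape
  ! rank-2 dummy through sequence association.
  call FillArray(Array1D, nRows, nCols)
  print '(12f7.1)', Array1D

contains

  subroutine FillArray(Array, Rows, Cols)
    integer(2), intent(in) :: Rows, Cols      ! dummy kinds unchanged
    real, intent(out) :: Array(Rows, Cols)
    integer :: I, J                           ! default-kind loop and array indices

    do J = 1, Cols
      do I = 1, Rows
        Array(I, J) = real(I) + 100.0 * real(J)
      end do
    end do
  end subroutine FillArray

end program index_kind_workaround
```

Whether this avoids the bug in every affected routine is not established in the thread; it only reflects the observation for the test case above.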

TimP · ‎09-17-2016

An Integer(2) scalar will always be less efficient than a default integer on any platform produced in the last 20 years. Short integers may be useful for arrays, but non-vector operations on them will involve local 32-bit copies.

schulzey · ‎09-17-2016

Great, thanks. Just one final thing, is there a web page that lists the known bugs in the latest compiler (such as the optimiser bug), so that users can be aware and work around them?

Peter

Steven_L_Intel1 · ‎09-17-2016

Sorry, we don't have such a page. Realistically, most of the bugs defy concise descriptions, and this goes triply for optimizer bugs. Often there is a peculiar combination of things that triggers a bug, and I don't think such a list would be helpful. Sometimes, if an issue is reported a lot, we'll create a web article on that specific issue.

schulzey · ‎09-19-2016

I can't use /Od because it slows down my application too much, plus removing all the INTEGER*2 loop indexes from my code is going to take a long time.

So that I have an interim solution, is it possible to find out more about what is triggering the bug so that I can avoid it? I have many other examples of INTEGER*2 loop indexes in my application that don't seem to have a problem. What is so special about the use of the J variable in my sample code that triggers the bug and why doesn't the I loop index in my sample code also trigger it?

Peter

mecej4 · ‎09-19-2016

The questions you ask presume that there is a simple cause-and-effect relationship between your Fortran code and the optimizer bug. That is an incorrect presumption. In fact, you noted in #1 that the bug did not occur with the 32-bit compiler. Yet the similar code in my reproducer, referred to in #5, did exhibit the bug in 32 as well as 64 bits.

You can certainly use the older compiler if that can be used with optimization and without this specific bug. You can also use the new compiler, specifying /Od only for those subprograms that cause the bug to appear.
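One possible way to limit the effect to individual subprograms without splitting them into separately compiled files is Intel Fortran's NOOPTIMIZE directive, placed at the top of an external procedure. The sketch below is an assumption about how that could look. SumLegacy and the driver are hypothetical, the thread does not confirm that the directive avoids this particular bug, and its placement rules should be checked against the documentation for the compiler version in use. Other compilers treat the directive line as a comment.

```fortran
program noopt_demo
  implicit none
  integer(2) :: Rows, Cols
  real :: Array1D(12), Total
  external :: SumLegacy

  Rows = 12
  Cols = 1
  Array1D = 1.0
  call SumLegacy(Array1D, Rows, Cols, Total)
  print '(a,f6.1)', 'Total = ', Total
end program noopt_demo

! External subroutine that still uses INTEGER(2) loop indices.
subroutine SumLegacy(Array, Rows, Cols, Total)
  !DIR$ NOOPTIMIZE
  implicit none
  integer(2), intent(in) :: Rows, Cols
  real, intent(in)  :: Array(Rows, Cols)
  real, intent(out) :: Total
  integer(2) :: I, J          ! legacy 16-bit loop indices, left unchanged

  Total = 0.0
  do J = 1, Cols
    do I = 1, Rows
      Total = Total + Array(I, J)
    end do
  end do
end subroutine SumLegacy
```

The rest of the application can then be built at the normal optimization level, and results compared against a full /Od build as suggested earlier.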

schulzey · ‎09-19-2016

Your last sentence highlights my dilemma - I don't know which routines cause the bug to appear because I don't know what to look for, hence my previous post. I have fixed the few routines that seem to cause the bug but I can't be 100% sure that I have them all unless I know more about what triggers the bug.

I couldn't use the 2016 compiler because it had a bug in the Pardiso routine, which I depend on, so I think I am going to have to go back to the 2015 compiler.

Peter

mecej4 · ‎09-20-2016

To check whether the optimizer bug is active or not you have to run test cases with known and correct results. Such results can be obtained by running with /Od applied to all your sources, and the slow speed of the EXE should not matter if the run still completes in, say, a few minutes. After all, you were able to pinpoint the bug location and provide a cut down test case in #1, and you can use the same techniques.

Is there a reason why you cannot use the 2016 compiler to compile and link with the 2017 MKL library?

Is there a link to the Pardiso bug report that you can provide?

schulzey · ‎09-20-2016

I didn't realise you could easily mix the MKL library from one version with the compiler from another. What is the procedure for doing that? That would be a great solution.

mecej4 · ‎09-20-2016

As long as the use of MKL routines is simple, you can simply link with the mkl_rt.lib of the desired MKL version. (You can even link code compiled by compilers other than Intel's with MKL this way, as long as the calling conventions are compatible/conformable.)

After you build the EXE, before running it make sure that the corresponding MKL DLL appears in %PATH% before the DLLs of any other MKL version.
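To confirm which MKL version the EXE actually picks up at run time, a small check program can print the version string reported by the library. This sketch assumes it is linked against the mkl_rt.lib of the desired MKL version, as described above. The program name which_mkl is hypothetical; mkl_get_version_string is an MKL support routine that fills a character buffer.

```fortran
program which_mkl
  implicit none
  character(len=198) :: buf    ! buffer for the MKL version string

  ! Ask the MKL DLL that was actually loaded to identify itself.
  call mkl_get_version_string(buf)
  print '(a)', trim(buf)
end program which_mkl
```

If the printed version is not the one intended, the order of the MKL directories in %PATH% is the first thing to check.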

Problem with 2017 Version of 64-bit Fortran Compiler