After upgrading to the 2017 ifort compiler for Windows I am having a problem with passing an array dimensioned as Array(n) to a subroutine that has it dimensioned as Array(n,1). It seems like the array indices are getting scrambled in the subroutine. The problem doesn't occur in the 2017 32-bit compiler or in any previous versions of the 64-bit compiler.
Has something changed that makes this practice now illegal or is it potentially a bug?
See attached example.
Your program has a bug -- passing default integer arguments to a subprogram that expects INTEGER*2 arguments. The behavior of a non-conforming program is unpredictable. It may change from compiler to compiler, and may be affected by the compiler options used. You should not expect that the (mis)behavior of the program will remain the same when a newer version of the compiler is used.
But if I check the value of Rows and Cols in the subroutine they are correct at 12 and 1. Plus, why are the contents of Array2D correct in the subroutine but the contents of Array1D are not?
And why does the 32-bit compiler work correctly but not the 64-bit?
To answer your questions one has to examine the disassembly of the OBJ files produced by the compilers. I think that it is a waste of time to look at such output when it comes from a program with bugs. How buggy code gets translated is entirely up to the compiler writers.
It turns out that there is an underlying optimizer bug in IFort 17 (32-bit and 64-bit targets) that is alive in your program. Unfortunately, the other issues with your test program distracted me from being alert to the possibility of an optimizer bug. Furthermore, optimizer bugs typically cannot be probed using a symbolic debugger.
I have created a simplified reproducer, and you can find the details at https://software.intel.com/en-us/forums/intel-visual-fortran-compiler-for-windows/topic/685354 .
A workaround for you to consider: avoid using any optimization for those portions of your source code that contain INTEGER*2 variables. I realize that this may not be welcome, but a question to consider would be whether INTEGER*2 variables could be avoided altogether in your code.
Does this mean that although my practice of passing an INTEGER constant to a subroutine that expects it as INTEGER*2 is not recommended, it should still have worked OK if the optimiser bug wasn't present?
The reason I ask is because my application is quite large and it would take weeks (maybe months) to track down and eliminate all the INTEGER*2s completely. It is not just a matter of simply replacing them - there are other implications for my application that make it very tedious.
No, the two issues are not so closely related. Here is a small example to convince you that you have to bite the bullet and fix your code, unless you can establish that the only instances (of passing 4-byte or larger INTEGER constants instead of 2-byte integer arguments) in your entire code are of such a nature that they will work correctly on, say, all little-endian CPUs, and that you have no instances of passing 4-byte variables as actual arguments when 2-byte variables are expected.
program tsti2
   implicit none
   integer b
   call sub(5,b)
   write(*,10)' 5 X 2 = ',b
   call sub(-5,b)
   write(*,10)'-5 X 2 = ',b
10 format(1x,A10,2x,I10)
end

subroutine sub(a,b)
   implicit none
   integer*2 a,b
   b=a*2
   return
end
The program output, even with no optimization (/Od), will probably be quite different from the expected results of +10 and -10.
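For comparison, here is a minimal sketch of a conforming variant of the same test. It assumes the subroutine can be moved into a module (the names `sub_mod` and `tsti2_fixed` are made up for illustration) and that the compiler provides the `int16` kind from `iso_fortran_env`. With the interface visible and every actual argument declared as a 2-byte integer, the kinds match and the result no longer depends on byte order or optimization level.

```fortran
module sub_mod
   use, intrinsic :: iso_fortran_env, only: int16
   implicit none
contains
   subroutine sub(a, b)
      ! Both dummies are 2-byte integers; the module gives callers an explicit interface.
      integer(int16), intent(in)  :: a
      integer(int16), intent(out) :: b
      b = a * 2_int16
   end subroutine sub
end module sub_mod

program tsti2_fixed
   use, intrinsic :: iso_fortran_env, only: int16
   use sub_mod, only: sub
   implicit none
   integer(int16) :: b          ! same kind as the dummy argument

   call sub(5_int16, b)         ! 2-byte literal instead of a default-kind 5
   write(*,10) ' 5 X 2 = ', b
   call sub(-5_int16, b)
   write(*,10) '-5 X 2 = ', b
10 format(1x,A10,2x,I10)
end program tsti2_fixed
```

Because the interface is explicit, a call such as `call sub(5, b)` with a default-kind `b` would normally be diagnosed at compile time instead of silently misbehaving at run time.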
It is nonstandard to pass a different kind. As an extension, we let you do it for a constant when there's an explicit interface. (If there isn't, we might complain if interface checking is on - I don't recall.) But as mecej4 said, that's not relevant to the bug. What does matter is using an INTEGER*2 as the loop and array index.
I only ever pass an INTEGER constant to a subroutine that expects it as an INTEGER*2 when it is used as a flag (that affects the logic via IF statements, for example, in the subroutine) or as an array size value (as per Rows and Cols in my sample code). I never pass it in as a variable, especially one that returns a value to the calling routine as you have done with variable b in your example, mecej4. Until I find time to change all the INTEGER*2s in my application, will the compiler allow my current practice?
Regarding the optimiser bug, is it correct that I can avoid it by using /Od or by declaring local variables I and J in my sample code as INTEGER?
You should be OK as long as the constants passed fall in the range of 16 bit signed integers (-32768 to +32767). If you pass -32769, however, the subroutine will see it as +32767, and that would be a major error in any scientific/engineering application.
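The -32769 case can be illustrated with a small conforming program. This is only a sketch of the bit pattern, not of what the compiler does with a mismatched call: it uses the standard `transfer` intrinsic to view a 4-byte integer as two 2-byte halves, and it assumes a little-endian machine, where the low-order half comes first in memory. The program name `low16` and its variable names are hypothetical.

```fortran
program low16
   use, intrinsic :: iso_fortran_env, only: int16, int32
   implicit none
   integer(int32) :: big
   integer(int16) :: halves(2)

   big = -32769_int32                 ! bit pattern FFFF7FFF in two's complement
   halves = transfer(big, halves)     ! reinterpret the 4 bytes as two 2-byte integers

   ! Assumption: little-endian storage, so halves(1) holds the low-order 16 bits.
   ! Those bits are 7FFF, i.e. 32767, which is what a 2-byte dummy argument
   ! reading the start of the 4-byte constant would see.
   print *, 'first 2 bytes : ', halves(1)
   print *, 'second 2 bytes: ', halves(2)
end program low16
```

On a big-endian machine the two halves would be swapped, which is one reason the mismatch cannot be relied on across platforms.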
Regarding the other problem, i.e., the optimization bug, no guarantees can be given, and you should probably run tests. Build two EXEs, one with /Od and the other with your normal optimization level, run them on the same input data set, and compare the output. Make sure that the runs exercise code paths containing DO loops (and, perhaps, implied-DO lists) whose control variables are of INTEGER*2 type.
At least in the test case I used, making the loop index INTEGER(4) not only avoided the bug but resulted in better code. There's no reason to use small integer variables unless you specifically need the restricted range, as these are not optimized as well.
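As a rough sketch of that change, the routine below keeps 2-byte size arguments (standing in for Rows and Cols in the original sample, which is not reproduced in this thread) but declares the loop indices as default integers. The program and routine names (`loop_index_demo`, `fill`) and the array contents are made up for illustration, and there is no guarantee that this pattern avoids the optimizer bug in every routine.

```fortran
program loop_index_demo
   use, intrinsic :: iso_fortran_env, only: int16
   implicit none
   integer(int16), parameter :: rows = 12, cols = 1
   real :: array2d(rows, cols)

   call fill(array2d, rows, cols)
   print *, array2d(:, 1)
contains
   subroutine fill(a, nrows, ncols)
      ! The legacy 2-byte size arguments are left unchanged ...
      integer(int16), intent(in) :: nrows, ncols
      real, intent(out) :: a(nrows, ncols)
      ! ... but the loop and array indices use the default integer kind.
      integer :: i, j

      do j = 1, ncols
         do i = 1, nrows
            a(i, j) = real(i)
         end do
      end do
   end subroutine fill
end program loop_index_demo
```

The loop bounds are converted to the kind of the DO variable, so only the declarations of `i` and `j` need to change; the argument list and its callers stay as they are.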
An INTEGER(2) scalar will always be less efficient than a default integer on any platform produced in the last 20 years. Short integers may be useful for arrays, but non-vector operations on them will involve local 32-bit copies.
Great, thanks. Just one final thing, is there a web page that lists the known bugs in the latest compiler (such as the optimiser bug), so that users can be aware and work around them?
Sorry, we don't have such a page. Realistically, most of the bugs defy concise descriptions; this goes triply for optimizer bugs. Often there is a peculiar combination of things that triggers a bug, and I don't think such a list would be helpful. Sometimes, if there is an issue that is reported a lot, we'll create a web article on the specific issue.
I can't use /Od because it slows down my application too much, plus removing all the INTEGER*2 loop indexes from my code is going to take a long time.
So that I have an interim solution, is it possible to find out more about what is triggering the bug so that I can avoid it? I have many other examples of INTEGER*2 loop indexes in my application that don't seem to have a problem. What is so special about the use of the J variable in my sample code that triggers the bug and why doesn't the I loop index in my sample code also trigger it?
The questions you ask presume that there is a simple cause-and-effect relationship between your Fortran code and the optimizer bug. That is an incorrect presumption. In fact, you noted in #1 that the bug did not occur with the 32-bit compiler. Yet the similar code in my reproducer, referred to in #5, did exhibit the bug in 32 as well as 64 bits.
You can certainly use the older compiler if that can be used with optimization and without this specific bug. You can also use the new compiler, specifying /Od only for those subprograms that cause the bug to appear.
Your last sentence highlights my dilemma - I don't know which routines cause the bug to appear because I don't know what to look for, hence my previous post. I have fixed the few routines that seem to cause the bug but I can't be 100% sure that I have them all unless I know more about what triggers the bug.
I couldn't use the 2016 compiler because it had a bug in the Pardiso routine, which I depend on, so I think I am going to have to go back to the 2015 compiler.
To check whether the optimizer bug is active or not, you have to run test cases with known and correct results. Such results can be obtained by running with /Od applied to all your sources, and the slow speed of the EXE should not matter if the run still completes in, say, a few minutes. After all, you were able to pinpoint the bug location and provide a cut-down test case in #1, and you can use the same techniques.
Is there a reason why you cannot use the 2016 compiler to compile and link with the 2017 MKL library?
Is there a link to the Pardiso bug report that you can provide?