Array operations behave differently between Intel Fortran 2013 and 2019

netphilou31 · 04-17-2019

Hi,

I am currently testing the latest version of Intel Fortran (XE2019 update 3) to replace our old XE2013, and I've run into some different (not to say strange) behavior. Below are the two main issues:

Example #1:

In one source file I have a named common block that contains an array which is correctly dimensioned elsewhere, but in this source file it is simply declared as A(1) (it is the last variable of the common block). In the executable statements it is assigned to another array (a dummy argument that is explicitly declared, let's say daidn(nc)). The executable statement is daidn(:) = 2.D0*A(:). With IVF 2013, all nc elements of daidn() get a value based on A(1:nc), but with IVF 2019 only the first element of daidn is calculated, as if A really contained a single value!

Example #2:

In another source file I have two arrays: the first, dRatedN, is dimensioned with 200 elements; the second, dratedx, is dynamically allocated as (nc,nrc) (with nc < 200). When assigning dratedx(:,j) = dRatedN, it seems that all 200 elements of dRatedN are copied to dratedx, leading to an access violation because of a memory overflow. However, this works correctly with Intel Fortran 2013!

I understand that in example #2 the code is neither clear nor rigorous, that it can easily be improved, and that it leads to an access violation that can be investigated. The problem in example #1, however, is very difficult to identify (it took me several hours to track it down).

I am currently looking for other portions of code that may contain this kind of construct, but I would like to know whether these problems can be considered a compiler bug or not.

Thanks,

Phil.

FortranFan · 04-17-2019

netphilou31 wrote:
.. I would like to know if these problems can be considered as a compiler bug or not.

Your description suggests code that does not conform to the Fortran standard but which was supported by an earlier version of the Intel Fortran compiler and not by update 3. Intel support can best determine for you whether it is a compiler bug in terms of a non-standard feature that they intend to support. If you can share a reproducer, some readers here might be able to provide you some guidance. Intel support might also elaborate on any changes in the Intel Fortran compiler that impact the run-time behavior, and on whether any setting, such as a compiler option (e.g., /assume), can retain the earlier behavior.

jimdempseyatthecove · 04-17-2019

From your description in example #1, as far as the compiler knows, array A has a dimension of (1). You should use daidn(:) = 2.D0*A(1:nc)

*** .AND. keep array bounds run-time checks off, and cross your fingers that some other declaration of the same-named common block gives A at least nc elements.

For #2

Fortran 2003 2.4.5 7-7: The shape of an array is determined by its rank and its extent in each dimension
Fortran 2003 7.1.5 2-3: Two entities are in shape conformance if both are arrays of the same shape
Fortran 2003 7.4.1.2 20-21: Either variable shall be an allocatable array of the same rank as expr or the shapes of variable and expr shall conform

Consider using

dratedx(:,j) = dRatedN(1:size(dratedx, dim=1))

*** and hope that dRatedN always has at least that amount of data

It may be a good idea to use some asserts (conditionally compiled) to ensure that dimensions are what you expect them to be.
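A minimal sketch of such a conditionally compiled assert, assuming the source is run through the preprocessor (for example ifort /fpp /DCHECK_SHAPES, or gfortran -cpp -DCHECK_SHAPES). The macro name CHECK_SHAPES, the module shape_checks and the helper copy_column are hypothetical names used only for illustration; they do not come from the original code.

```fortran
! Sketch only: CHECK_SHAPES, shape_checks and copy_column are hypothetical names.
module shape_checks
   implicit none
contains
   subroutine copy_column(dratedx, j, dRatedN)
      double precision, intent(inout) :: dratedx(:,:)
      integer, intent(in)             :: j
      double precision, intent(in)    :: dRatedN(:)
      integer :: n

      n = size(dratedx, dim=1)
#ifdef CHECK_SHAPES
      ! Compiled in only when CHECK_SHAPES is defined at build time.
      if (size(dRatedN) < n) then
         write(*,*) 'copy_column: dRatedN has ', size(dRatedN), ' elements, need ', n
         error stop 1
      end if
#endif
      ! Explicit sections on both sides, so the shapes conform.
      dratedx(1:n, j) = dRatedN(1:n)
   end subroutine copy_column
end module shape_checks

program demo
   use shape_checks
   implicit none
   double precision :: dRatedN(200)            ! oversized work array
   double precision, allocatable :: dratedx(:,:)
   integer :: i

   dRatedN = [(dble(i), i = 1, 200)]
   allocate(dratedx(3, 2))
   dratedx = 0.0d0
   call copy_column(dratedx, 2, dRatedN)       ! copies only the first 3 elements
   write(*,'(3f8.1)') dratedx(:, 2)
end program demo
```

Without CHECK_SHAPES defined, the check costs nothing at run time; the explicit 1:n sections keep the assignment conforming either way.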

Jim Dempsey

netphilou31 · 04-17-2019

Hi,

Thank you for your comments. I fully agree that, especially in example #2, the code is neither properly written nor easy to understand (fortunately dRatedN is always oversized, as A is in example #1). However, example #1 is an illustration of a practice that I learned decades ago (when I was a student) mainly because array A belongs to a common block in which it is properly dimensioned in the main routine (I'm talking about the declaration of the size of A and not the array assignment). In all other procedures that use this array, it is always declared with dimension 1 and if the assignment is part of a do loop, it works correctly. I also agree that these are aged practices (the code was originally developed in the mid 80's), but why change a code that works? The use of array assignment as dratedx (:, j) = dRatedN (:) is part of recent developments (compared to the original code) or of practices we started to use recently (badly, for some of them), but it worked perfectly in IVF2013 (or maybe the compiler added some array size checks before performing the assignment, I don't know). This could also have led to buffer overflows in example #2, but it never produced a crash or an access violation message in IVF2013.

I will tell the other developers not to use these bad practices and will try to find other such problems in our source code before migrating to IVF2019 (we have a lot of files that are used for non-regression tests). Unfortunately, the writing is sometimes developed by partners (mainly university students) and it can be very time consuming to check every line of code before integration, especially when the code is thousands of lines long.

Thanks and best regards,

Phil.

mecej4 · 04-18-2019

Phil, I constructed a test code to model your description of Example 1 (code below). The code violates the Fortran standard, and several compilers are able to catch the violation (with suitable options for checking). A capable and omniscient compiler would reject the program. The Fortran standard does not require the compiler to detect this error; of course, it would never tell us what should happen if the compiler produces an EXE from the defective source code and you run that EXE. The argument -- that an old version of one compiler produced results that you like and other compilers should generate the same results -- is weak. Your code is broken; time to bite the bullet. The current version of IFort may help you with this task when used with /check:all -- see below.

We can set the standard aside and use the "consumer is right" argument. Did Intel document what the old compiler would do with this code? Do other users of Intel Fortran use and demand such behavior? I think not, and so I don't think that you can name this as a compiler bug at all.

program nphil
   implicit none
   integer i,ijk,ia(5),ja(5)
   common /np/ijk,ia
   !
   ijk=3
   do i = 1, 5
      ia(i) = i*i*2-ijk
   end do
   call sub(ja,5)
   write(*,'(5i12)')ja
end

subroutine sub(ka, nc)
   implicit none
   integer nc,ka(nc)
   integer i,ijk,ia(1)        ! '(1)' instead of '(5)', not consistent
   common /np/ijk,ia
   ka = ia
   return
end

When the main program and the subroutine are together in one file, some compilers reject the program. When the two are in separate files, no error detection is possible at compile time, and whether you see run time errors depends on the compiler and the options used.

IFort 16: -1 5 15 29 47

IFort 17, 18 and 19: -1 0 0 0 0 (default options)

IFort 19: forrtl: warning (406): fort: (33): Shape mismatch: The extent of dimension 1 of array KA is 5 and the corresponding extent of array IA is 1 (compiled with /check:all)

Lahey LF95 7.1:
   jwe0329i-s line 6 Two entities must be in shape conformance (ka,ia). (run time error, with checking)
   -1 0 0 0 0 (result, no checking)

NAG Fortran:
   Error: netphilou.f90: Inconsistent definitions of COMMON block NP in program-units NETPHILOU and SUB (source in one file)
   Runtime Error: nsub.f90, line 6: Rank 1 of IA has extent 1 instead of 5 (separate files, checking requested)
   -1 5 15 29 47 (separate files, no checking)

Gfortran 6.3:
   Warning: Named COMMON block 'np' at (1) shall be of the same size as elsewhere (24 vs 8 bytes) (single file)
   -13248 0 469232 6 0 (separate files, no checking)

netphilou31 · 04-18-2019

Hi mece,

Thanks a lot for taking the time to check my problems with other compilers. It's always nice to have a different point of view and to stay humble in the face of adversity, especially when your trust is shaken. As I can see from your trials, the compiler behavior varies from version to version, or from one vendor to another, when interpreting non-standard programming techniques. Normally, when assigning an array slice to another variable, I make sure the ranges are the same by writing the explicit ranges on both sides of the equals sign; however, I found this problem in a section of code that I did not write personally and that behaved differently between IVF2013 and IVF2019. Most of our legacy code contains simple do loops to copy array elements but, as I mentioned before, we have recently started using more advanced programming techniques like array assignments and, rather than making life easier, they can sometimes lead to nightmares when not used correctly (whereas a simple do loop would have worked perfectly).

I will try to pay more attention to this in our future developments, and I am going to track down this type of thing in the current code; it cannot hurt. I will also try to use the /check:all compiler option.

Thanks again and best regards,

Phil.

FortranFan · 04-18-2019

netphilou31 wrote:
.. example #1 is an illustration of a practice that I learned decades ago (when I was a student) mainly because array A belongs to a common block in which it is properly dimensioned in the main routine (I'm talking about the declaration of the size of A and not the array assignment). In all other procedures that use this array, it is always declared with dimension 1 and if the assignment is part of a do loop, it works correctly. I also agree that these are aged practices (the code was originally developed in the mid 80's), but why change a code that works? ..
.. Unfortunately, the writing is sometimes developed by partners (mainly university students) and it can be very time consuming to check every line of code before integration, especially when the code is thousands of lines long. ..

@Phil,

Would it be possible for you to explain the coding practice in Example #1 and its rationale? You write that it dates back to the 80s: was there some need back then - hardware/software - to do so? I've done some work with "FORTRAN" legacy code, starting with my thesis advisor's, that was developed in the 80s or 70s, but all of it THANKFULLY used INCLUDE files when it came to working with data in named COMMONs. You don't mention INCLUDE files: I assume you are aware of this facility, which is part of the Fortran standard - it would be very surprising if any Fortran compiler today did not support INCLUDEs - so is there a reason why you don't use it?

You ask, "why change a code that works?" Does this not apply to IVF 2013 as well since it works? Just curious as to why you want to change to IVF 2019, especially if you have non-standard code which you do not want to change?

Also, if you are developing code with multiple partners such as academia, why not utilize multiple compilers, especially NAG Fortran which is very good at this, to at least "ftnchek" the code you receive for errors and warnings?

C:\Temp>type np.cmn
   integer ijk, ia(5)
   common /np/ ijk, ia


C:\Temp>type sub.f90
subroutine sub(ka, nc)
   implicit none
   integer nc,ka(nc)
   include "np.cmn"
   ka = ia
   return
end

C:\Temp>type nphil.f90
program nphil
   implicit none
   include "np.cmn"
   integer i, ja(5)
   ijk=3
   do i = 1, 5
      ia(i) = i*i*2-ijk
   end do
   call sub(ja,5)
   write(*,'(5i12)')ja
   stop
end

C:\Temp>ifort /c /check:all /warn:all sub.f90
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64,
Version 19.0.3.203 Build 20190206
Copyright (C) 1985-2019 Intel Corporation.  All rights reserved.


C:\Temp>ifort /c /check:all /warn:all nphil.f90
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64,
Version 19.0.3.203 Build 20190206
Copyright (C) 1985-2019 Intel Corporation.  All rights reserved.


C:\Temp>link sub.obj nphil.obj /subsystem:console /out:nphil.exe
Microsoft (R) Incremental Linker Version 14.16.27027.1
Copyright (C) Microsoft Corporation.  All rights reserved.


C:\Temp>nphil.exe
          -1           5          15          29          47

C:\Temp>

jimdempseyatthecove · 04-18-2019

Phil,

>>The use of array assignment as dratedx (:, j) = dRatedN (:) is part of recent developments (compared to the original code)

Converting from: do i=1,nc; daidn(i) = 2.D0*A(i); end do to daidn(:) = 2.D0*A(:) requires you (the programmer), per the specification, to ensure that both sides of the = conform. The fact that a particular vendor's version worked left you "standing on thin ice", so to speak.

Converting old code (and style) to new (and improved) has to be done with great care. These old programmers typically knew what they were doing. Since the old code could not use allocatable arrays, the actual array sizes often had slack space (padding) to get some flexibility, and the code relied on the DO loop to reference the correct data. Changing the code to use (:) broke the intention (design) of the original programmer. The correct "modern" replacement is:

daidn(1:nc) = 2.D0*A(1:nc)
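A small self-contained sketch of the equivalence, assuming a padded array A of 200 elements of which only the first nc are in use. The program name and the sizes nmax and nc are made up for illustration, and A is declared here with its real size rather than through a common block.

```fortran
program explicit_bounds
   implicit none
   integer, parameter :: nmax = 200   ! padded storage, as in the legacy style
   integer, parameter :: nc   = 4     ! number of elements actually in use
   double precision :: A(nmax), daidn_loop(nc), daidn_sec(nc)
   integer :: i

   A = 0.0d0
   do i = 1, nc
      A(i) = dble(i)
   end do

   ! Legacy form: the DO loop bounds select the live data.
   do i = 1, nc
      daidn_loop(i) = 2.D0*A(i)
   end do

   ! Array-section form: both sides have extent nc, so the shapes conform.
   daidn_sec(1:nc) = 2.D0*A(1:nc)

   ! Prints T when both forms give the same result.
   write(*,*) all(daidn_loop == daidn_sec)
end program explicit_bounds
```

Writing daidn_sec(:) = 2.D0*A(:) here instead would pair an extent of nc with an extent of nmax, which is the shape mismatch discussed above.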

Jim Dempsey

netphilou31 · 04-18-2019

@FortranFan,

I do know that INCLUDE statements exist, but they were not used in the original developments made in the mid 80's. Of course, they are useful in that you do not have to rewrite the declaration lines each time and you can be sure that the dimensions are correct. In the original source code, arrays belonging to common blocks (as long as they were the last variable) were always correctly declared in only one source file, and the others simply declared them with dimension (1). It was the same for dummy argument declarations: even if an array was dimensioned to N in the calling routine, it was declared with a size of (1) in the called routine, because N was not always passed as an argument (N was passed through a common block). I don't want to go into the reasons that led to this, mainly because I don't know them, except that the skeleton of the code was automatically generated and not written by hand; but it was written that way and has worked for decades with no problems. This is also why using the /check:all compiler option is absolutely not possible: it would issue error messages all the time. To come back to INCLUDE statements, personally I really don't like this technique, especially when it comes to debugging, because you never see the code that is embedded in the include files. I prefer to use modules which, I agree, don't let you see the code either but are a more powerful feature. I am also trying to get rid of common blocks because of multithreading and thread-safety needs.

Finally, to answer your question about why we are switching to IVF2019 when I have code that runs correctly in IVF2013, the answer is quite simple: we have paid for several years for updated versions of the compiler that we never used. It is now probably time to use a more recent version of the compiler and also a better IDE, as Visual Studio 2010 is getting old. Intel Parallel Studio XE2019 probably also offers more powerful analysis tools than the 2013 version.
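For illustration, the small /np/ test case from earlier in the thread could be expressed with a module instead of a named common block roughly as follows. This is only a sketch: the module name np_data is hypothetical, and the array size 5 is taken from the test case, not from the production code.

```fortran
! Sketch only: np_data is a hypothetical module name.
module np_data
   implicit none
   integer :: ijk
   integer :: ia(5)          ! declared once, with its real size
end module np_data

subroutine sub(ka, nc)
   use np_data, only: ia
   implicit none
   integer nc, ka(nc)
   ka = ia(1:nc)             ! explicit section, so both sides have extent nc
   return
end subroutine sub

program nphil
   use np_data
   implicit none
   integer i, ja(5)
   ijk = 3
   do i = 1, 5
      ia(i) = i*i*2 - ijk
   end do
   call sub(ja, 5)
   write(*,'(5i12)') ja
end program nphil
```

Because every program unit gets the declaration of ia from the same module, the size cannot silently differ between units. Module variables are still shared global state, though, so this by itself does not make the code thread-safe.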

Best regards,

Phil.

FortranFan · 04-18-2019

netphilou31 wrote:
.. I do know that INCLUDE statements exist, but they were not used in the original developments made in the mid 80's.
.. about why we are switching to IVF2019 .. the answer is quite simple: we have paid for several years for updated versions of the compiler that we never used. It is now probably time to use a more recent version of the compiler and also a better IDE, as Visual Studio 2010 is getting old. Intel Parallel Studio XE2019 probably also offers more powerful analysis tools than the 2013 version. ..

Thanks for your detailed explanation.

Per my suggestion in the first post, you may want to submit a support request with Intel: https://supporttickets.intel.com/servicecenter?lang=en-US. They may be able to point you to some option to overcome the "compiler regression" if they accept it as such or suggest some workaround to minimize the impact with your code such as in Example #1 with IVF 2019, if they have something.

netphilou31 · 04-18-2019

Hi,

If the coding is not compliant with the Fortran standard, I am not sure that Intel support will treat this as a compiler regression, but I can try to explain my findings to them and see what they answer. I would also rather be sure that there is no compiler interpretation going on behind the scenes, and make sure that the code conforms to the standard and is right; this is also a guarantee of reproducibility whatever the version, the compiler, or the platform.

Thanks again

Phil.