Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

IFX

JohnNichols
Valued Contributor II
1,777 Views

Many years ago, with a lot of help from the good people on this site, one in particular,  we got a structural analysis package running based on code from two good sources. 

I was looking at a unique engineering problem in Europe that for the first time allows me to use the full range of the package, beams and plate elements in a close approximation to reality.  

I have a test file that I am slowly working on to set up a sound model. I run the model each time I add a few nodes so I can check the engineering data. It is very easy to make a mistake, and blasted hard to find it if you try everything at once.

I have been using IFORT, along with Pardiso and the eigensolver. I get the following answers:

Screenshot 2022-08-08 100243.png

If I change to IFX I get the following answers:

Screenshot 2022-08-08 100154.png

The difference is not exactly a factor of 2; the ratio is about 1.9 ± 0.2.

When I compile with IFX I get 

 

image_2022-08-08_102128129.png

I realize IFX is a work in progress. I have no idea how to find out what these errors mean in terms of line numbers, and I still have a day of work to set up the final model.

I am merely pointing out the problem.  

I have a TB or so of measured data, so I will shortly know the real measurements. 

John

41 Replies
jimdempseyatthecove
Black Belt
1,297 Views

John,

Those are "remarks", not "errors" (it says so on the printout, following the filename).

Apparently, your build configuration is specifying some diagnostic "check". 

Not sure which one(s) it is(are). Bounds, temporary, ???

 

Jim Dempsey

JohnNichols
Valued Contributor II
1,293 Views

It would be nice to be sure, or to get a slightly more informative message.

This is a cantilevered slab from a river edge, designed to hang from an arch --  LOL what is wrong with flat boring concrete bridges, they work.  

I am out to 28 metres and the results are closer.  

image_2022-08-08_110720526.png

 

 

 

Screenshot 2022-08-08 110250.png

 

 

jimdempseyatthecove
Black Belt
1,282 Views

Not sure what is causing the numerical differences between ifort and ifx.

You may need to insert some diagnostic prints to locate the source of the differences.

Hint:

Use fpp (Fortran PreProcessor).

Insert PRINT/WRITE statements to trace progress to a trace log (either redirect console output via Borr.exe > log.txt, or WRITE to a log file of your choice).

Annotate each log line with the fpp predefined macros __FILE__ and __LINE__ (include the step as well as the value you are tracing).

(You can omit __FILE__ if you trace from only one file.)

(You can omit __LINE__ if you trace from only one line in one file.)

 

You can then use WinDiff.exe or your favorite difference program (I prefer Beyond Compare).
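The tracing hint above can be sketched as a small helper module. This is only an illustration under stated assumptions: the names trace_mod, trace_text and trace_value are hypothetical and not part of Borr, and the call form in the closing comment assumes the source is built with fpp enabled so that __FILE__ and __LINE__ expand. The value is written with 17 significant digits so that a text diff of the ifort and ifx logs can show differences in the last bits of a double.

```fortran
! Hypothetical tracing helper; none of these names come from Borr.
module trace_mod
  use, intrinsic :: iso_fortran_env, only: real64, output_unit
  implicit none
  private
  public :: trace_text, trace_value
contains

  ! Build one log line of the form "file:line step value".
  ! The fixed buffer assumes file name plus step label stay well under 512 characters.
  function trace_text(file, line, step, val) result(text)
    character(len=*), intent(in) :: file, step
    integer,          intent(in) :: line
    real(real64),     intent(in) :: val
    character(len=:), allocatable :: text
    character(len=512) :: buffer

    write(buffer, '(a,":",i0,1x,a,1x,es24.16e3)') file, line, step, val
    text = trim(buffer)
  end function trace_text

  ! Write the log line to standard output (redirect it to a file to diff later).
  subroutine trace_value(file, line, step, val)
    character(len=*), intent(in) :: file, step
    integer,          intent(in) :: line
    real(real64),     intent(in) :: val

    write(output_unit, '(a)') trace_text(file, line, step, val)
  end subroutine trace_value

end module trace_mod

! Call site, assuming fpp is enabled and sk is double precision:
!   call trace_value(__FILE__, __LINE__, 'after saveSM', sk(1,1))
```

Running both builds with the same input and diffing the two logs should then point at the first traced value that differs.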

 

After you locate the point of divergence, you can use the debugger to break on the particular line and iteration of divergence. e.g.

Insert

  IF (I == 1234) THEN ! I is the loop control variable, 1234 is your iteration number of interest
     print *, "break here"
  ENDIF

 

Note, if you make the 1234 a module (aka global) variable, then you can edit it for a different set point.
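A minimal sketch of that module-variable idea follows. The module name debug_ctl, the variable break_iter and the function at_break are hypothetical names chosen for illustration only.

```fortran
! Hypothetical module; the names are illustrative and not from Borr.
module debug_ctl
  implicit none
  ! Iteration of interest. It is a variable rather than a PARAMETER,
  ! so it can be changed at run time without rebuilding.
  integer :: break_iter = 1234
contains

  ! True only on the iteration being investigated.
  logical function at_break(i)
    integer, intent(in) :: i
    at_break = (i == break_iter)
  end function at_break

end module debug_ctl

! Inside the loop being investigated:
!   if (at_break(i)) print *, "break here"   ! set the breakpoint on this line
```

Because break_iter is an ordinary module variable, it could also be read from the input file or the command line, so the set point can move as the divergence is narrowed down.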

 

Jim Dempsey

Barbara_P_Intel
Moderator
1,278 Views

What compiler options are you using? There are some differences in the optimizations performed between ifort and ifx.  See Porting Guide for ifort Users to ifx.

jimdempseyatthecove
Black Belt
1,277 Views

John,

I happened to notice that the ifort build is running ...\borr\Debug\Borr.exe   (32-bit build)

ifx is running ...\borr\x64\Debug\Borr.exe  (64-bit build)

Also see if one of the builds (ifort) is generating code for the x87 FPU as opposed to AVXnnn.

 

Jim Dempsey

JohnNichols
Valued Contributor II
1,228 Views

Jim et al.,

I want to run a model much larger than any I have run in this analysis package before this time, with combined plates and beams.  

I was wasting a minute looking at IFX, and I did not expect that big a difference. It is simply an observation, i.e. do not use IFX for anything except toys.

The blasted Pardiso is running into an overload with a matrix bigger than 144 by 144.  I have sworn to all the gods.  I need to get the model running as I am on a deadline and you can play with Pardiso for ages, so I am dragging out an old Conte and deBoor solver from 1975 and seeing if I can just slip it in for Pardiso for the moment.  

It is just fun and thank god for one's friends. 

John

Ron_Green
Moderator
1,271 Views

The warning message is about the -check options. Check those; IFX has not implemented all of the ifort -check options.

IFX vectorization is nowhere near that in ifort at this time. You can get closer to ifort by using /Qxhost on a genuine Intel processor.

The next update release will have no fewer than 479 major edits over the previous update to bring it closer to ifort in features, performance, and optimizations. IFX is a work in progress this year. With the v2023 release you'll see a much different IFX than you see today. Even with this next update you will see a step function in IFX features and performance.

JohnNichols
Valued Contributor II
1,226 Views

Thank god I am not writing IFX, those guys must be going nuts.  

 

Only an interesting human uses anything but an Intel Processor in my humble opinion. 

You do realize that most of the major silicon foundries are close to and in Taiwan ie Japan, South Korea and Southern China, like all the major ones.  

Who builds an empire on a bed of sand, quick sand?  

 

PS The Germans can teach you to build mission-critical stuff in a deep bunker.  

mecej4
Black Belt
1,197 Views

I have not seen the source code of the program and nothing that I have seen in this thread leads me to suspect that either Ifort or Ifx is responsible for the unacceptably large variations in the program output.

I suspect, on the basis of similar posts in this forum in connection with medium size old Fortran codes, that your current version of the program contains many common bugs such as

  • Uninitialized variables
  • Array bounds exceeded
  • Mismatched subprogram arguments
  • File accessed simultaneously with different unit numbers
  • Allocatable arrays being used prior to allocation
  • OCR errors
  • Variables that the old program assumed to have the SAVE attribute not being given the attribute in the current code

Without having access to the source files, and a test case with known results for verifying the correctness of the program, I think that any conclusions regarding the Ifx compiler are premature and unfair. You could use some other compiler, say, Gfortran, and the results could be something else. I am not pleased to be so skeptical, but experience guides me to be cautious about prematurely drawn conclusions.

Until the program is known to work correctly, appealing to your engineering sense (regarding bridge vibrations, etc.) to interpret or explain the wrong results is a waste of time. Physics does not know about undefined quantities that may have any value.

JohnNichols
Valued Contributor II
1,177 Views

Thank you for your comments.  

I was not going to supply the code until I had an alternative inversion program running. 

This is the structural analysis program from Harrison, into which you helped me incorporate Pardiso and the eigensolver in 2015. The code has been amended to add a lot of check features (same node numbers, etc.), but it is for all intents and purposes the program you made such significant corrections to. It works against Harrison's standard models. It works against the standard plate problems from the Colorado professor.

My thoughts are that there is a mistake in the program that needs to be found once I can solve a complete problem. At the moment, I cannot solve the complete problem as Pardiso stops at 144 nodes in a structural program.  Again that is likely a mistake in the code, but I need the answers for the bridge without caring about the solver.  

I also did not want to bother this group with this code until I had a good look at it, it is extensive.  

I am slowly working through the code to look for errors. There are always errors; the issue is finding them.

I was given a picture of a bridge, it is an unusual arch bridge with a hanging concrete slab.  One uses ULARC to do a quick simple plastic analysis to determine if the approximate member sizes measured from the picture are close. ULARC gives you the deflections and the moments - ULARC is from UCB - Powell.  ULARC is very fast to solve problems and gives you a safety factor, in this case, 2, which is ok.  This is 2 hours of work. 

One uses Borr, this program, to determine the deflections and the eigenvalues. This is 2 days' work.  I ran into the problem that Pardiso is throwing a -2 error on the 144 by 144 matrix. I need a 200 by 200 matrix. It would be nice to take a few days and look at the problem, I do not have a few days, so it is simpler to slip in an old solver, get the answers for the eigenvalues and check ULARC and BORR against each other.

One then sets up a model in a difficult commercial program, Strand 7 which requires significant time and is difficult to amend with its GUI, this is 4 days of work.  STRAND 7 will give me forces, deflections and eigenvalues.  Strand 7 gives answers. How close to reality they are, is indeterminate, we know structural models are poor reflectors of real data.  I have a lot of actual data I can then compare to the model, the point is to determine the element of the bridge that generates each eigenvalue. So at least two programs are used to check all elements of the model.  

The factor of about 2 between the IFX and IFORT results suggests I have an error in the code. I will find it, and I will likely ask Jim and you to kindly look at it, but I do not want to waste your time now, so I did not supply the code. I am sure you will find errors; you always do, and I appreciate the assistance. I have no doubt IFX will be a huge step forward, my minor issues will be resolved, and we can check the safety of the bridge.

The programs are tested against a standard bridge in the central part of the USA with known eigenvalues, we have 3 years of not-so continuous data, and we are trying to find the rate of change of E with time, at the moment the statistics suggest it is one part per 100 million per second, the issue is not the seconds in the day, the issue is the seconds in 50 years. 

There is only one place in the world to ask questions about Fortran code that will run in real-time on a NUC on a bridge to monitor the change, it is here.  This is without doubt the preeminent group.  I am so thankful you put up with me. 

John

 

Again, thank you for your comment.  

To provide a graph of time versus return for ULARC, BORR and STRAND7

image_2022-08-09_090120391.png

 

 

 

jimdempseyatthecove
Black Belt
1,179 Views

Or...

On the ifort (32-bit) build...

... you are using the x87 (FPU) options: Qpc32, Qpc64, Qpc80 (QIfist, Qrcd, Qrct, ?rounding-mode:chopped)

... you will have instructed the compiler to generate x87 FPU code as opposed to SSE, AVX, AVX2 or AVX512 code

Note, my presumption is ifx generates only SSE, AVX, AVX2 or AVX512 code and thus would ignore (?warn) the x87 FPU related options.

 

The x87 FPU can be set to use greater, lesser, or the same internal precision as the SIMD floating point system, but the deviations in your results appear to be much larger than these differences, though high iteration counts could generate significant divergences in results (when using different precisions).

 

Of particular note is to be cautious of square root result differences between x87 FPU and SIMD. The x87 FPU favors greater internal precision (80-bit FP/64-bit mantissa) trading off speed for accuracy, whereas the SIMD can generate much lesser precision (14, 23, or 28 bits) as well as having instructions that perform 1/sqrt(x) trading off accuracy for speed. 

 

This said, I agree with mecej4: "nothing that I have seen in this thread leads me to suspect that either Ifort or Ifx is responsible for the unacceptably large variations in the program output".

 

Jim Dempsey

 

Ron_Green
Moderator
1,144 Views
jimdempseyatthecove
Black Belt
1,163 Views

RE: I have no doubt IFX will be a huge step forward

The main purpose of IFX is to be able to integrate GPU offloading as provided with the newer OpenMP capability.

This includes the Intel Arc GPU and/or Xe-HP, Xe-HPC, Xe-HPG

(grudgingly nVidia and others will be supported too).

aka Data Parallel Programming support via SYCL API

Note, NUC CPU's internal graphics processor is capable of supporting SYCL (but it has a limited number of vector processors and limited vector width). You could possibly get a few more "cores" worth of CPU performance but with a significant amount of work. If (when) you migrate to the Xe series or possibly Arc series on desktop/workstation it might be worth the programming effort.

 

For Fortran on an Intel or AMD CPU, ifort is the better choice today.

 

RE:  at the moment the statistics suggest it is one part per 100 million per second, the issue is not the seconds in the day, the issue is the seconds in 50 years.

100,000,000 * 60 * 60 * 24 * 365 * 50 = 157,680,000,000,000,000 (18 decimal digits)

binary: 1000110000001100010001111111100100001011110000000000000000

58 bits of precision (not counting fractional bits)

Your longer-period accumulators may need to be REAL(16) so as not to lose significant bits.

Double precision has 53 significant bits (which integral and fractional bits share) or 15.95 decimal digits

Quad precision (aka REAL(16)) has 113 significant bits or 34.02 decimal digits
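The effect of running out of significant bits can be checked directly. The sketch below is an illustration only: the module and function names are hypothetical, and it assumes SSE/AVX arithmetic (not x87 extended precision) and a compiler that provides real128. Near 1.5768e17 the gap between adjacent double precision values is 32, so an increment of 1.0 vanishes in double precision but survives in quad precision.

```fortran
! Hypothetical demonstration module; not part of Borr.
module accum_demo
  use, intrinsic :: iso_fortran_env, only: real64, real128
  implicit none
contains

  ! True when adding increment leaves a double precision total unchanged.
  pure logical function absorbed_dp(total, increment)
    real(real64), intent(in) :: total, increment
    absorbed_dp = (total + increment == total)
  end function absorbed_dp

  ! Same test for a quad precision total.
  pure logical function absorbed_qp(total, increment)
    real(real128), intent(in) :: total, increment
    absorbed_qp = (total + increment == total)
  end function absorbed_qp

end module accum_demo

! Example calls:
!   absorbed_dp(1.5768e17_real64,  1.0_real64)    ! increment is lost
!   absorbed_qp(1.5768e17_real128, 1.0_real128)   ! increment is kept
```

The intrinsic SPACING(x) reports the same gap for any x, which is a quick way to see how small an increment a given accumulator can still register.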

 

Jim Dempsey

JohnNichols
Valued Contributor II
1,156 Views

I have absolutely no doubt that the error is my code, finding it is the interesting part, and asking questions is the first way to find out.  

The first critical question is - are there any current problems with ifx that might cause the errors? 

Based on the response the answer is no.  So one moves on.  But you always ask the null question, basic science.  

We can track down the problems once I have a standard problem that can be solved, checked against standard programs and we can compare it to real data.  

A theory without data is like Einstein's theory of relativity: until 1919 it was just a theory, and then Eddington's solar eclipse expedition collected the photographs that supported it.

 

jimdempseyatthecove
Black Belt
1,143 Views

>>... 58-bits precision ... double precision 53 bits precision

At 50 years, using DP, a month's worth of data added to an accumulator would not contribute any change to the accumulator.

At 25 years, using DP, two weeks' worth of data added to an accumulator would not contribute any change to the accumulator.

...

 

Also, try adding build option:

 

     /Qinit:arrays,huge

 

(arrays and scalars)

 

IOW you want the program to fail early (including integer variables). This should generate Infinities for FP/DP, and integers will overflow (to negative) or index way out of bounds. Both will, hopefully, abort the program at the earliest point (where you can trap it in debug mode).

Initializing to zero might hide the error, and snan has no effect on integers.

 

Jim Dempsey

 

JohnNichols
Valued Contributor II
1,131 Views

Jim:

Thanks for the thoughts.  

I enclose the borr.zip file.  

This program has beam and plate modules.  The beam module has been used a lot, the plate module has only been used a few times on very small test problems a long time ago. 

There are two data files. Probe.inp is the largest model I can get that will run and give me some answers.  I am not trusting these answers, it is merely a step in looking at the long-term development of a procedure, but the simplicity and quickness of this method means that the method is attractive in the long term.  

ProbF.inp causes all sorts of errors to pop up.  

The problem is a flat plate, 10 metres wide and 48 metres long, it is attached to a bank and sits over the river. At the moment, there are hangers that are assumed to be fixed in space to hold the slab.  Each strip of the bridge is broken into 3 triangles. 

ie long arch, flat slab below on steel wires.  

Interestingly, ProbF has shown an error in assembling the global stiffness matrix; this is likely a counting problem and fixable.

            write(*,*) a1, b1, a2, b2
            if (a1 .gt. N .or. b1 .gt. N .or. a2 .gt. N .or. b2 .gt. N) then
                write(*,*) 'matrix overflow'
            else
                sk(a1:b1,a2:b2) = sk(a1:b1,a2:b2) + asm(1:6,1:6,counter(h1))
            end if

 

In saveSM this traps the error, although I do not stop; I just do not add the local matrix.
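A guarded assembly step along these lines might look like the sketch below. It is not the saveSM code: the module and routine names are hypothetical, the bounds are taken from the actual array sizes rather than from N, and lower bounds are checked as well as upper bounds. The caller receives a flag and can decide whether to stop, which avoids silently carrying on with an incomplete stiffness matrix.

```fortran
! Hypothetical sketch of a guarded block add; not the Borr saveSM routine.
module assemble_demo
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
contains

  ! Add a local block into sk with its top-left corner at (a1, a2).
  ! ok is .false., and sk is left untouched, if any part of the block
  ! would fall outside sk.
  subroutine add_block(sk, blk, a1, a2, ok)
    real(real64), intent(inout) :: sk(:,:)
    real(real64), intent(in)    :: blk(:,:)
    integer,      intent(in)    :: a1, a2
    logical,      intent(out)   :: ok
    integer :: b1, b2

    ! Last row and column touched follow from the block size.
    b1 = a1 + size(blk, 1) - 1
    b2 = a2 + size(blk, 2) - 1

    ok = a1 >= 1 .and. a2 >= 1 .and. b1 <= size(sk, 1) .and. b2 <= size(sk, 2)
    if (ok) sk(a1:b1, a2:b2) = sk(a1:b1, a2:b2) + blk
  end subroutine add_block

end module assemble_demo

! Possible call site, assuming asm holds 6 by 6 local matrices:
!   call add_block(sk, asm(1:6,1:6,counter(h1)), a1, a2, ok)
!   if (.not. ok) error stop 'matrix overflow in assembly'
```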

 

But ProbF causes FEAST to report a Pardiso error while doing an analysis.

I use Pardiso to do the matrix inversion. From reading this morning, it appears that FEAST uses Pardiso as well.

Pardiso can handle problems bigger than 150 by 150, so I am at a loss as to the cause.  The program has a lot of Fortran-write statements. 

Thanks 

John

 

 

jimdempseyatthecove
Black Belt
1,120 Views

John,

On my NUC, with oneAPI Fortran 2022.1.0.256

I get mkl_intel_thread.2.dll was not found.

This is running your built Borr.exe

Must have bunged up path?

Researching...

If I try building, I get Error: The operation could not be completed.

Exiting MSVS then restarting...

Build succeeded, must have been 1st/bad run .exe still open, old MSVS issue

Still get mkl_intel_thread.2.dll was not found.

 

Checking paths

...

Jim Dempsey

 

jimdempseyatthecove
Black Belt
1,119 Views

Your configuration was for Debug | Win32

I don't have 32-bit mkl installed.

Changing to 64-bit build works (at least at asking for data file name)

Entering ProbF.inp abends

examining code...

Jim Dempsey

jimdempseyatthecove
Black Belt
1,115 Views

Added:

            if (a1 .gt. N .or. b1 .gt. N .or. a2 .gt. N .or. b2 .gt. N) then
                write(*,*) 'matrix overflow'
                write(*,*) "N=", N, a1, b1, a2, b2
            else
                sk(a1:b1,a2:b2) = sk(a1:b1,a2:b2) + asm(1:6,1:6,counter(h1))
            end if
See:
matrix overflow
 N=         156         127         132         157         162

a2 and b2 are out of range??

 

Jim Dempsey

jimdempseyatthecove
Black Belt
1,124 Views

John,

            count = 0
            count2 = 0
            do ina = 1, NA
                if(nodeT(ina) .gt. 0) then
                    count = count+1
                endif
            end do
            do ina = 1, NB
                if(nodeT(ina) .gt. 0) then
                    count2 = count2+1
                endif
            end do
            offset1 = NA - count
            offset2 = NB - count2

Why are count and count2 counting non-zero elements from the same array nodeT?
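For reference, the two counting loops can be written with the COUNT intrinsic, which makes it easier to see that both offsets are computed from leading sections of the same array and differ only through NA and NB. This is a behavior-preserving sketch and not a fix; the module and function names are hypothetical, and nodeT is assumed to be an integer array.

```fortran
! Hypothetical helper; assumes nodeT is an integer array.
module offset_demo
  implicit none
contains

  ! n minus the number of entries in nodeT(1:n) that are greater than zero,
  ! which is what the loops compute as offset1 (n = NA) and offset2 (n = NB).
  pure integer function offset_for(nodeT, n)
    integer, intent(in) :: nodeT(:)
    integer, intent(in) :: n
    offset_for = n - count(nodeT(1:n) > 0)
  end function offset_for

end module offset_demo

! Equivalent to the loops above:
!   offset1 = offset_for(nodeT, NA)
!   offset2 = offset_for(nodeT, NB)
```

Note that the original code uses a variable named count, which hides the intrinsic of the same name in that scope, so the variable would need renaming before the intrinsic could be used there.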

 

Note that the error condition occurs with a2:b2 indexing six columns (or rows, depending on your viewpoint) past the end of the sk array.

So, either the sk array is too small, or you are iterating one step too far.

 

I will await your analysis.

 

Jim Dempsey

 

 
