Many years ago, with a lot of help from the good people on this site, one in particular, we got a structural analysis package running based on code from two good sources.
I was looking at a unique engineering problem in Europe that for the first time allows me to use the full range of the package, beams and plate elements in a close approximation to reality.
I have a test file that I am slowly building into a sound model. I run the model each time I add a few nodes so I can check the engineering data. It is very easy to make a mistake, and blasted hard to find one if you try to do it all at once.
I have been using IFORT, along with Pardiso and the Eigensolver. I get the following answers,
If I make a change to IFX I get the following answers,
The difference is not exactly 2, but about 1.9 +- 0.2 in the ratio.
When I compile with IFX I get
I realize IFX is a work in progress. I have no idea how to find out what these errors mean in terms of line numbers, and I still have a day of work to set up the final model.
I am merely pointing out the problem.
I have a TB or so of measured data, so I will shortly know the real measurements.
Those are "remarks" not "errors" (it says so on the printout following the filename).
Apparently, your build configuration is specifying some diagnostic "check".
Not sure which one(s) it is(are). Bounds, temporary, ???
It would be nice to be sure or get a little bit better note.
This is a cantilevered slab from a river edge, designed to hang from an arch -- LOL what is wrong with flat boring concrete bridges, they work.
I am out to 28 metres and the results are closer.
Not sure what is causing the numerical differences between ifort and ifx.
You may need to insert some diagnostic prints to locate the source of the differences.
Use fpp (Fortran PreProcessor)
Insert PRINT/WRITE statements to trace progress to a log (either redirect console output via Borr.exe>log.txt, or WRITE to a log file of your choice)
Annotate log line with fpp directives __FILE__ and __LINE__ (include step as well as value you are tracing).
(you can omit the __FILE__ if you trace only from one file)
(you can omit __LINE__ if you only trace from one line in one file)
You can then use WinDiff.exe or your favorite difference program (I prefer Beyond Compare).
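The tracing steps above can be sketched roughly as follows. This is only an illustration: the program name, the loop, and the traced variable `x` are hypothetical stand-ins for whatever quantity in Borr you want to follow, and it assumes the file is preprocessed (for example a `.F90` file, or the `/fpp` option with ifort/ifx).

```fortran
! trace_demo.F90 : hypothetical example, must be compiled with preprocessing on
program trace_demo
  implicit none
  integer :: step
  double precision :: x

  x = 1.0d0
  do step = 1, 5
    x = x * 1.5d0
    ! __FILE__ and __LINE__ are expanded by the preprocessor before compilation.
    ! Full precision (ES24.16) is printed so that small differences show up in a diff.
    write(*,'(A,":",I0,"  step=",I0,"  x=",ES24.16)') __FILE__, __LINE__, step, x
  end do
end program trace_demo
```

Build it once with each compiler, redirect each run to its own log file, and compare the two logs. The first differing line tells you the file, line, and step where the builds part ways.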
After you locate the point of divergence, you can use the debugger to break on the particular line and iteration of divergence. e.g.
IF(I == 1234) then ! I== loop control var, 1234 is your iteration number of interest
print *,"break here" ! set the debugger breakpoint on this line
END IF
Note, if you make the 1234 a module (aka global) variable, then you can edit it for a different set point.
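A minimal sketch of that idea, assuming a hypothetical module `debug_ctl` holding the set point `break_iter` (neither name comes from Borr), and a dummy loop standing in for the real iteration:

```fortran
module debug_ctl
  implicit none
  ! Hypothetical set point. Being a module variable, it can be changed
  ! from the debugger watch window without rebuilding.
  integer :: break_iter = 1234
end module debug_ctl

program break_demo
  use debug_ctl
  implicit none
  integer :: i
  double precision :: s

  s = 0.0d0
  do i = 1, 2000
    s = s + 1.0d0 / i
    if (i == break_iter) then
      print *, "break here", i, s   ! place the breakpoint on this line
    end if
  end do
end program break_demo
```

The breakpoint then fires only on the iteration of interest instead of on every pass through the loop.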
I happen to notice that the ifort build is running ...\borr\Debug\Borr.exe (32-bit build)
ifx is running ...\borr\x64\Debug\Borr.exe (64-bit build)
Also see if one of the builds (ifort) is generating code for the FPU as opposed to the AVXnnn
Jim et al.,
I want to run a model much larger than any I have run in this analysis package before this time, with combined plates and beams.
I was wasting a minute looking at IFX, I did not expect that big a difference. It is simply an observation, ie do not use IFX for anything except toys.
The blasted Pardiso is running into an overload with a matrix bigger than 144 by 144. I have sworn to all the gods. I need to get the model running as I am on a deadline and you can play with Pardiso for ages, so I am dragging out an old Conte and deBoor solver from 1975 and seeing if I can just slip it in for Pardiso for the moment.
It is just fun and thank god for one's friends.
the warning message is about -check options. Check those, IFX hasn't implemented all of the ifort -check options.
IFX vectorization is nowhere near that in ifort at this time. You can get closer to ifort by using /Qxhost on a genuine Intel processor.
The next update release will have no less than 479 major edits over the previous update to bring it closer to ifort in features, performance, and optimizations. IFX is a work in progress this year. with the v2023 release you'll see a much different IFX than you see today. Even with this next update you will see a step function in IFX features and performance.
Thank god I am not writing IFX, those guys must be going nuts.
Only an interesting human uses anything but an Intel Processor in my humble opinion.
You do realize that most of the major silicon foundries are close to and in Taiwan ie Japan, South Korea and Southern China, like all the major ones.
Who builds an empire on a bed of sand, quick sand?
PS The Germans can teach you to build mission-critical stuff in a deep bunker.
I have not seen the source code of the program and nothing that I have seen in this thread leads me to suspect that either Ifort or Ifx is responsible for the unacceptably large variations in the program output.
I suspect, on the basis of similar posts in this forum in connection with medium size old Fortran codes, that your current version of the program contains many common bugs such as
- Uninitialized variables
- Array bounds exceeded
- Mismatched subprogram arguments
- File accessed simultaneously with different unit numbers
- Allocatable arrays being used prior to allocation
- OCR errors
- Variables that the old program assumed to have the SAVE attribute not being given the attribute in the current code
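As an illustration of the last item only (this is not code from the program under discussion), here is a sketch of a routine that relies on a local variable keeping its value between calls. The names `tally`, `running`, and `first` are made up for the example.

```fortran
program save_demo
  implicit none
  integer :: k

  do k = 1, 3
    print *, 'call', k, ' running total =', tally(2.0d0)
  end do

contains

  function tally(x) result(total)
    double precision, intent(in) :: x
    double precision :: total
    logical, save :: first = .true.
    ! Without SAVE, the value of running becomes undefined when the
    ! function returns. Many old compilers kept locals in static storage,
    ! so old code often worked by accident without the attribute.
    double precision, save :: running

    if (first) then
      running = 0.0d0
      first = .false.
    end if
    running = running + x
    total = running
  end function tally

end program save_demo
```

If the `save` on `running` is removed, the program may still appear to work with one compiler or option set and fail with another, which is the kind of behaviour that gets blamed on the compiler.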
Without having access to the source files, and a test case with known results for verifying the correctness of the program, I think that any conclusions regarding the Ifx compiler are premature and unfair. You could use some other compiler, say, Gfortran, and the results could be something else. I am not pleased to be so skeptical, but experience guides me to be cautious about prematurely drawn conclusions.
Until the program is known to work correctly, appealing to your engineering sense (regarding bridge vibrations, etc.) to interpret or explain the wrong results is a waste of time. Physics does not know about undefined quantities that may have any value.
Thank you for your comments.
I was not going to supply the code until I had an alternative inversion program running.
This is the structural analysis program from Harrison, that you helped me in 2015 to include Pardiso and the eigensolver. The code has been amended to add a lot of check features, same node numbers etc, but it is for all intents and purposes the program you made such significant corrections on. It works against Harrison's standard models. It works against the standard plate problems from the Colorado professor.
My thoughts are that there is a mistake in the program that needs to be found once I can solve a complete problem. At the moment, I cannot solve the complete problem as Pardiso stops at 144 nodes in a structural program. Again that is likely a mistake in the code, but I need the answers for the bridge without caring about the solver.
I also did not want to bother this group with this code until I had a good look at it, it is extensive.
I am slowly working through the code to look for all errors, there are always errors, the issue is finding them.
I was given a picture of a bridge, it is an unusual arch bridge with a hanging concrete slab. One uses ULARC to do a quick simple plastic analysis to determine if the approximate member sizes measured from the picture are close. ULARC gives you the deflections and the moments - ULARC is from UCB - Powell. ULARC is very fast to solve problems and gives you a safety factor, in this case, 2, which is ok. This is 2 hours of work.
One uses Borr, this program, to determine the deflections and the eigenvalues. This is 2 days' work. I ran into the problem that Pardiso is throwing a -2 error on the 144 by 144 matrix. I need a 200 by 200 matrix. It would be nice to take a few days and look at the problem, I do not have a few days, so it is simpler to slip in an old solver, get the answers for the eigenvalues and check ULARC and BORR against each other.
One then sets up a model in a difficult commercial program, Strand 7 which requires significant time and is difficult to amend with its GUI, this is 4 days of work. STRAND 7 will give me forces, deflections and eigenvalues. Strand 7 gives answers. How close to reality they are, is indeterminate, we know structural models are poor reflectors of real data. I have a lot of actual data I can then compare to the model, the point is to determine the element of the bridge that generates each eigenvalue. So at least two programs are used to check all elements of the model.
The difference of 2 in IFX and IFORT suggests I have an error in the code, I will find it, likely I will ask Jim and you kindly to look at it, but I do not want to waste your time now, so I did not supply the code. I am sure you will find errors, you always do and I appreciate the assistance. I have no doubt IFX will be a huge step forward, my minor issues will be resolved and we can check the safety of the bridge.
The programs are tested against a standard bridge in the central part of the USA with known eigenvalues, we have 3 years of not-so continuous data, and we are trying to find the rate of change of E with time, at the moment the statistics suggest it is one part per 100 million per second, the issue is not the seconds in the day, the issue is the seconds in 50 years.
There is only one place in the world to ask questions about Fortran code that will run in real-time on a NUC on a bridge to monitor the change, it is here. This is without doubt the preeminent group. I am so thankful you put up with me.
Again, thank you for your comment.
To provide a graph of time versus return for ULARC, BORR and STRAND7
On the ifort (32-bit) build...
... you are using the x87 (FPU) options: Qpc32, Qpc64, Qpc80 (QIfst, Qrcd, Qrct, ?rounding-mode:chopped)
... you will have instructed the compiler to generate x87 FPU code as opposed to SSE, AVX, AVX2 or AVX512 code
Note, my presumption is ifx generates only SSE, AVX, AVX2 or AVX512 code and thus would ignore (?warn) the x87 FPU related options.
The x87 FPU can be set to use greater, lesser, or the same internal precision as the SIMD floating point system, but the deviations in your results appear to be much larger than these differences, though high iteration counts could generate significant divergence in results (when using different precisions).
Of particular note is to be cautious of square root result differences between x87 FPU and SIMD. The x87 FPU favors greater internal precision (80-bit FP/64-bit mantissa) trading off speed for accuracy, whereas the SIMD can generate much lesser precision (14, 23, or 28 bits) as well as having instructions that perform 1/sqrt(x) trading off accuracy for speed.
This said, I agree with mecej4 "nothing that I have seen in this thread leads me to suspect that either Ifort or Ifx is responsible for the unacceptably large variations in the program output"
RE: I have no doubt IFX will be a huge step forward
The main purpose of IFX is to be able to integrate GPU offloading as provided with the newer OpenMP capability.
This includes the Intel Arc GPU and/or Xe-HP, Xe-HPC, Xe-HPG
(grudgingly nVidia and others will be supported too).
aka Data Parallel Programming support via SYCL API
Note, NUC CPU's internal graphics processor is capable of supporting SYCL (but it has a limited number of vector processors and limited vector width). You could possibly get a few more "cores" worth of CPU performance but with a significant amount of work. If (when) you migrate to the Xe series or possibly Arc series on desktop/workstation it might be worth the programming effort.
For Fortran on an Intel or AMD CPU, ifort is the better choice today.
RE: at the moment the statistics suggest it is one part per 100 million per second, the issue is not the seconds in the day, the issue is the seconds in 50 years.
100m * 60 * 60 * 24 * 365 * 50 = 157,680,000,000,000,000 (~18 decimal digits)
58-bits of precision (not counting fractional bits)
Your larger period accumulators may need to be REAL(16) such as to not lose significant bits.
Double precision has 53 significant bits (which integral and fractional bits share) or 15.95 decimal digits
Quad precision (aka REAL(16)) has 113 significant bits or 34.02 decimal digits
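The bit counts above can be checked with a short test. This is a sketch under assumptions: it uses the 50-year figure from the arithmetic above as the accumulator value, it assumes the compiler provides a `real128` kind (ifort, ifx, and gfortran on x86 do), and it assumes an SSE/AVX build, since x87 code may keep intermediates in 80-bit registers and behave differently.

```fortran
program accum_demo
  use, intrinsic :: iso_fortran_env, only: real64, real128
  implicit none
  real(real64)  :: acc64, sum64
  real(real128) :: acc128, sum128

  ! 100 million * seconds in 50 years, as computed above
  acc64  = 157680000000000000.0_real64
  acc128 = 157680000000000000.0_real128

  ! Add one unit to each accumulator and store the result
  sum64  = acc64  + 1.0_real64
  sum128 = acc128 + 1.0_real128

  print *, 'significant bits (real64, real128):', digits(acc64), digits(acc128)
  print *, 'gap between adjacent real64 values near acc:', spacing(acc64)
  print *, 'real64  lost the increment:', sum64  == acc64
  print *, 'real128 lost the increment:', sum128 == acc128
end program accum_demo
```

Near 1.5768e17 adjacent double precision values are 32 apart, so a unit increment is rounded away, while the quad precision accumulator still registers it.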
I have absolutely no doubt that the error is my code, finding it is the interesting part, and asking questions is the first way to find out.
The first critical question is - are there any current problems with ifx that might cause the errors?
Based on the response the answer is no. So one moves on. But you always ask the null question, basic science.
We can track down the problems once I have a standard problem that can be solved, checked against standard programs and we can compare it to real data.
A theory without data is like Einstein's relativity theory: until 1919 it was only a theory, and then Eddington's eclipse expedition collected the photographs that supported it.
>>... 58-bits precision ... double precision 53 bits precision
at 50 years, using DP, a month's worth of data added to an accumulator would not attribute any change to the accumulator.
at 25 years, using DP, two weeks worth of data added to an accumulator would not attribute any change to the accumulator.
Also, try adding build option:
(arrays and scalars)
IOW you want the program to fail early (including integer variables). This should generate Infinities for FP/DP, and integers will overflow (to negative) or index way out of bounds; both, hopefully, abort the program at the earliest point (where you can trap it in debug mode).
Initializing to zero might hide the error, snan has no effect on integers.
Thanks for the thoughts.
I enclose the borr.zip file.
This program has beam and plate modules. The beam module has been used a lot, the plate module has only been used a few times on very small test problems a long time ago.
There are two data files. Probe.inp is the largest model I can get that will run and give me some answers. I am not trusting these answers, it is merely a step in looking at the long-term development of a procedure, but the simplicity and quickness of this method means that the method is attractive in the long term.
ProbF.inp causes all sorts of errors to pop up.
The problem is a flat plate, 10 metres wide and 48 metres long, it is attached to a bank and sits over the river. At the moment, there are hangers that are assumed to be fixed in space to hold the slab. Each strip of the bridge is broken into 3 triangles.
ie long arch, flat slab below on steel wires.
ProbF, interestingly, has shown an error in assembling the global stiffness matrix; this is likely a counting problem and fixable.
      write(*,*)a1,b1,a2,b2
      if(a1 .gt. N .or. b1 .gt. N .or. a2 .gt. N .or. b2 .gt. N) then
          write(*,*) 'matrix overflow'
      else
          sk(a1:b1,a2:b2) = sk(a1:b1,a2:b2) + asm(1:6,1:6,counter(h1))
      end if
In saveSM this traps the error, although I do not stop, just do not add the local matrix.
But probF causes FEAST in doing an analysis to report a Pardiso error.
I use Pardiso to do the matrix inversion, it appears from reading this morning that Feast uses Pardiso as well.
Pardiso can handle problems bigger than 150 by 150, so I am at a loss as to the cause. The program has a lot of Fortran-write statements.
On my NUC, with oneAPI Fortran 2022.1.0.256
I get mkl_intel_thread.2.dll was not found.
This is running your built Borr.exe
Must have bunged up path?
If I try building, I get Error: The operation could not be completed.
Exiting MSVS then restarting...
Build succeeded, must have been 1st/bad run .exe still open, old MSVS issue
Still get mkl_intel_thread.2.dll was not found.
      if(a1 .gt. N .or. b1 .gt. N .or. a2 .gt. N .or. b2 .gt. N) then
          write(*,*) 'matrix overflow'
          write(*,*) "N=",N,a1,b1,a2,b2
      else
          sk(a1:b1,a2:b2) = sk(a1:b1,a2:b2) + asm(1:6,1:6,counter(h1))
      end if
See:
      matrix overflow
      N= 156 127 132 157 162
a2 and b2 are out of range??
      count = 0
      count2 = 0
      do ina = 1, NA
          if(nodeT(ina) .gt. 0) then
              count = count+1
          endif
      end do
      do ina = 1, NB
          if(nodeT(ina) .gt. 0) then
              count2 = count2+1
          endif
      end do
      offset1 = NA - count
      offset2 = NB - count2
Why are count and count2 counting non-zero elements from the same array nodeT?
Note that the error condition occurs with a2:b2 indexing six columns (or rows depending on your view point) past the end of the sk array.
So, either the sk array is too small, or you are iterating one step too far.
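While tracking that down, it may help to make the guard stricter and more informative. The following is a sketch only, not the saveSM routine from Borr: `add_block` and its argument list are hypothetical, the 6x6 block size and N=156 are taken from the output above, and the check uses the actual array extents and the lower bound as well as the upper one.

```fortran
program assemble_demo
  implicit none
  integer, parameter :: n = 156          ! matrix order reported in the output above
  double precision :: sk(n,n), blk(6,6)
  logical :: ok

  sk  = 0.0d0
  blk = 1.0d0
  call add_block(sk, blk, 127, 157, ok)  ! the failing indices from the output above
  if (.not. ok) print *, 'block rejected, global matrix left unchanged'

contains

  subroutine add_block(sk, blk, a1, a2, ok)
    double precision, intent(inout) :: sk(:,:)
    double precision, intent(in)    :: blk(6,6)
    integer, intent(in)  :: a1, a2       ! first row and first column of the block
    logical, intent(out) :: ok
    integer :: b1, b2

    b1 = a1 + 5
    b2 = a2 + 5
    ! Compare against the real extents of sk, on both ends of each range
    ok = a1 >= 1 .and. a2 >= 1 .and. b1 <= size(sk,1) .and. b2 <= size(sk,2)
    if (.not. ok) then
      write(*,'(A,I0,A,I0,A,4(1X,I0))') 'matrix overflow: sk is ', &
            size(sk,1), ' x ', size(sk,2), ', a1 b1 a2 b2 =', a1, b1, a2, b2
      return
    end if
    sk(a1:b1, a2:b2) = sk(a1:b1, a2:b2) + blk
  end subroutine add_block

end program assemble_demo
```

Skipping the block keeps the run alive but leaves the stiffness matrix incomplete, so any results from such a run should not be trusted. In a debug build it may be better to stop at the first rejected block.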
I will await your analysis.