Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Microsoft Visual Studio 2010 & Intel® Visual Fortran exe runs slower than Linux ifort exe does

Bakhbergen
Beginner
14,384 Views

In 2012, a former colleague gave me an executable and the Fortran source code for it. According to the makefile included with his project, he built the executable with ifort on Linux.

As a Windows user, I recently built an executable from the same code using Microsoft Visual Studio 2010 and Intel® Visual Fortran 2013. After fixing a few errors and warnings in Debug mode, I produced a Release (Win32) executable. I have found that my executable runs significantly slower than my colleague's.

I have lost contact with the colleague. Does anyone know what may be causing this? I would appreciate any help.

0 Points
41 Replies
Bakhbergen
Beginner
6,247 Views

All, please find attached Release BuildLog.htm files for both projects. I hope you find them informative.

0 Points
mecej4
Honored Contributor III
6,224 Views

Both build logs contain this alert, which may merit your attention:

...\subroutine_019.f(768): warning #6371: A jump into a block from outside the block may have occurred. [150]
IF (CUMPV.LT.WHPV) GO TO 150

 

 

0 Points
Bakhbergen
Beginner
6,211 Views

mecej4, I know. I have to go further with this warning for a while.

0 Points
JohnNichols
Valued Contributor III
6,140 Views

...\subroutine_019.f(768): warning #6371: A jump into a block from outside the block may have occurred. [150]

You can get this warning when a READ statement's error branch jumps to the error-handling line. These can be a challenge to eliminate at times.

0 Points
Bakhbergen
Beginner
6,247 Views

My apologies for the duplicate post.

0 Points
JohnNichols
Valued Contributor III
6,225 Views

You are chasing a furphy inside a zephyr. I run a single program on multiple computers across the world. We record the time of each loop at run time; a loop takes about 8 seconds, and the loop times follow a non-Gaussian distribution, which is often the problem with data generated by computers. I can explain if required. We have probably 10 million records, and if I normalize my results and yours, you are within what I would call expected limits.

Computers do a lot of things you do not see, and that affects run time. Two runs one after the other can vary a lot.

You are wasting your time. Get a better computer, a better compiler, and a modern version of Windows, say version 20211, and you will get better results.

Or a modern Linux, although we are having trouble with Linux and Ethernet issues, and you will have the same issues; we see this on NUCs, PIs, and Dells.

JMN

0 Points
Bakhbergen
Beginner
6,206 Views

JohnNichols, thank you for your opinion. The problem is that I am getting about one and a half times slower performance with a newer compiler, a newer version of Microsoft Visual Studio, etc. I am OK with an overall computation time of 2 minutes vs 3 minutes of CPU time. But how about 2 days vs 3 days?

0 Points
JohnNichols
Valued Contributor III
6,139 Views

Timing can be critical at 8 seconds. I suggest adding some write statements, writing to a log file, to see where the hell the slowdown is.
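
The idea of bracketing each phase with timing output can be sketched as follows. This is only an illustration in Python, not the poster's code: the names timed_phase and slowest_phase and the phase labels are hypothetical. In the Fortran program itself the same pattern would use the SYSTEM_CLOCK or CPU_TIME intrinsics around each phase and a WRITE to a log unit.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_phase(name, log_lines, clock=time.perf_counter):
    """Time the enclosed block and append one line to log_lines."""
    start = clock()
    try:
        yield
    finally:
        elapsed = clock() - start
        log_lines.append(f"{name}: {elapsed:.6f} s")


def slowest_phase(log_lines):
    """Return the name of the phase with the largest recorded time."""
    def seconds(line):
        # Each line looks like "<name>: <seconds> s".
        return float(line.rsplit(": ", 1)[1].split()[0])
    return max(log_lines, key=seconds).rsplit(": ", 1)[0]


if __name__ == "__main__":
    log = []
    # Hypothetical phases standing in for the real program's steps.
    with timed_phase("read_input", log):
        data = [i * 0.5 for i in range(100_000)]
    with timed_phase("main_loop", log):
        total = sum(x * x for x in data)
    # These lines are what would be written to the log file.
    for line in log:
        print(line)
    print("slowest phase:", slowest_phase(log))
```

Comparing such logs from the two builds, phase by phase, should narrow down where the extra time goes before any compiler options are changed.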

0 Points
Bernard
Valued Contributor I
6,155 Views

@JohnNichols wrote:

You are chasing a furphy inside a zephyr. I run a single program on multiple computers across the world. We record the time of each loop at run time; a loop takes about 8 seconds, and the loop times follow a non-Gaussian distribution, which is often the problem with data generated by computers. I can explain if required. We have probably 10 million records, and if I normalize my results and yours, you are within what I would call expected limits.

Computers do a lot of things you do not see, and that affects run time. Two runs one after the other can vary a lot.

You are wasting your time. Get a better computer, a better compiler, and a modern version of Windows, say version 20211, and you will get better results.

Or a modern Linux, although we are having trouble with Linux and Ethernet issues, and you will have the same issues; we see this on NUCs, PIs, and Dells.

JMN


Unfortunately (from the performance analyst's perspective), that is true. On a daily basis I measure the performance of our L1-PHY (5G physical (upper) layer) simulation, and I see huge variations in the results of the same test module gathered by VTune (perf collector and sep5.ko collector). The distributions are non-normal and, for 100 runs, usually multimodal.

The main contributor to those huge variations is the performance measurement process itself. The other factors are mainly OS-kernel generated (context switching, thread migration, periodic APIC timer activity, interrupt handling) and hardware-related (voltage ramp-up, frequency throttling, and other thermal events).


 

 

0 Points
JohnNichols
Valued Contributor III
6,134 Views

I use two NUCs, one with a Core i3-6100 and one with a Core i3-7100.

I have them installed in identical situations, running identical code every 8 seconds, never stopping.

The 7100 has run perfectly for years; the 6100 hangs the programs at odd intervals, from days to weeks. Even slight differences can expose problems, as I am finding while I try to solve the conflict that stuffs up the 6100 but not the 7100.

If I could replace the 6100 I would, but it means a 3-day drive.

Good hunting with your problem - I understand the frustration. 

0 Points
Bakhbergen
Beginner
6,126 Views

Thank you everyone for contributing to the discussion on this topic. Your answers and comments are very informative and helpful.

0 Points
Bernard
Valued Contributor I
6,104 Views

The 7100 has run perfectly for years, the 6100 hangs the programs at odd intervals from days to weeks.

What type of hang is it? Does it require a reboot, or is it a process (your exe container) hang?

0 Points
JohnNichols
Valued Contributor III
6,089 Views

The exe container is a watcher program. The main program is multithreaded, connected to a MySQL database in the cloud, and reads data from a source. It crashes from time to time on some machines, which is the reason for the standard watcher program. The problem with the 6100 is that, from time to time, the watcher does not restart the main program. I now have a machine that does this about every 24 hours, so I can start to look at the issue. Before, it might last a month, and a monthly problem is very hard to debug.

The real problem is that the closer you get to NASA TRL 9, the fewer mistakes you are allowed, and really, if it makes a mistake it has to be self-correcting. The 7100 has run for years; the only thing that stops it is a power outage, and it can recover from that. It does exactly the same thing as the 6100 and the setup is identical. I only use Samsung SSDs, and I only use NUCs.

I checked the temperatures, and they are all within normal range, but I cannot replace the 6100 without a 3-day drive, and man, I do not want to do that.

 

0 Points
Bernard
Valued Contributor I
6,080 Views

the main program is multithreaded - connected to a mysql database in the cloud and reading data from a source - it crashes from time to time on some machines, which is the reason for the standard watcher program.

So this is a main program (process) hang. You can configure the OS (I presume you are using Windows) to collect a minidump file of the failed process and try to analyze the root cause in WinDbg. Of course, it may be very hard to find the culprit, but at least it gives you a way to investigate in a little more depth.

0 Points
JohnNichols
Valued Contributor III
6,064 Views

No, the main process stops completely, but the watcher in this instance does not restart it. I have got it in a debug loop.

0 Points
jimdempseyatthecove
Honored Contributor III
6,040 Views

>>the main process stops completely - but the watcher in this instance does not restart it

Apparently your watcher needs debugging. Can you perform an attach to process from the debugger?

I've experienced instances in a multi-threaded program where the threads are oversubscribed .AND. where one or more threads depend on a different thread in order to make progress. The programmer in this case (me) tried to address this proactively by inserting thread yield calls in the wait-for-other-thread-to-complete loops. The problem with this, as I discovered after lengthy investigation, is that the thread yield (on Windows) appears to yield to threads that were preempted and not to threads that may have been pending on I/O. IOW the (formerly) pending-on-I/O threads would not resume while a full subscription of threads was spinning in thread yield loops.

The corrective measure was to use Sleep/SleepQQ for 0 ms.

Jim Dempsey

0 Points
JohnNichols
Valued Contributor III
6,018 Views

I started the watcher program inside VS in debug mode and the blasted thing has run for 2 days -- I wonder why this helps?

 

0 Points
jimdempseyatthecove
Honored Contributor III
6,006 Views

Start the watcher program normally (i.e. .NOT. from VS)

Upon hang, then start MS VS | Debug | Attach to Process | Pick Watcher Process | Break

You may need to compile the watcher with Full Debugging (IOW Release build with debugging)
.AND. have the Linker .NOT. remove debug symbols.

Doing the above, you have the watcher executable .AND. runtime environment exactly as they were in the hanging environment.

Jim Dempsey

0 Points
JohnNichols
Valued Contributor III
5,810 Views

Jim:

Thanks, I have used the attach to process, but it did not give me anything. I will make your modifications. The NUCs are great computers; Intel scored a home run there.

JMN

0 Points
Bernard
Valued Contributor I
5,799 Views

In addition to Jim's advice, you may also experiment with WinDbg, which is more powerful than the VS debugger and has many built-in meta-commands for your debugging convenience.

In your specific case you may use WinDbg and issue the !runaway 3 command (strictly speaking, an extension) to find the threads that consume the most kernel-mode and user-mode time.

Here is the detailed description

https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/tracking-down-a-processor-hog

 

0 Points
Bernard
Valued Contributor I
6,167 Views

You may try to assess the performance delta between those two executables by using Intel VTune Profiler. The hotspots analysis should suffice for now. It is very hard to know the real root cause of the aforementioned performance delta just by looking at an absolute time figure. As @JohnNichols said, the distribution might be, and usually is, non-normal (in my measurements, either log-normal or multimodal), so I would suggest running at least 10 profiling sessions for each executable (be aware that this is not enough!) and analyzing the results.
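
Because the run-time distributions are described as non-normal, a comparison of repeated runs is probably better based on medians and quartiles than on a mean and standard deviation. The following is a minimal sketch of that idea in Python, not a VTune workflow. The function names and the sample timings are made up for illustration; the real inputs would be the measured times of each executable.

```python
import statistics


def summarize(times):
    """Summarize run times (seconds) with order statistics."""
    ordered = sorted(times)
    q1, _, q3 = statistics.quantiles(ordered, n=4)  # needs at least 2 runs
    return {
        "n": len(ordered),
        "min": ordered[0],
        "q1": q1,
        "median": statistics.median(ordered),
        "q3": q3,
        "max": ordered[-1],
    }


def median_ratio(times_a, times_b):
    """Ratio of medians; above 1.0 means A is typically slower than B."""
    return statistics.median(times_a) / statistics.median(times_b)


def iqr_overlap(times_a, times_b):
    """True if the interquartile ranges overlap (difference may be noise)."""
    a, b = summarize(times_a), summarize(times_b)
    return a["q1"] <= b["q3"] and b["q1"] <= a["q3"]


if __name__ == "__main__":
    # Made-up wall-clock times in seconds, for illustration only.
    exe_a = [180, 182, 179, 185, 181, 183, 240, 184, 180, 182]
    exe_b = [120, 121, 119, 125, 122, 120, 160, 121, 123, 120]
    print("A:", summarize(exe_a))
    print("B:", summarize(exe_b))
    print("median ratio A/B:", round(median_ratio(exe_a, exe_b), 3))
    print("IQR overlap:", iqr_overlap(exe_a, exe_b))
```

If the interquartile ranges of the two executables do not overlap across repeated runs, the slowdown is unlikely to be measurement noise alone; if they do overlap, more runs are needed before drawing a conclusion.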

 

0 Points