Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Memory issue with OpenMP

Thomas_B_3
Beginner
898 Views
Hello,

I ran into what is probably a memory issue in an OpenMP-parallelized code section. In the code I have an array of objects of class1, where each instance of class1 contains an array of objects of class2. Both are declared as arrays of constant size in the declarations of the respective classes, not allocated via new/delete. After increasing the size of one of the arrays, the program delivered wrong results. After decreasing the size, the program worked fine again. A second workaround is to switch off OpenMP support in the compiler options; just commenting out the OpenMP pragma is not sufficient. I tried kmp_set_stacksize_s(4000) and kmp_set_stacksize_s(16000), but now a different parallelized section even crashes.
System:
Xeon 5520
Win XP x64
Visual Studio 2008
Intel Compiler 11.0.066 (32bit!!)

On a different computer the same code compiles and runs perfectly.
Second system:
C2D 6400
Win XP 32bit
Visual Studio 2003
Intel Compiler 10.1.xyz

Does anybody have an idea how to solve this issue? Reducing the array size is only a temporary solution for me.

Thanks and best regards,
Tom
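
Editorial sketch: the post does not show the class declarations, so the following is only an illustration of how fixed-size member arrays inflate the size of every object, and how a heap-backed member keeps the object itself small. The names Class1Fixed, Class1Heap, kInner and kOuter and all sizes are hypothetical. If objects like these end up as locals or private copies inside a parallel region, they count against the thread stack. As far as the Intel documentation describes it, kmp_set_stacksize_s takes its argument in bytes, so values such as 4000 or 16000 would request a very small stack; treat that reading as an assumption to verify against your compiler version.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the poster's class1/class2; all sizes are invented.
const std::size_t kInner = 1024;  // class2 objects per class1 object
const std::size_t kOuter = 256;   // class1 objects in the outer array

struct Class2 {
    double values[4];
};

// Fixed-size member array: every object carries all of its elements inline,
// wherever the object itself lives (stack, static storage, or heap).
struct Class1Fixed {
    Class2 items[kInner];
};

// Heap-backed alternative: the object holds only the vector's bookkeeping,
// and the elements themselves are allocated on the heap.
struct Class1Heap {
    std::vector<Class2> items;
    Class1Heap() : items(kInner) {}
};

// Bytes occupied by a plain array of kOuter fixed-size objects.
std::size_t fixed_footprint_bytes() {
    return sizeof(Class1Fixed) * kOuter;
}

// Bytes occupied inline by kOuter heap-backed objects (heap data excluded).
std::size_t heap_inline_bytes() {
    return sizeof(Class1Heap) * kOuter;
}
```

Comparing fixed_footprint_bytes() with the stack size available to each thread gives a quick sanity check before changing any runtime settings.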
0 Kudos
15 Replies
aazue
New Contributor I
898 Views
Hi,

"On a different computer the same code compiles and runs perfectly"

It is difficult to say where the problem lies until you have tested both the old and the new compiler on just one of the two computers. That should be the first test.

It could also be that a compiler flag used with the new icc does not suit the Xeon 5520. Please post your complete command line; people here who know Microsoft operating systems better than I do, and who have the same machine, can probably confirm whether the options are appropriate and perhaps suggest better ones.

I think OpenMP is only a factor that brings out a problem that already exists. It could also be a leak that shows up with the latest compiler version (a newer compiler is more likely to expose leaks than an older one). Try tracing all the table values while increasing the size step by step.

Kind regards

0 Kudos
Om_S_Intel
Employee
898 Views

It would be nice if you could post a sample code that we can compile and investigate.
0 Kudos
TimP
Honored Contributor III
898 Views
Sorry, need a delete button.
0 Kudos
jimdempseyatthecove
Honored Contributor III
898 Views

Tom,

Does running with all the runtime checks enabled expose any errors?

When running on the Xeon system, try forcing the number of threads to 2 (i.e. to that used on your C2D). If that works, then you may have a coding error relating to temporal issues. This error is present in both environments, but yields incorrect results only when the thread count is greater than 2.

Jim Dempsey
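
Editorial sketch of the two-thread experiment suggested above, assuming nothing about the poster's code: the helper name team_size_with_limit is hypothetical. It limits one parallel region with the standard num_threads clause and returns the team size the runtime actually delivered. Setting the standard environment variable OMP_NUM_THREADS=2 before launching the program is an alternative that needs no code change.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Runs one parallel region limited to `requested` threads and returns the
// team size the runtime actually provided (1 when OpenMP is disabled).
int team_size_with_limit(int requested) {
    (void)requested;  // silences "unused" warnings when OpenMP is off
    int observed = 1;
    #pragma omp parallel num_threads(requested)
    {
#ifdef _OPENMP
        // One thread records the team size; `observed` is shared.
        #pragma omp single
        observed = omp_get_num_threads();
#endif
    }
    return observed;
}
```

The runtime may deliver fewer threads than requested, so it is worth checking the returned value instead of assuming the limit took effect.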
0 Kudos
Thomas_B_3
Beginner
898 Views
Hello,

I have good or bad news, depending on your point of view. After re-installation of the compiler I am unable to reproduce the problem, no matter what I try.

Om Sachan: I was "afraid" of that advice ;-). Of course you are right. But the code amounts to about 3M together with the open source libraries that are used in the code section causing the problem. I was hoping that somebody had encountered similar behaviour in a program.

Jim: Basic runtime checks are set to default. I would have tried your recommendation (see above) regarding the number of threads. But in the code section delivering the wrong results only two threads are running.

Thank you all for your comments and suggestions.

Tom
0 Kudos
jimdempseyatthecove
Honored Contributor III
898 Views

On one of the other forum threads, a user reported a problem when linking in the static OpenMP libraries together with another library (e.g. MKL) that used the dynamic OpenMP libraries. This results in two instances of OpenMP within a single app, which does not work well. Was your problem related to this?

Jim
0 Kudos
aazue
New Contributor I
898 Views

It would be nice if you could post a sample code that we can compile and investigate.

Hi,
I do not think that is necessary: if the program runs well with small sizes, it must also run with large ones.

The compiler has warning messages to prevent faults such as exceeding memory size, among others.

If the reinstallation now gives correct results, then you had a leak with the previous installation. Leaks are not random: if the results are correct now, you can keep using the working binary without problems.

"Leak" is a word used loosely for several kinds of unsolved problems. A leak does not trigger a compiler warning, and the point where things go wrong can be hidden by OpenMP's parallel execution. You get a wrong result, not always a segmentation fault.

Probably your problem is here:
"They are declared as arrays with constant size in the declaration of the respective classes and not via new/delete.
After increasing the size of one of the arrays, the program delivered wrong results."

Increasing all the arrays might solve the problem, and would confirm that some part of the (constant-size) memory is not reserved correctly.

With MPICH2 it is easier: you can trace the replies returned by the socket functions at every asynchronous point.

In my first answer I wrote that the latest compiler version increases the probability of leaks. That is not because the latest compiler is wrong; it is just that the source must now be written correctly, with no tolerance for errors that were accepted before. It is exactly the same with GCC.
Reserving more memory for an object than you actually use can work around a leak (it forces the memory to move).
Leaks are probably among the hardest problems, and maddeningly they sometimes bear no relation to program size.
To avoid leaks, quality-control engineers also advise never mixing older C style with newer C++. That is easy to say but difficult in practice, and it is not always true either: code written purely in C++ sometimes leaks too.
Compiler flags that do not suit the processor can also favour leaks; the effect is not merely that results fail to improve or stay the same with or without them.

Run an experiment with inappropriate -march= and -mtune= values and you will easily see the problem.

One aspect I have never really studied is the impact of running a 32-bit operating system and compiler on 64-bit processor(s). There are probably some differences in behaviour compared with 32-bit on 32-bit.
It is sometimes difficult to understand or study every aspect of what you use. It would be good if users with experience shared information about leak problems.

As for duplicated, conflicting libraries, I do not know about Microsoft operating systems, but on Linux this problem cannot really occur (either the build stops first or the program works without problems).

Kind regards

0 Kudos
Thomas_B_3
Beginner
898 Views
Jim: Are you referring to an article in the Intel knowledge base dated 07/07/2009? I did not get any OpenMP error messages. The program finished without any crashes; some of the calculated results were just obviously wrong.

I also followed the recommendations in the MKL link advisor article.

Bustaf: Maybe one of us is misunderstanding a little bit: the sizes of the arrays were not changed dynamically by the program. I changed the size in the source code and re-compiled it. The creation of the arrays is located in the serial (non-threaded) section of the code.

Best regards, Tom

0 Kudos
TimP
Honored Contributor III
898 Views
I agree with Jim about OpenMP library problems remaining a prime suspect.
Unfortunately, there aren't necessarily clear error messages when multiple OpenMP libraries are linked. For example, when there are references to the VS library vcomp, in addition to linking the dynamic libiomp5, the vcomp linking must be suppressed:
-defaultlib:libiomp5md.lib
-nodefaultlib:vcomp.lib
-nodefaultlib:vcompd.lib
libiomp5 handles all vcomp calls correctly, if they aren't intercepted by vcomp. There are various ways in which a vcomp reference may creep in when there are objects built by MSVC, even without /openmp.
If you linked one of the libiomp or libguide libraries statically in a dll, then later linked again against one of those libraries, you should get a run-time warning, but I don't recommend you count on it. The safest way is to ensure that only a single libiomp5 dynamic library is linked anywhere. If you wish a static link, you must ensure that libiomp5 is linked only once, in the final link step.

0 Kudos
jimdempseyatthecove
Honored Contributor III
898 Views
Quoting - TJ04
Jim: Are you referring to an article in the Intel knowledge base dated 07/07/2009? I did not get any OpenMP error messages. The program finished without any crashes; some of the calculated results were just obviously wrong.



No, there was a post where link times got very long - several minutes.

Jim
0 Kudos
jimdempseyatthecove
Honored Contributor III
898 Views

Tim, Tom

Tom can check the number of threads created after the 1st call to a parallel region, but hopefully before the 1st call to the suspected library. If OpenMP nesting is off, then the number of threads after the 1st parallel region but before the 1st call to the suspected library should be the same as the number of threads after the 1st call to the suspected library. If you see a bunch of threads created, then it is likely you have a static + dynamic library problem. I do not know if it is possible to construct a situation where you have DLL V1.2.3.4 and DLL V1.4.3.2 (two different versions of the DLL library). The thread count procedure will help to identify the situation.

Note, Tom can add a parallel region just after main that simply prints out omp_get_thread_num(). After that region runs, the default main OpenMP thread pool will have been established. As his program runs deeper into his code (and with nested levels disabled), if there is a change in thread count (less any threads explicitly created), then suspect multiple libraries. Knowing this, Tom can then try to sort out what is causing this to occur and then address/fix the problem.

Jim Dempsey
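
Editorial sketch of the "parallel region just after main" idea, with a hypothetical helper name, team_thread_ids. It collects omp_get_thread_num() from every thread of one parallel region so the size of the default team can be recorded early in the program. Note the assumption: this only reports the team as seen by the OpenMP runtime that this translation unit is linked against. The process-wide thread count that the procedure above compares still has to be read from outside, for example in the debugger's thread list.

```cpp
#include <algorithm>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Returns the sorted IDs of the threads that took part in one parallel
// region; without OpenMP the result is the single ID 0.
std::vector<int> team_thread_ids() {
    std::vector<int> ids;
    #pragma omp parallel
    {
        int id = 0;
#ifdef _OPENMP
        id = omp_get_thread_num();
#endif
        // The vector is shared, so appends must be serialized.
        #pragma omp critical
        ids.push_back(id);
    }
    std::sort(ids.begin(), ids.end());
    return ids;
}
```

Calling this once at the start of main and printing the number of IDs gives the baseline team size to compare against later observations.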
0 Kudos
aazue
New Contributor I
898 Views
Quoting - TJ04
Bustaf: Maybe one of us is mis-understanding a little bit: The sizes of the arrays were not changed dynamically by the program. I changed the size in the source code and re-compiled it. The creation of the arrays is located in the serial (non-threaded) section of the code.

Best regards, Tom

Hi Tom,
I understood perfectly that you made the change in the source.
Two months ago I encountered exactly the same problem as yours, in a function of a program that required several (fixed-size) arrays of IPv4 addresses.
The arrays, used for different channels, did not need the same size because the address classes differ.
At first I quickly wrote it with all arrays declared the same size, and everything worked fine, with each process running in its own thread (OpenMP), which greatly improved the task.
Later I wanted to adjust the array sizes to the potential number of addresses in each class (B and C), and the program showed wrong results.
Tracing the array elements sometimes showed blanks, or addresses with extra (extended-type?) characters appended. Two friends, also programmers, rewrote the function in different forms and got exactly the same problem: all arrays had to be the same size. (Curiously, the OpenMP library works perfectly on other, much harder functions in the same program.)
I moved the program to another, identical computer (only the processor is somewhat slower), and the program runs well with arrays of different, appropriate sizes.
Once more I must note that my command of English also has some leaks, or maybe a library conflict in the literature...
About the libraries (on Linux): if it were a library problem, I think the problem would persist on the second computer, which is configured exactly the same.

I don't know; here, at least, three qualified engineers have confirmed that the problem is a leak...
Kind regards

0 Kudos
jimdempseyatthecove
Honored Contributor III
898 Views

When you have problems that appear or disappear as you vary the number of threads and/or the processor speed, you may be observing problems related to:

Shared variables being updated without atomic or critical sections.

Loops that reference array elements offset from the loop control variable (e.g. Array[i-1] = ... or Array[i+1] = ...). These statements may require specific sequencing of operations among threads and/or require atomic/critical section protection at the boundary of the array slice between threads.

First look for errors within your program. Libraries have a very small probability of error compared with user programs, which may have run without error until a revision exposed a problem such as a race condition.

Jim Dempsey
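
Editorial sketch of the two hazard classes listed above, using hypothetical functions that are not taken from the poster's code. sum_with_reduction replaces an unprotected shared update with a reduction clause. smooth shows a related variant of the neighbour problem: it reads in[i-1] and in[i+1] but writes into a separate output buffer, so no iteration depends on an element another thread may already have modified. Loops that write to offset elements, as in the Array[i-1] = ... example, may still need explicit ordering or protection at slice boundaries.

```cpp
#include <vector>

// Sum with a reduction clause: each thread accumulates privately and the
// runtime combines the partial sums, so there is no unprotected shared update.
long sum_with_reduction(const std::vector<int>& data) {
    long total = 0;
    const int n = static_cast<int>(data.size());
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Three-point average written into a separate buffer: every iteration reads
// only the untouched input, so the result does not depend on thread timing.
std::vector<double> smooth(const std::vector<double>& in) {
    std::vector<double> out(in);  // endpoints are copied unchanged
    const int n = static_cast<int>(in.size());
    #pragma omp parallel for
    for (int i = 1; i < n - 1; ++i) {
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
    }
    return out;
}
```

Both functions give the same result with OpenMP enabled or disabled, which makes them convenient reference versions when hunting for a race in a similar loop.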
0 Kudos
Thomas_B_3
Beginner
898 Views
Jim: Can you recommend a small and easy-to-use tool to see how many threads are running? Until now I have been using the task manager to see the CPU load. The section of the code runs for at least a minute, so this rough tool provides a fairly good impression of how much work is being done.

All libs are linked statically.

Tim: The two additional open source libraries are both compiled with the Intel compiler on my computer. So vcomp from MSVC should not be involved. libiomp5 is indeed linked in one of these libraries as well. I will take care of it. But I am really surprised that the program was running without any problems for so many months.

Bustaf: I will modify the code to avoid mixing of C and C++.

Best regards, Tom
0 Kudos
aazue
New Contributor I
898 Views

Hi,

Thank you, Jim, for your explanations.
I have passed them on to my friends who shared the problem.
The program has now been running for two months without problems.
It was tested beforehand and can also cope with a fictitious class (A) of roughly 17 million potential addresses.
As for speed, the two machines' processors differ very little; one of the two just has a black label logo on the front. I also think one of my friends swapped the processors and memory between the two machines during testing, without result.
Whether that is true I cannot say, as I did not personally see the swap (curiously, I also saw that the warranty seal on the machine was not cut).
I don't know; it is a curious problem. Maybe it is as you supposed:
"require specific sequencing of operations among threads and/or require atomic/critical section protection at the boundary of the array slice between threads."


Kind regards to all

Added:

Hi all,
(I passed this on to my friends who shared the problem.)
I have received an answer: the problem has now been found.
The first machine, where the problem occurred, has a fiber LAN card with one defective MAC channel (RX-TX).
My friends discovered it while working on an unrelated task; it has nothing to do with the O/S, compiler or libraries.
This experience shows how difficult it can sometimes be to find where a problem lies.
Making all arrays the same size can hide a LAN card hardware problem...

On top of that, those jokers make fun of me, saying it was my overly complicated programming that killed the card... I will add the time to the invoice now; maybe they will laugh less.

Sorry to all for my mistaken comparison; it was probably not similar to the problem posted.

..<Kind regards .. >> NACK .?.?


0 Kudos