Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Can I use optimized code together with OpenMP directives?

bugslayer
Beginner
326 Views

Hi all.

I've been trying to use OpenMP with ICPC (VS 2008).

Everything works with optimization disabled, but the program fails with a runtime error under any optimization setting.

Is it true that optimization options cannot be combined with OpenMP directives?

My system has an Intel Core i7 920 (Nehalem) processor, and the OS is Windows 7 (x64).

(The program is also compiled for x64.)

Thanks.

4 Replies
Michael_K_Intel2
Employee
version :

11.1 (054)

flags used to compile the code :

/c /O2 /I "C:\Program Files (x86)\Intel\Compiler\11.1\054\MKL\Include" /I "C:\Program Files (x86)\Intel\Compiler\11.1\054\MKL\Include\fftw" /I "C:\Program Files (x86)\Intel\Compiler\11.1\054\IPP\em64t\Include" /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "NOMINMAX" /D "_VC80_UPGRADE=0x0710" /D "_MBCS" /EHsc /MD /GS /fp:fast /Fo"x64\release/" /W3 /nologo /Wp64 /Zi /Qopenmp

Thanks.

bugslayer
Beginner

This code works well with optimization enabled (Full Optimization):

[cpp]omp_set_num_threads(4);
int i,j,k;
#pragma omp parallel 
{			
	#pragma omp for private(i)
	for(i=1;i<=grid.m;i++){
		for(j=1;j<=grid.n;j++){
			for(k=1;k<=grid.mn;k++){
				TV_INT index(i,j,k);TV location=grid.Node(i,j,k);
				mls_phi_field(index)=mls.Get_Scalar(location);
			}
		}
	}
}[/cpp]

But this code fails at run time:
[cpp]omp_set_num_threads(4);
#pragma omp parallel
{
	#pragma omp single
	{
		for(CELL_ITERATOR iterator(grid);iterator.Valid();iterator.Next()){
			#pragma omp task firstprivate(iterator)
			{
				TV location=iterator.Location();TV_INT index=iterator.Cell_Index();
				mls_phi_field(index)=mls.Get_Scalar(location);
			}	
		}
	}
}[/cpp]
The same problem occurs on my other machine (Yorkfield Q9650).
bugslayer
Beginner

I'm really sorry. I tried to reply to your post, but it has been deleted! Why did this happen?

Anyway, my answer is:

ICPC 11.1 (054)

/c /O2 /I "C:\Program Files (x86)\Intel\Compiler\11.1\054\MKL\Include" /I "C:\Program Files (x86)\Intel\Compiler\11.1\054\MKL\Include\fftw" /I "C:\Program Files (x86)\Intel\Compiler\11.1\054\IPP\em64t\Include" /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "NOMINMAX" /D "_VC80_UPGRADE=0x0710" /D "_MBCS" /EHsc /MD /GS /fp:fast /Fo"x64\release/" /W3 /nologo /Wp64 /Zi /Qopenmp

I think the problem is caused by the "iterator" implementation, which resembles a standard-container iterator.

Is there anything I should keep in mind when writing this kind of code?

Michael_K_Intel2
Employee

Hi,

The first code fragment is wrong. The variables j and k are shared among all threads, which is probably not what you intended. You should add private(j,k) to the parallel construct so that each thread gets its own loop counters for the two innermost loops.
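
A minimal sketch of that fix, assuming a plain array in place of your grid and field classes (fill_field, the flat indexing and the stored value are made up for illustration, they are not from your code):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the grid/field in the post: fills an
// m x n x mn array with a value derived from the 1-based node index.
std::vector<int> fill_field(int m, int n, int mn)
{
    assert(m > 0 && n > 0 && mn > 0);
    std::vector<int> field(static_cast<std::size_t>(m) * n * mn, 0);

    int i, j, k;  // declared outside the parallel region, as in the post

    // j and k would be shared by default; private(j, k) gives every
    // thread its own copies for the two inner loops.
    #pragma omp parallel private(j, k)
    {
        // i is the loop variable of the work-shared loop, so OpenMP
        // already makes it private.
        #pragma omp for
        for (i = 1; i <= m; i++) {
            for (j = 1; j <= n; j++) {
                for (k = 1; k <= mn; k++) {
                    std::size_t flat =
                        (static_cast<std::size_t>(i - 1) * n + (j - 1)) * mn + (k - 1);
                    field[flat] = 100 * i + 10 * j + k;
                }
            }
        }
    }
    return field;
}
```

Declaring j and k inside the loop headers (for (int j = 1; ...)) would have the same effect, since variables declared inside the parallel region are private to each thread.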

In the second example, I would assume that you are better off computing location and index in the scope just before the task construct rather than inside the task. Though I cannot look into the iterator, I would assume that it contains some shared data that is not privatized by firstprivate.
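
Roughly what I have in mind, as a sketch only: CellIterator below is a hypothetical stand-in for your CELL_ITERATOR (I do not know its real implementation), and the value written into the field is a placeholder for mls.Get_Scalar.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical iterator; only the four member functions used in the
// post are modelled, with made-up bodies.
struct CellIterator {
    int pos, end;
    explicit CellIterator(int count) : pos(0), end(count) {}
    bool Valid() const { return pos < end; }
    void Next() { ++pos; }
    int Cell_Index() const { return pos; }
    double Location() const { return 0.5 * pos; }
};

std::vector<double> fill_with_tasks(int count)
{
    assert(count >= 0);
    std::vector<double> field(static_cast<std::size_t>(count), 0.0);

    #pragma omp parallel
    {
        #pragma omp single
        {
            for (CellIterator it(count); it.Valid(); it.Next()) {
                // Read everything from the iterator here, in the thread
                // that owns it, before the task is created.
                int index = it.Cell_Index();
                double location = it.Location();

                // The task captures two plain values instead of a copy
                // of the iterator object.
                #pragma omp task firstprivate(index, location) shared(field)
                {
                    field[index] = 2.0 * location;  // placeholder for Get_Scalar
                }
            }
        }  // implicit barrier: all tasks have finished past this point
    }
    return field;
}
```

This way the task never touches the iterator, so it does not matter whether the iterator's copy constructor duplicates its internal state or not.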

Cheers

-michael
