
Using OpenMP in _Cilk_offload

王_子_
Beginner

I recently set out to port a complex CPU-based program to MIC. Because the program uses complex structs, I use _Cilk_shared to manage the pointers to them, and the program now runs successfully on the MIC. However, it uses only one core, so its performance is poor. I tried to parallelize the for loop with OpenMP [pragma omp parallel for], but the performance did not improve, and the information I print shows that the program still runs on only one core. Even after I annotated some functions, nothing changed. The test code is shown below:

_Cilk_shared void offloadfunction(worker_t w, bwt_t *bwtmic, uint8_t *pacmic, int n)
{
    int i = 0;
    // some complex data transport
    w.bwt = bwtmic;
    w.pac = pacmic;
    struct timeval tv1, tv2;
    struct timezone tz;
    gettimeofday(&tv1, &tz);

#pragma omp parallel for num_threads(200)
    for (i = 0; i < 1000000000; i++) {
        int j = 10;
        j = j * 10;
        // every 100000 iterations, print the processor count, team size, and thread id
        if (i % 100000 == 0)
            printf("\n %d %d %d \n", omp_get_num_procs(), omp_get_num_threads(), omp_get_thread_num());
    }

    gettimeofday(&tv2, &tz);
    // elapsed seconds; dividing by 1000000.0 keeps the microsecond part from being truncated by integer division
    float t = (tv2.tv_sec - tv1.tv_sec) + (tv2.tv_usec - tv1.tv_usec) / 1000000.0;
}

The output is:

 236 1 0 

 236 1 0 

 236 1 0 

 236 1 0 

 236 1 0 

 236 1 0 

This offloaded program uses only one core on the MIC. What can I do about it?

3 Replies
Rob_J_
Beginner

I wouldn't expend a lot of effort on Cilk.

 

https://software.intel.com/en-us/forums/intel-cilk-plus/topic/745556

 

王_子_
Beginner

Thank you very much. I found my mistake today.

Because the code is in a C file, the -qopenmp option I had added under Intel C++ Compiler in Eclipse did not apply to it. After I also added -qopenmp under Intel C Compiler, it works.

TimP
Honored Contributor III

In general, if you do only a moderate amount of work in the OpenMP section, you will see a KMP_BLOCKTIME delay before OpenMP releases its threads back to Cilk. With hyperthreading (and on MIC), this can be mitigated by limiting both OpenMP and Cilk to 1 thread or worker per core. Intel never recommended mixing Cilk Plus with OpenMP; as others mentioned, Cilk support has gone away. The inefficiency of Cilk on MIC might have contributed to the decision to drop support.

printf() should be serialized anyway, but the critical section is reasonable, keeping in mind that the region will take a significant time interval.
