AndrewC
New Contributor I
74 Views

#pragma omp atomic vs InterlockedDecrement

[cpp]#ifdef _USEWIN32LOCKAPI
	InterlockedDecrement(&refs_);
#else
#pragma omp atomic
	refs_--;
#endif
[/cpp]

My benchmarks have shown that InterlockedDecrement is much faster than using #pragma omp atomic.
Why? I would think the compiler can generate inline code here?
Composer XE 2011, Windows 64.
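For anyone who wants to reproduce this kind of comparison, here is a minimal timing sketch. It is not the benchmark from the post above. It assumes `std::atomic` as a portable stand-in for InterlockedDecrement, and the names `omp_refs`, `std_refs`, `omp_decrement`, `std_decrement`, and `time_calls` are made up for illustration. If OpenMP is not enabled at compile time, the pragma is ignored and the first function is a plain decrement.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical shared counters for this sketch (not from the original post).
static long omp_refs = 0;
static std::atomic<long> std_refs{0};

// Decrement through the OpenMP atomic construct.
// Without OpenMP enabled, the pragma is ignored and this is a plain decrement.
void omp_decrement() {
#pragma omp atomic
    omp_refs--;
}

// Decrement through std::atomic, used here as an assumed portable
// stand-in for the Win32 InterlockedDecrement call.
void std_decrement() {
    std_refs.fetch_sub(1);
}

// Call fn() `iterations` times and return the elapsed time in nanoseconds.
template <typename Fn>
std::int64_t time_calls(Fn fn, long iterations) {
    auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        fn();
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling `time_calls(omp_decrement, N)` and `time_calls(std_decrement, N)` with the same `N` gives two numbers to compare. A single-threaded loop like this only measures uncontended cost, so results under real contention may differ.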
3 Replies
Om_S_Intel
Employee
74 Views

Could you please provide sample code that we can compile to review the issue?

Om
AndrewC
New Contributor I
74 Views

This is my "sample" code. I am using Composer XE 2011 Update 3 64-bit compiler.

This is pretty simple.

If you generate an assembly language listing, the OpenMP code calls

__kmpc_global_thread_num and
__kmpc_atomic_fixed4_add

while the "Windows" code seems to be generated as "inline" assembly.

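[cpp]#include <windows.h>

// Shared counter updated by both variants.
LONG refs_ = 0;

// Win32 interlocked API.
void WINatomicAdd()
{
	InterlockedIncrement(&refs_);
}

// OpenMP atomic construct.
void OMPatomicAdd()
{
#pragma omp atomic
	++refs_;
}
[/cpp]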






Om_S_Intel
Employee
74 Views

It looks like the OpenMP atomic is slower. You may use InterlockedIncrement instead.
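If the code also has to build on platforms without the Win32 API, one possible alternative is `std::atomic`. The sketch below is only an illustration and assumes a compiler with C++11 `<atomic>` support, which may not apply to the toolchain discussed in this thread. The class name `RefCount` and its member functions are hypothetical; only the member name `refs_` comes from the original post.

```cpp
#include <atomic>

// Hypothetical reference-count holder; the class name and methods
// are assumptions made for this sketch.
class RefCount {
public:
    // Atomically increment and return the new value,
    // mirroring the return convention of InterlockedIncrement.
    long add_ref() { return refs_.fetch_add(1) + 1; }

    // Atomically decrement and return the new value,
    // mirroring the return convention of InterlockedDecrement.
    long release() { return refs_.fetch_sub(1) - 1; }

    // Read the current count.
    long get() const { return refs_.load(); }

private:
    std::atomic<long> refs_{0};
};
```

`fetch_add` and `fetch_sub` return the previous value, so the sketch adjusts by one to report the updated count.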