Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

#pragma omp atomic vs InterlockedDecrement

AndrewC
New Contributor III
[cpp]#ifdef _USEWIN32LOCKAPI
	InterlockedDecrement(&refs_);
#else
#pragma omp atomic
	refs_--;
#endif
[/cpp]

My benchmarks have shown that InterlockedDecrement is much faster than using #pragma omp atomic.
Why? I would have thought the compiler could generate inline code here.
Composer XE 2011, Windows 64-bit.
Om_S_Intel
Employee
Could you please provide sample code that we can compile to review the issue?

Om
AndrewC
New Contributor III
This is my "sample" code. I am using the Composer XE 2011 Update 3 64-bit compiler.

It is pretty simple.

If you generate an assembly language listing, the OpenMP code calls

__kmpc_global_thread_num and
__kmpc_atomic_fixed4_add

while the "Windows" code seems to be generated as inline assembly.

[cpp]#include <windows.h>
LONG refs_=0;

void WINatomicAdd()
{

	InterlockedIncrement(&refs_);
}

void OMPatomicAdd()
{
#pragma omp atomic
  ++refs_;
}


[/cpp]
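For anyone who wants to reproduce the timing difference without Windows headers, here is a minimal sketch of a timing harness. It is not the benchmark used above. The names `omp_counter`, `std_counter`, `omp_atomic_inc`, `std_atomic_inc` and `time_ns` are hypothetical, and `std::atomic` is only an assumed stand-in for `InterlockedIncrement` (both are typically compiled to an inline locked instruction on x64, but that is worth checking in the assembly listing).

```cpp
#include <atomic>
#include <chrono>

// Hypothetical counters for this sketch; not taken from the original code.
static long omp_counter = 0;
static std::atomic<long> std_counter{0};

// Increment under "#pragma omp atomic". OpenMP must be enabled at compile
// time (e.g. /Qopenmp or -fopenmp); otherwise the pragma is ignored and
// this becomes a plain, non-atomic increment.
static void omp_atomic_inc() {
#pragma omp atomic
    ++omp_counter;
}

// Increment with std::atomic, used here as an assumed stand-in for
// InterlockedIncrement.
static void std_atomic_inc() {
    std_counter.fetch_add(1, std::memory_order_seq_cst);
}

// Call `op` `iterations` times and return the elapsed time in nanoseconds.
template <typename Op>
long long time_ns(Op op, long iterations) {
    auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        op();
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling `time_ns(omp_atomic_inc, n)` and `time_ns(std_atomic_inc, n)` with the same `n` gives two single-threaded timings to compare. Contended, multi-threaded numbers can look quite different, so treat the result as a rough indication only.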






Om_S_Intel
Employee
It looks like the OpenMP atomic is slower. You may use InterlockedIncrement instead.