[cpp]
#ifdef _USEWIN32LOCKAPI
    InterlockedDecrement(&refs_);
#else
    #pragma omp atomic
    refs_--;
#endif
[/cpp]
My benchmarks have shown that InterlockedDecrement is much faster than using #pragma omp atomic.
Why? I would think the compiler could generate inline code here.
Composer XE 2011, Windows 64-bit.
3 Replies
Could you please provide sample code that we can compile to review the issue?
Om
This is my "sample" code. I am using the Composer XE 2011 Update 3 64-bit compiler.
This is pretty simple.
If you generate an assembly language listing, the OpenMP code calls
__kmpc_global_thread_num and
__kmpc_atomic_fixed4_add,
while the "Windows" code seems to be generated inline.
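[cpp]
#include <windows.h>  // LONG and InterlockedIncrement

LONG refs_ = 0;

// Win32 path: the listing shows this being generated inline.
void WINatomicAdd()
{
    InterlockedIncrement(&refs_);
}

// OpenMP path: the listing shows calls into the OpenMP runtime.
void OMPatomicAdd()
{
    #pragma omp atomic
    ++refs_;
}
[/cpp]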
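For anyone who wants to reproduce the comparison, here is a minimal timing sketch. It is not the benchmark used in this thread: the helper name `nanosecondsPerCall` is hypothetical, and it assumes a C++11 toolchain with `<chrono>`. Only the OpenMP path is timed so that the snippet builds on any platform; on Windows, `WINatomicAdd` could be passed to the same helper. OpenMP has to be enabled at build time (for example `/Qopenmp` with the Intel compiler on Windows), otherwise the pragma is ignored and a plain increment is measured.

```cpp
#include <chrono>

long refs_ = 0;  // shared counter, standing in for the LONG in the sample

// Same body as OMPatomicAdd in the sample. Without OpenMP enabled the
// pragma is ignored and this is an ordinary, non-atomic increment.
void OMPatomicAdd()
{
    #pragma omp atomic
    ++refs_;
}

// Hypothetical helper (not from the thread): calls fn `iterations` times
// and returns the average elapsed time per call in nanoseconds.
template <typename Fn>
double nanosecondsPerCall(Fn fn, long iterations)
{
    auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        fn();
    }
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

Calling `nanosecondsPerCall(OMPatomicAdd, 10000000)` and then the same for the Win32 function gives two per-call figures to compare. This is a single-threaded measurement, so it shows only the call overhead, not behavior under contention.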
It looks like the OpenMP atomic is slower. You may use InterlockedIncrement instead.
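If portability beyond Win32 matters, one possible alternative is `std::atomic`. The sketch below is an assumption rather than something proposed in this thread: the class name `RefCount` and its member functions are hypothetical, and it needs a C++11 standard library with `<atomic>`, which the compiler version discussed here may not provide. On x86-64, mainstream compilers typically expand these operations inline to a lock-prefixed instruction, much like the Interlocked intrinsics.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference counter; names are illustrative only.
class RefCount {
public:
    // Increments the count and returns the new value,
    // mirroring the return convention of InterlockedIncrement.
    long acquire() noexcept
    {
        return refs_.fetch_add(1, std::memory_order_relaxed) + 1;
    }

    // Decrements the count and returns the new value,
    // mirroring the return convention of InterlockedDecrement.
    long release() noexcept
    {
        long previous = refs_.fetch_sub(1, std::memory_order_acq_rel);
        assert(previous > 0 && "release() called more often than acquire()");
        return previous - 1;
    }

    // Reads the current count.
    long count() const noexcept
    {
        return refs_.load(std::memory_order_acquire);
    }

private:
    std::atomic<long> refs_{0};
};
```

A caller would typically destroy the owned object when `release()` returns 0.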
