Intel® oneAPI Threading Building Blocks

Macro '__TBB_compiler_fence' is 'nop-ed' if TBB is compiled with Intel C/C++ compiler

SergeyKostrov
Valued Contributor II

Macro '__TBB_compiler_fence' is 'nop-ed' if TBB is compiled with the Intel C/C++ compiler.

Header file: windows_ia32.h

...
#if __INTEL_COMPILER
#define __TBB_compiler_fence() __asm { __asm nop }
#elif _MSC_VER >= 1300
extern "C" void _ReadWriteBarrier();
#pragma intrinsic( _ReadWriteBarrier )
#define __TBB_compiler_fence() _ReadWriteBarrier()
#else
...

Could somebody explain why the macro is 'nop-ed'?

1 Solution
Maxym_D_Intel
Employee
Just have a look at what the use of _ReadWriteBarrier actually produces at the code/asm level:
it behaves like a NOP.

Quote:

It is important to understand that _ReadWriteBarrier does not insert any additional instructions, and it does not prevent the CPU from rearranging reads and writes - it only prevents the compiler from rearranging them.


http://msdn.microsoft.com/en-us/library/ee418650%28v=vs.85%29.aspx
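
To make the distinction in the quote concrete, here is a minimal sketch in standard C++11 rather than with the TBB macro or the MSVC intrinsic. It assumes that std::atomic_signal_fence acts as a compiler-only fence (on mainstream compilers it typically emits no instruction) and that std::atomic_thread_fence also constrains the CPU. The helper names (compiler_only_fence, hardware_fence, publish, try_consume, round_trip) are hypothetical and are not part of TBB or MSDN.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helpers for illustration only; not TBB's implementation.

// Compiler-only fence: the compiler may not move memory accesses across it,
// but the CPU is free to reorder them. Typically compiles to no instruction.
inline void compiler_only_fence() {
    std::atomic_signal_fence(std::memory_order_seq_cst);
}

// Hardware fence: constrains both the compiler and the CPU.
inline void hardware_fence() {
    std::atomic_thread_fence(std::memory_order_seq_cst);
}

int g_payload = 0;                  // plain data handed from one thread to another
std::atomic<bool> g_ready{false};   // flag that announces the data

// Producer side: write the data, fence, then raise the flag.
void publish(int value) {
    g_payload = value;
    hardware_fence();               // a compiler-only fence would not be enough across threads
    g_ready.store(true, std::memory_order_relaxed);
}

// Consumer side: check the flag, fence, then read the data.
bool try_consume(int& out) {
    if (!g_ready.load(std::memory_order_relaxed)) {
        return false;
    }
    hardware_fence();
    out = g_payload;
    return true;
}

// Single-threaded round trip, just to exercise the two functions.
int round_trip(int value) {
    publish(value);
    int out = 0;
    const bool ok = try_consume(out);
    assert(ok);
    (void)ok;                       // avoid an unused-variable warning under NDEBUG
    return out;
}
```

Calling round_trip(42) publishes 42 and reads it back. Across real threads, the hardware fence is what makes the handoff safe; a compiler-only fence only keeps the compiler from reordering the two writes.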

SergeyKostrov
Valued Contributor II
...
It is important to understand that _ReadWriteBarrier does not insert any additional instructions
...


Interesting, because my local version of MSDN, installed with Visual Studio 2005, doesn't have that
statement. I'll take a look at what code Visual Studio 2005, 2008 and 2010 generate for a
simple test case.

SergeyKostrov
Valued Contributor II
Just have a look at what the use of _ReadWriteBarrier actually produces at the code/asm level:
it behaves like a NOP.

[SergeyK] Yes, I did look.

Quote:

It is important to understand that _ReadWriteBarrier does not insert any additional instructions

[SergeyK] Yes, I confirm this. It rather "ignores" some instructions that could create an Access
Violation.

, and it does not prevent the CPU from rearranging reads and writes - it only prevents the compiler from rearranging them.

http://msdn.microsoft.com/en-us/library/ee418650%28v=vs.85%29.aspx


It looks like a very tricky and unreliable feature in terms of portability between different
platforms and C/C++ compilers. The MS C/C++ compiler ( Visual Studio 2005 ) has NOT generated a 'nop'
assembler instruction. Also, that feature is NOT available if ALL optimizations are disabled.

I wouldn't rely on the '_ReadWriteBarrier' intrinsic function in a highly portable C/C++ library,
because it doesn't force a software developer to fix a possible problem in code that could create
an Access Violation, like:

...
RTint *pData = RTnull;
g_iVariable = *pData;
g_iVariable = 7;
...

I understand that software developers on the TBB project could have some different considerations
regarding the '_ReadWriteBarrier' intrinsic function.

Two test cases are provided below; take a look if you're interested:

Note: I tested with Visual Studio 2005

...
#include <intrin.h>

#pragma intrinsic( _ReadWriteBarrier )

RTint g_iVariable = 0; // Must be Declared as global!
...

Test-Case 1 - _USE_READWRITEBARRIER is NOT defined

...
// #define _USE_READWRITEBARRIER

// Test-Case 1
{
RTint *pData = RTnull; // Instruction is NOT generated
g_iVariable = *pData; // Instruction is NOT generated
#if defined ( _USE_READWRITEBARRIER )
_ReadWriteBarrier();
#endif
g_iVariable = 7;
0040189F mov dword ptr [g_iVariable (5F29C4h)], 7 // Instruction is generated
}
...

Test-Case 2 - _USE_READWRITEBARRIER is defined

...
#define _USE_READWRITEBARRIER

// Test-Case 2
{
RTint *pData = RTnull;
0040189C xor eax, eax
g_iVariable = *pData;
0040189E mov eax, dword ptr [eax]// Unhandled exception error ( see below )
004018A0 add esp, 0Ch
004018A3 mov dword ptr [g_iVariable (5F29C4h)], eax
#if defined ( _USE_READWRITEBARRIER )
_ReadWriteBarrier();
#endif
g_iVariable = 7;
004018A8 mov dword ptr [g_iVariable (5F29C4h)], 7
}
...

Unhandled exception error:

Unhandled exception at 0x0040189e in ScaLibTestAppD.exe: 0xC0000005: Access violation
reading location 0x00000000.

Some of my C/C++ compiler command line options are as follows:

/O2 /GF /Gm /EHsc /MTd /openmp /W4 /nologo /c /Zi /TP /errorReport:prompt

Optimization for Speed
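
One plausible reading of the two test cases above is ordinary dead-store elimination rather than the barrier "ignoring" anything. Without the barrier, the first store to g_iVariable is immediately overwritten, so the optimizer may drop it along with the load through the pointer. With the barrier, the first store has to be completed before the barrier, so the load is emitted and a null pointer faults. The sketch below restates that with a valid pointer. The COMPILER_FENCE macro, g_value and both function names are hypothetical, and the non-MSVC branch assumes a GCC- or Clang-compatible compiler.

```cpp
#include <cassert>

#if defined(_MSC_VER)
#include <intrin.h>
// MSVC: the same intrinsic as in the thread.
#define COMPILER_FENCE() _ReadWriteBarrier()
#else
// Assumption: GCC/Clang-style empty asm with a "memory" clobber as the compiler-only fence.
#define COMPILER_FENCE() __asm__ __volatile__("" ::: "memory")
#endif

int g_value = 0;  // global, as in the test cases

// No fence: the first store is dead, so an optimizing compiler may remove it,
// and with it the load from *src.
void store_twice_no_fence(const int* src) {
    assert(src != nullptr);   // the real fix for the null-pointer bug is a check like this
    g_value = *src;
    g_value = 7;
}

// With fence: the first store must be visible before the fence, so the
// compiler has to keep both the load from *src and the store.
void store_twice_with_fence(const int* src) {
    assert(src != nullptr);
    g_value = *src;
    COMPILER_FENCE();
    g_value = 7;
}
```

Both functions leave g_value at 7. The difference only shows in the generated code of an optimized build, which matches the /O2 observations above.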

RafSchietekat
Valued Contributor III
This whole thread illustrates why internal features should not be accessed directly: other than not being supported across versions, they are also undocumented, requiring relevant knowledge and understanding (there is no issue here).

The example further illustrates why nonoptimised as well as optimised builds should always be tested.
SergeyKostrov
Valued Contributor II
Thank you guys for your feedback!