*** Latency and Throughput of Intel CPUs 'clflush' instruction ***
[ Abstract ]
This thread investigates the latency and throughput of the 'clflush' instruction on Intel CPUs.
The instruction was introduced with SSE2 ( IRT-Domain ) and can be executed speculatively. Measuring its latency is a real challenge because it is up to the CPU when the instruction is actually executed.
IRT-Domain - SSE2 - [ emmintrin.h ]
...
extern void __ICL_INTRINCC _mm_clflush( void const *p );
...
IRT - Intrinsics Run-Time
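A minimal sketch of how the intrinsic declared above is typically called, assuming an x86 target with SSE2 and a compiler that ships 'emmintrin.h'. The helper name 'flushLine' is hypothetical and is Not part of the test framework used in this thread.

```cpp
#include <cassert>
#include <emmintrin.h>   // _mm_clflush, _mm_mfence ( SSE2 )

// Hypothetical helper: stores a value, flushes the cache line that holds it
// and reads the value back afterwards.
inline int flushLine( int *p, int iValue )
{
    *p = iValue;          // modify ( dirty ) the cache line
    _mm_mfence();         // conservative: order the store before the flush
    _mm_clflush( p );     // flush the line that contains *p from the cache hierarchy
    _mm_mfence();         // order the flush before the read below
    return *p;            // read back; the line normally has to be fetched again
}
```

The two fences are a conservative choice for the sketch. They are Not required for correctness of the value that is returned.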
[ Notes on the objectives of the investigation ( a small R&D work ) ]
1. Intel does Not provide any numbers for the latency of the CLFLUSH instruction.
2. Discussions about the latency of the CLFLUSH instruction are highly speculative because it is
Not clear when the instruction is actually executed.
3. Some discussions about the latency of the CLFLUSH instruction do Not take into account that
it flushes data to the main memory ( RAM ), whose latency is usually known. It is Not
clear when a cache line really becomes available for another hardware or software prefetch of
data or instructions, or whether it becomes available before (!) the main memory is
updated with the modified data.
4. It is more important to understand how C++ compilers can generate binary code that is as
efficient as possible in order to achieve the highest throughput of a sequence of CLFLUSH
instructions.
5. It is shown later that inefficient binary code generated by a C++ compiler can reduce the
throughput of a sequence of CLFLUSH instructions.
6. Three types of binary code generation are possible:
- Type-1: Based on a 'clflush [ebp-offset]' instruction using the general purpose register 'ebp'
- Type-2: Based on a 'clflush [eXx]' instruction using a general purpose register 'eXx'
- Type-3: Composite, when the 'clflush' instruction is generated in a small Not inline function
[ Intel CLFLUSH instruction Opcodes ]
0F AE 38................clflush [eax]
0F AE 3B................clflush [ebx]
0F AE 39................clflush [ecx]
0F AE 3A................clflush [edx]
0F AE BD [offset]....clflush [ebp-offset]
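The third opcode byte in the table above is the ModRM byte. CLFLUSH is encoded as '0F AE /7', so the 'reg' field of ModRM is always 7 and only the 'mod' and 'rm' fields change. A small sketch that reproduces the bytes listed above ( the names 'Reg32' and 'clflushModRM' are made up for this illustration ):

```cpp
#include <cassert>
#include <cstdint>

// Register numbers as used in the ModRM 'rm' field ( 32-bit addressing ).
enum Reg32 : std::uint8_t { EAX = 0, ECX = 1, EDX = 2, EBX = 3, ESP = 4, EBP = 5, ESI = 6, EDI = 7 };

// ModRM layout: mod ( bits 7-6 ), reg ( bits 5-3 ), rm ( bits 2-0 ).
// mod = 0 selects [reg] without a displacement, mod = 2 selects [reg+disp32].
// Note: with mod = 0 the values rm = 4 ( SIB byte follows ) and rm = 5
// ( disp32 without a base register ) have a special meaning.
constexpr std::uint8_t clflushModRM( std::uint8_t mod, Reg32 rm )
{
    return static_cast< std::uint8_t >( ( mod << 6 ) | ( 7u << 3 ) | rm );
}
```

For 'clflush [ebp-offset]' the negative offset follows the ModRM byte as a 32-bit two's complement displacement.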
[ Test Case - IrtClflush & CrtClflush ]
...
RTint piAddress[10][16] =
{
{ 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11 }, // 0
{ 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22 }, // 1
{ 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33 }, // 2
{ 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44 }, // 3
{ 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77 }, // 4
{ 0x88, 0x88, 0x88, 0x88, 0x88, 0x88, 0x88, 0x88, 0x88, 0x88, 0x88, 0x88, 0x88, 0x88, 0x88, 0x88 }, // 5
{ 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44 }, // 6
{ 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33 }, // 7
{ 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22 }, // 8
{ 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11 }, // 9
};
IrtClflush( &piAddress[0][0] );
CrtClflush( &piAddress[1][0] );
CrtSetThreadPriority( THREADPRIORITY_REALTIME );
CrtPrefetchData( ( RTchar * )&piAddress[0][0] ); // All prefetches are T0-type
CrtPrefetchData( ( RTchar * )&piAddress[1][0] );
CrtPrefetchData( ( RTchar * )&piAddress[2][0] );
CrtPrefetchData( ( RTchar * )&piAddress[3][0] );
CrtPrefetchData( ( RTchar * )&piAddress[4][0] );
CrtPrefetchData( ( RTchar * )&piAddress[5][0] );
CrtPrefetchData( ( RTchar * )&piAddress[6][0] );
CrtPrefetchData( ( RTchar * )&piAddress[7][0] );
CrtPrefetchData( ( RTchar * )&piAddress[8][0] );
CrtPrefetchData( ( RTchar * )&piAddress[9][0] );
RTuint64 uiClock1 = CrtRdtsc();
CrtClflush( &piAddress[0][0] );
CrtClflush( &piAddress[1][0] );
CrtClflush( &piAddress[2][0] );
CrtClflush( &piAddress[3][0] );
CrtClflush( &piAddress[4][0] );
CrtClflush( &piAddress[5][0] );
CrtClflush( &piAddress[6][0] );
CrtClflush( &piAddress[7][0] );
CrtClflush( &piAddress[8][0] );
CrtClflush( &piAddress[9][0] );
RTuint64 uiClock2 = CrtRdtsc();
CrtPrintf( RTU("[ CrtClflush ] - Executed in %u clock cycles\n"),
( RTuint )( uiClock2 - uiClock1 ) / 10 );
CrtSetThreadPriority( THREADPRIORITY_NORMAL );
CrtPrintf( RTU("IrtClflush & CrtClflush\n") );
...
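The test case above depends on the Crt* wrappers of the test framework, which are Not shown in this thread. A self-contained sketch of the same idea with plain intrinsics could look as follows. It assumes an x86 target with SSE2 and 4-byte 'int', and the function name 'averageClflushTicks' is hypothetical. A loop is used for brevity, so a compiler may or may Not unroll it into the straight sequence of 10 instructions used in the original test.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>   // _mm_clflush ( SSE2 )
#include <xmmintrin.h>   // _mm_prefetch, _MM_HINT_T0
#ifdef _MSC_VER
#include <intrin.h>      // __rdtsc
#else
#include <x86intrin.h>   // __rdtsc
#endif

const int kLines = 10;   // number of cache lines flushed per measurement

// Returns the average number of TSC ticks per CLFLUSH over kLines flushes.
inline std::uint64_t averageClflushTicks()
{
    // 10 rows of 16 ints: one 64-byte cache line per row ( assumes 4-byte int )
    alignas( 64 ) static int aiData[kLines][16] = {};

    // Bring all lines into the cache first ( T0-type prefetch, as in the test case )
    for( int i = 0; i < kLines; i++ )
        _mm_prefetch( reinterpret_cast< const char * >( &aiData[i][0] ), _MM_HINT_T0 );

    const std::uint64_t uiClock1 = __rdtsc();
    for( int i = 0; i < kLines; i++ )
        _mm_clflush( &aiData[i][0] );
    const std::uint64_t uiClock2 = __rdtsc();

    return ( uiClock2 - uiClock1 ) / kLines;
}
```

The sketch does Not change the thread priority and does Not serialize RDTSC, so its numbers are Not directly comparable with the results reported in this thread.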
[ Watcom C++ compiler - Generated binary code - No re-ordering of instructions ]
...
00403737 lea eax, [ebp-8AEh]
0040373D prefetcht0 [eax]
00403740 lea eax, [ebp-86Eh]
00403746 prefetcht0 [eax]
00403749 lea eax, [ebp-82Eh]
0040374F prefetcht0 [eax]
00403752 lea eax, [ebp-7EEh]
00403758 prefetcht0 [eax]
0040375B lea eax, [ebp-7AEh]
00403761 prefetcht0 [eax]
00403764 lea eax, [ebp-76Eh]
0040376A prefetcht0 [eax]
0040376D lea eax, [ebp-72Eh]
00403773 prefetcht0 [eax]
00403776 lea eax, [ebp-6EEh]
0040377C prefetcht0 [eax]
0040377F lea eax, [ebp-6AEh]
00403785 prefetcht0 [eax]
00403788 lea eax, [ebp-66Eh]
0040378E prefetcht0 [eax]
00403791 rdtsc
00403793 mov ecx, eax
00403795 lea eax, [ebp-8AEh]
0040379B clflush [eax]
0040379E lea eax, [ebp-86Eh]
004037A4 clflush [eax]
004037A7 lea eax, [ebp-82Eh]
004037AD clflush [eax]
004037B0 lea eax, [ebp-7EEh]
004037B6 clflush [eax]
004037B9 lea eax, [ebp-7AEh]
004037BF clflush [eax]
004037C2 lea eax, [ebp-76Eh]
004037C8 clflush [eax]
004037CB lea eax, [ebp-72Eh]
004037D1 clflush [eax]
004037D4 lea eax, [ebp-6EEh]
004037DA clflush [eax]
004037DD lea eax, [ebp-6AEh]
004037E3 clflush [eax]
004037E6 lea eax, [ebp-66Eh]
004037EC clflush [eax]
004037EF rdtsc
004037F1 xor edx, edx
004037F3 sub eax, ecx
...
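Note that the listing above keeps only the low 32 bits of the time stamp ( 'mov ecx, eax' after the 1st RDTSC, 'xor edx, edx' and 'sub eax, ecx' after the 2nd one ). A small sketch of why this is still correct for short intervals ( the function names are made up for this illustration ):

```cpp
#include <cassert>
#include <cstdint>

// Difference of two time stamps when only the low 32 bits ( 'eax' ) are kept.
// Unsigned arithmetic wraps modulo 2^32, so the result is correct as long as
// fewer than 2^32 ticks elapsed between the two readings.
constexpr std::uint32_t tscDelta32( std::uint32_t uiStartLow, std::uint32_t uiEndLow )
{
    return uiEndLow - uiStartLow;
}

// Average per instruction, as printed by the test case ( 10 CLFLUSH instructions ).
constexpr std::uint32_t averageTicks( std::uint32_t uiDelta, std::uint32_t uiCount )
{
    return uiDelta / uiCount;
}
```

For example, a start value of 0xFFFFFFF0 and an end value of 0x00000010 still give a difference of 0x20 ticks even though the low half of the counter wrapped around.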
[ Binary code generated by C++ compilers - Short Summary ]
[ Microsoft C++ compiler ]
A - optimized
...
clflush [ebp-100h]
...
B - non-optimized
...
mov eax, dword ptr [ebp+8]
clflush [eax]
...
[ Borland C++ compiler ]
A - optimized
...
mov edx, dword ptr [ebp-3D0h]
clflush [edx]
...
B - non-optimized ( in a small Not inline function )
...
push ebp
mov ebp, esp
mov eax, dword ptr [ebp+8]
clflush [eax]
pop ebp
ret
...
[ Intel C++ compiler ]
A - optimized
...
clflush [ebp-638h]
...
B - non-optimized
N/A
[ MinGW C++ compiler ]
A - optimized
...
mov edx, dword ptr [ebp-338h]
clflush [edx]
...
B - non-optimized
N/A
[ Watcom C++ compiler ]
A - optimized
...
mov eax, dword ptr [ebp-194h]
clflush [eax]
...
B - non-optimized
N/A
[ Run-Time testing - Extended Tracing - No ]
[ Microsoft C++ compiler ]
...
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
...
Here is the generated binary code:
...
00244486 rdtsc
00244488 clflush [ebp-300h]
0024448F clflush [ebp-240h]
00244496 clflush [ebp-180h]
0024449D mov dword ptr [ebp-48h], eax
002444A0 clflush [ebp-340h]
002444A7 clflush [ebp-280h]
002444AE clflush [ebp-1C0h]
002444B5 clflush [ebp-100h]
002444BC mov dword ptr [ebp-44h], edx
002444BF clflush [ebp-2C0h]
002444C6 clflush [ebp-200h]
002444CD clflush [ebp-140h]
002444D4 rdtsc
...
[ Run-Time testing - Extended Tracing - No ]
[ Borland C++ compiler ]
...
[ CrtClflush ] - Executed in 96 clock cycles
[ CrtClflush ] - Executed in 91 clock cycles
[ CrtClflush ] - Executed in 93 clock cycles
[ CrtClflush ] - Executed in 96 clock cycles
[ CrtClflush ] - Executed in 96 clock cycles
[ CrtClflush ] - Executed in 96 clock cycles
[ CrtClflush ] - Executed in 90 clock cycles
[ CrtClflush ] - Executed in 91 clock cycles
[ CrtClflush ] - Executed in 94 clock cycles
[ CrtClflush ] - Executed in 84 clock cycles
...
Here is the generated binary code:
...
0040417A call CrtRdtsc (406D6Ch)
0040417F mov dword ptr [ebp-230h], eax
00404185 mov dword ptr [ebp-22Ch], edx
0040418B lea ecx, [ebp-0BD0h]
00404191 push ecx
00404192 call CrtClflush (40123Ch)
00404197 pop ecx
00404198 lea eax, [ebp-0B90h]
0040419E push eax
0040419F call CrtClflush (40123Ch)
004041A4 pop ecx
004041A5 lea edx, [ebp-0B50h]
004041AB push edx
004041AC call CrtClflush (40123Ch)
004041B1 pop ecx
004041B2 lea ecx, [ebp-0B10h]
004041B8 push ecx
004041B9 call CrtClflush (40123Ch)
004041BE pop ecx
004041BF lea eax, [ebp-0AD0h]
004041C5 push eax
004041C6 call CrtClflush (40123Ch)
004041CB pop ecx
004041CC lea edx, [ebp-0A90h]
004041D2 push edx
004041D3 call CrtClflush (40123Ch)
004041D8 pop ecx
004041D9 lea ecx, [ebp-0A50h]
004041DF push ecx
004041E0 call CrtClflush (40123Ch)
004041E5 pop ecx
004041E6 lea eax, [ebp-0A10h]
004041EC push eax
004041ED call CrtClflush (40123Ch)
004041F2 pop ecx
004041F3 lea edx, [ebp-9D0h]
004041F9 push edx
004041FA call CrtClflush (40123Ch)
004041FF pop ecx
00404200 lea ecx, [ebp-990h]
00404206 push ecx
00404207 call CrtClflush (40123Ch)
0040420C pop ecx
0040420D call CrtRdtsc (406D6Ch)
00404212 mov dword ptr [ebp-238h], eax
00404218 mov dword ptr [ebp-234h], edx
...
...
// CrtRdtsc (406D6Ch)
00406D6C rdtsc
00406D6E ret
...
...
// CrtClflush (40123Ch)
0040123C push ebp
0040123D mov ebp, esp
0040123F mov eax, dword ptr [ebp+8]
00401242 clflush [eax]
00401245 pop ebp
00401246 ret
...
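Note: This is the worst case. It is caused by how the CLFLUSH and RDTSC wrappers are implemented in software: both are compiled as small Not inline functions, so every measurement also includes the 'push', 'call', 'ret' and 'pop' overhead.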
[ Run-Time testing - Extended Tracing - No ]
[ Intel C++ compiler ]
...
[ CrtClflush ] - Executed in 20 clock cycles
[ CrtClflush ] - Executed in 23 clock cycles
[ CrtClflush ] - Executed in 24 clock cycles
[ CrtClflush ] - Executed in 24 clock cycles
[ CrtClflush ] - Executed in 20 clock cycles
[ CrtClflush ] - Executed in 19 clock cycles
[ CrtClflush ] - Executed in 19 clock cycles
[ CrtClflush ] - Executed in 22 clock cycles
[ CrtClflush ] - Executed in 19 clock cycles
[ CrtClflush ] - Executed in 18 clock cycles
...
The question is: why is it slower than the code generated by the Microsoft or Watcom C++ compilers?
Here is the generated binary code:
...
0040365C rdtsc
0040365E clflush [ebp-8B8h]
00403665 mov ecx, eax
00403667 clflush [ebp-878h]
0040366E clflush [ebp-838h]
00403675 clflush [ebp-7F8h]
0040367C clflush [ebp-7B8h]
00403683 clflush [ebp-778h]
0040368A clflush [ebp-738h]
00403691 clflush [ebp-6F8h]
00403698 clflush [ebp-6B8h]
0040369F clflush [ebp-678h]
004036A6 rdtsc
...
1. The Intel C++ compiler re-ordered the sequence of instructions.
2. 'mov ecx, eax', which saves the value returned by 'RDTSC' in the 'eax' general purpose register, is placed after the 1st 'clflush [ebp-8B8h]'.
3. It is possible that pipelining is affected ( Very Likely! ), or that an instruction stall is created ( Not proven and speculative! ).
4. Take a look at the perfectly generated binary code of the Watcom C++ compiler ( see Post #6 ).
5. Almost the same re-ordering is done by the Microsoft C++ compiler.
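Since the placement of instructions around RDTSC matters here, one common way ( Not the method used in this thread ) to make a measurement less sensitive to re-ordering by the CPU is to fence it. The sketch below assumes an x86 target with SSE2, and the names 'fencedRdtsc' and 'fencedFlushTicks' are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>   // _mm_clflush, _mm_lfence, _mm_mfence ( SSE2 )
#ifdef _MSC_VER
#include <intrin.h>      // __rdtsc
#else
#include <x86intrin.h>   // __rdtsc
#endif

// Reads the time stamp counter between two LFENCE instructions.
inline std::uint64_t fencedRdtsc()
{
    _mm_lfence();                            // earlier instructions complete locally first
    const std::uint64_t uiTsc = __rdtsc();
    _mm_lfence();                            // later instructions do not start before RDTSC
    return uiTsc;
}

// Returns the TSC ticks spent on one CLFLUSH including the trailing MFENCE.
inline std::uint64_t fencedFlushTicks( const void *p )
{
    _mm_mfence();                            // drain earlier memory operations
    const std::uint64_t uiStart = fencedRdtsc();
    _mm_clflush( p );
    _mm_mfence();                            // CLFLUSH is ordered by MFENCE
    const std::uint64_t uiEnd = fencedRdtsc();
    return uiEnd - uiStart;
}
```

The fences have a cost of their own, so the result is an upper bound for a single flush and is Not comparable with the throughput numbers of the unfenced test case.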
[ Run-Time testing - Extended Tracing - No ]
[ MinGW C++ compiler ]
...
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
...
Here is the generated binary code:
...
0040265B rdtsc
0040265D mov esi, eax
0040265F clflush [ebp-2B8h]
00402666 clflush [ebp-278h]
0040266D clflush [ebp-238h]
00402674 clflush [ebp-1F8h]
0040267B clflush [ebp-1B8h]
00402682 clflush [ebp-178h]
00402689 clflush [ebp-138h]
00402690 clflush [ebp-0F8h]
00402697 clflush [ebp-0B8h]
0040269E clflush [ebp-78h]
004026A2 rdtsc
...
Perfect binary code generation.
[ Run-Time testing - Extended Tracing - No ]
[ Watcom C++ compiler ]
...
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
[ CrtClflush ] - Executed in 12 clock cycles
...
Here is the generated binary code:
...
00403791 rdtsc
00403793 mov ecx, eax
00403795 lea eax, [ebp-8AEh]
0040379B clflush [eax]
0040379E lea eax, [ebp-86Eh]
004037A4 clflush [eax]
004037A7 lea eax, [ebp-82Eh]
004037AD clflush [eax]
004037B0 lea eax, [ebp-7EEh]
004037B6 clflush [eax]
004037B9 lea eax, [ebp-7AEh]
004037BF clflush [eax]
004037C2 lea eax, [ebp-76Eh]
004037C8 clflush [eax]
004037CB lea eax, [ebp-72Eh]
004037D1 clflush [eax]
004037D4 lea eax, [ebp-6EEh]
004037DA clflush [eax]
004037DD lea eax, [ebp-6AEh]
004037E3 clflush [eax]
004037E6 lea eax, [ebp-66Eh]
004037EC clflush [eax]
004037EF rdtsc
...
Perfect binary code generation.
[ Run-Time testing - Extended Tracing - No - Summary ]
Let's consider three cases for the 'clflush' instruction on Intel CPUs:
1. Perfect binary code generation that achieves the highest throughput:
MinGW C++ compiler ( rating is 10 out of 10 )
Watcom C++ compiler ( rating is 9 out of 10 )
Microsoft C++ compiler ( rating is 8 out of 10 )
2. Very good binary code generation that achieves very good throughput:
Intel C++ compiler ( rating is 5 out of 10 )
3. Good binary code generation but poor throughput ( Not optimized implementation! ):
Borland C++ compiler ( rating is 3 out of 10 )
[ Run-Time testing - Extended Tracing - Yes ]
[ Microsoft C++ compiler ]
Application - ScaLibTestApp - WIN32_MSC ( 32-bit ) - Release
Tests: Start
> Test0001 Start <
**********************************************
Configuration - WIN32_MSC ( 32-bit ) - Release
CTestSet::InitTestEnv - Passed
* CRuntimeSet Start *
...
[ CrtSetThreadPriority ] - Executed in 2896 clock cycles
[ CrtClflush ] - Executed in 84 clock cycles
[ CrtClflush ] - Executed in 104 clock cycles
[ CrtClflush ] - Executed in 104 clock cycles
[ CrtClflush ] - Executed in 116 clock cycles
[ CrtClflush ] - Executed in 104 clock cycles
[ CrtClflush ] - Executed in 104 clock cycles
[ CrtClflush ] - Executed in 92 clock cycles
[ CrtClflush ] - Executed in 92 clock cycles
[ CrtClflush ] - Executed in 92 clock cycles
[ CrtClflush ] - Executed in 198725 clock cycles
[ CrtSetThreadPriority ] - Executed in 3280 clock cycles
IrtClflush & CrtClflush
...
* CRuntimeSet End *
Test Completed in 7140 ticks
> Test0001 End <
Tests: Completed
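The last sample of the run above ( 198725 clock cycles ) is orders of magnitude larger than the other nine, so a plain average of the ten samples is dominated by it. A minimal sketch of how such a log could be summarized with a median instead ( the function names are hypothetical and a non-empty set of samples is assumed ):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Integer mean of all samples; sensitive to outliers. Requires a non-empty vector.
inline std::uint64_t meanTicks( const std::vector< std::uint64_t > &v )
{
    return std::accumulate( v.begin(), v.end(), std::uint64_t( 0 ) ) / v.size();
}

// Median of all samples; works on a sorted copy. Requires a non-empty vector.
inline std::uint64_t medianTicks( std::vector< std::uint64_t > v )
{
    std::sort( v.begin(), v.end() );
    const std::size_t n = v.size();
    return ( n % 2 ) ? v[n / 2] : ( v[n / 2 - 1] + v[n / 2] ) / 2;
}

// Samples copied from the Microsoft C++ compiler log above.
inline std::vector< std::uint64_t > mscSamples()
{
    return { 84, 104, 104, 116, 104, 104, 92, 92, 92, 198725 };
}
```

For these samples the median is 104 clock cycles, while the integer mean is 19961 clock cycles.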
[ Run-Time testing - Extended Tracing - Yes ]
[ Borland C++ compiler ]
Application - BccTestApp - WIN32_BCC ( 32-bit ) - Release
Tests: Start
> Test0001 Start <
**********************************************
Configuration - WIN32_BCC ( 32-bit ) - Release
CTestSet::InitTestEnv - Passed
* CRuntimeSet Start *
...
[ CrtSetThreadPriority ] - Executed in 28364 clock cycles
[ CrtClflush ] - Executed in 120 clock cycles
[ CrtClflush ] - Executed in 368 clock cycles
[ CrtClflush ] - Executed in 100 clock cycles
[ CrtClflush ] - Executed in 368 clock cycles
[ CrtClflush ] - Executed in 100 clock cycles
[ CrtClflush ] - Executed in 308 clock cycles
[ CrtClflush ] - Executed in 376 clock cycles
[ CrtClflush ] - Executed in 372 clock cycles
[ CrtClflush ] - Executed in 112 clock cycles
[ CrtClflush ] - Executed in 156735 clock cycles
[ CrtSetThreadPriority ] - Executed in 11976 clock cycles
IrtClflush & CrtClflush
...
* CRuntimeSet End *
Test Completed in 9234 ticks
> Test0001 End <
Tests: Completed
[ Run-Time testing - Extended Tracing - Yes ]
[ Intel C++ compiler ]
Application - IccTestApp - WIN32_ICC ( 32-bit ) - Release
Tests: Start
> Test0001 Start <
**********************************************
Configuration - WIN32_ICC ( 32-bit ) - Release
CTestSet::InitTestEnv - Passed
* CRuntimeSet Start *
...
[ CrtSetThreadPriority ] - Executed in 2400 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 221809 clock cycles
[ CrtSetThreadPriority ] - Executed in 6548 clock cycles
IrtClflush & CrtClflush
...
* CRuntimeSet End *
Test Completed in 4516 ticks
> Test0001 End <
Tests: Completed
[ Run-Time testing - Extended Tracing - Yes ]
[ MinGW C++ compiler ]
Application - MgwTestApp - WIN32_MGW ( 32-bit ) - Release
Tests: Start
> Test0001 Start <
**********************************************
Configuration - WIN32_MGW ( 32-bit ) - Release
CTestSet::InitTestEnv - Passed
* CRuntimeSet Start *
...
[ CrtSetThreadPriority ] - Executed in 3128 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 171099 clock cycles
[ CrtSetThreadPriority ] - Executed in 4284 clock cycles
IrtClflush & CrtClflush
...
* CRuntimeSet End *
Test Completed in 3516 ticks
> Test0001 End <
Tests: Completed
[ Run-Time testing - Extended Tracing - Yes ]
[ Watcom C++ compiler ]
Application - WccTestApp - WIN32_WCC ( 32-bit ) - Release
Tests: Start
> Test0001 Start <
**********************************************
Configuration - WIN32_WCC ( 32-bit ) - Release
CTestSet::InitTestEnv - Passed
* CRuntimeSet Start *
...
[ CrtSetThreadPriority ] - Executed in 3776 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 88 clock cycles
[ CrtClflush ] - Executed in 173828 clock cycles
[ CrtSetThreadPriority ] - Executed in 4908 clock cycles
IrtClflush & CrtClflush
...
* CRuntimeSet End *
Test Completed in 8000 ticks
> Test0001 End <
Tests: Completed
[ Flush Cache Win32 API functions on Windows Desktop and Embedded OSs ]
[ Win32 API function - FlushInstructionCache ]
Windows Desktop - [ winbase.h ]
...
BOOL WINAPI FlushInstructionCache(
__in HANDLE hProcess,
__in_bcount_opt( dwSize ) LPCVOID lpBaseAddress,
__in SIZE_T dwSize );
...
Windows CE - [ winbase.h ]
...
BOOL WINAPI FlushInstructionCache(
HANDLE hProcess,
LPCVOID lpBaseAddress,
DWORD dwSize );
...
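A hedged sketch of how 'FlushInstructionCache' is typically called for the current process, for example after code in memory has been written or modified. The wrapper name 'flushInstructionRange' is hypothetical. On non-Windows builds the sketch falls back to the GCC / Clang builtin '__builtin___clear_cache', which is an assumption made only to keep the example portable.

```cpp
#include <cassert>
#include <cstddef>
#ifdef _WIN32
#include <windows.h>     // FlushInstructionCache, GetCurrentProcess
#endif

// Flushes the instruction cache for [ pBase, pBase + uiSize ) in the current process.
// Returns true if the request was issued successfully.
inline bool flushInstructionRange( void *pBase, std::size_t uiSize )
{
#ifdef _WIN32
    return FlushInstructionCache( GetCurrentProcess(), pBase, uiSize ) != 0;
#elif defined( __GNUC__ )
    char *pBegin = static_cast< char * >( pBase );
    __builtin___clear_cache( pBegin, pBegin + uiSize );   // the builtin reports no status
    return true;
#else
    ( void )pBase;       // no known mechanism on this toolchain
    ( void )uiSize;
    return false;
#endif
}
```

Unlike CLFLUSH, this function deals with the instruction cache only and takes an address range instead of a single cache line.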
[ Flush Cache intrinsic on Windows Embedded OSs ]
Windows CE - [ cmnintrin.h ]
...
__CacheRelease( void *p );
...
When compiling with the Microsoft C++ compiler, warning C4732 is displayed if
the intrinsic '__CacheRelease' is not supported on the target architecture.
[ Flush Cache intrinsic on Itanium IA64 architecture ]
Itanium IA64 Architecture
...
__fc( __int64 *p );
...
