
Performance Evaluation of Matrix Identity algorithms

SergeyKostrov
Valued Contributor II
271 Views
*** Performance Evaluation of Matrix Identity algorithms ***

[ Computer System used for performance evaluations ]

** Dell Precision Mobile M4700 **

Intel Core i7-3840QM ( 2.80 GHz )
Ivy Bridge / 4 cores / 8 logical CPUs / ark.intel.com/products/70846
32GB RAM
320GB HDD
NVIDIA Quadro K1000M ( 192 CUDA cores / 2GB memory )
Windows 7 Professional 64-bit SP1
Size of L3 Cache = 8MB ( shared between all cores for data & instructions )
Size of L2 Cache = 1MB ( 256KB per core / shared for data & instructions )
Size of L1 Cache = 256KB ( 32KB per core for data & 32KB per core for instructions )
Display resolution: 1366 x 768
SergeyKostrov
Valued Contributor II
271 Views
Matrix Identity Algorithm ( 32-bit ): 1024 x 1024

[ Tests Set 1 ( 32-bit ) - Matrix Size: 1024 x 1024 ]

[ Microsoft C++ compiler ]
Matrix Size: 1024 x 1024
Processing...
Identity - Pass 01 - Completed: 5.81250 ticks
Identity - Pass 02 - Completed: 5.87500 ticks
Identity - Pass 03 - Completed: 5.87500 ticks
Identity - Pass 04 - Completed: 5.81250 ticks
Identity - Pass 05 - Completed: 5.87500 ticks
Identity - Passed

[ Borland C++ compiler ]
Matrix Size: 1024 x 1024
Processing...
Identity - Pass 01 - Completed: 4.87500 ticks
Identity - Pass 02 - Completed: 4.87500 ticks
Identity - Pass 03 - Completed: 5.87500 ticks
Identity - Pass 04 - Completed: 5.87500 ticks
Identity - Pass 05 - Completed: 5.87500 ticks
Identity - Passed

[ Intel C++ compiler ]
Matrix Size: 1024 x 1024
Processing...
Identity - Pass 01 - Completed: 1.93750 ticks
Identity - Pass 02 - Completed: 2.00000 ticks
Identity - Pass 03 - Completed: 1.93750 ticks
Identity - Pass 04 - Completed: 1.93750 ticks
Identity - Pass 05 - Completed: 2.93750 ticks
Identity - Passed

[ MinGW C++ compiler ]
Matrix Size: 1024 x 1024
Processing...
Identity - Pass 01 - Completed: 5.87500 ticks
Identity - Pass 02 - Completed: 5.87500 ticks
Identity - Pass 03 - Completed: 6.81250 ticks
Identity - Pass 04 - Completed: 5.87500 ticks
Identity - Pass 05 - Completed: 5.81250 ticks
Identity - Passed

[ Watcom C++ compiler ]
Matrix Size: 1024 x 1024
Processing...
Identity - Pass 01 - Completed: 5.81250 ticks
Identity - Pass 02 - Completed: 5.87500 ticks
Identity - Pass 03 - Completed: 4.87500 ticks
Identity - Pass 04 - Completed: 5.87500 ticks
Identity - Pass 05 - Completed: 5.87500 ticks
Identity - Passed

Note: 1 sec = 1000 ticks
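For readers who want to reproduce numbers of this kind, here is a minimal sketch of a per-pass timing loop that reports milliseconds, matching the "1 sec = 1000 ticks" convention used in the logs. It is not the harness used for the results in this thread: `make_identity` and `time_identity` are hypothetical names, `std::chrono::steady_clock` is an assumed timer, and the routine under test is a plain stand-in.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the routine under test:
// zero the whole n x n matrix, then write ones on the diagonal.
template <class T>
void make_identity(std::vector<T>& m, std::size_t n) {
    assert(m.size() == n * n);
    std::fill(m.begin(), m.end(), T(0));
    for (std::size_t i = 0; i < n; ++i) {
        m[i * n + i] = T(1);
    }
}

// Runs `passes` timed passes over an n x n matrix and returns the elapsed
// time of each pass in milliseconds ( 1 sec = 1000 ticks ).
template <class T>
std::vector<double> time_identity(std::size_t n, int passes) {
    std::vector<T> m(n * n);
    std::vector<double> ticks;
    for (int p = 0; p < passes; ++p) {
        const auto t0 = std::chrono::steady_clock::now();
        make_identity(m, n);
        const auto t1 = std::chrono::steady_clock::now();
        ticks.push_back(
            std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    // Touch the result so the timed work is observable.
    assert(n == 0 || m[0] == T(1));
    return ticks;
}
```

Calling `time_identity<float>( 1024, 5 )` returns five per-pass timings that can be printed in the same "Pass 01 ... Pass 05" layout as above.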
SergeyKostrov
Valued Contributor II
271 Views
Matrix Identity Algorithm - Code Analysis

[ Microsoft C++ compiler ]
...
template < class T > _RTINLINE RTvoid _MatrixIdentityProcessingCRv2A( T * _RTRESTRICT ptS, RTssize_t iRows, RTssize_t iCols, RTint iNumOfThreads )
{
...
0024355C  xor eax, eax
0024355E  rep stos dword ptr es:[edi]
...
}
...

[ Borland C++ compiler ]
...
template < class T > _RTINLINE RTvoid _MatrixIdentityProcessingCRv2A( T * _RTRESTRICT ptS, RTssize_t iRows, RTssize_t iCols, RTint iNumOfThreads )
{
...
00403069  xor ecx, ecx
0040306B  inc edx
0040306C  mov dword ptr [eax], ecx
0040306E  add eax, 4
00403071  cmp edi, edx
00403073  jg 00403069
...
}
...

[ Intel C++ compiler ]
...
template < class T > _RTINLINE RTvoid _MatrixIdentityProcessingCRv2A( T * _RTRESTRICT ptS, RTssize_t iRows, RTssize_t iCols, RTint iNumOfThreads )
{
...
00401096  pxor xmm0, xmm0
0040109A  movntps xmmword ptr [ebx+ecx*4], xmm0
0040109E  movntps xmmword ptr [ebx+ecx*4+10h], xmm0
004010A3  movntps xmmword ptr [ebx+ecx*4+20h], xmm0
004010A8  movntps xmmword ptr [ebx+ecx*4+30h], xmm0
004010AD  movntps xmmword ptr [ebx+ecx*4+40h], xmm0
004010B2  movntps xmmword ptr [ebx+ecx*4+50h], xmm0
004010B7  movntps xmmword ptr [ebx+ecx*4+60h], xmm0
004010BC  movntps xmmword ptr [ebx+ecx*4+70h], xmm0
004010C1  add ecx,20h
004010C4  cmp ecx,eax
004010C6  jb 0040109A
...
}
...

[ MinGW C++ compiler ]
...
template < class T > _RTINLINE RTvoid _MatrixIdentityProcessingCRv2A( T * _RTRESTRICT ptS, RTssize_t iRows, RTssize_t iCols, RTint iNumOfThreads )
{
...
0040A5AE  call _memset( 0042FD28h )
...
}
...

[ Watcom C++ compiler ]
...
template < class T > _RTINLINE RTvoid _MatrixIdentityProcessingCRv2A( T * _RTRESTRICT ptS, RTssize_t iRows, RTssize_t iCols, RTint iNumOfThreads )
{
...
0040C9C7  cmp eax, ebp
0040C9C9  jge 0040C9ED
0040C9CB  mov edx, eax
0040C9CD  mov dword ptr [ebx+edx*4], 0
0040C9D4  inc eax
0040C9D5  jmp 0040C9C7
...
}
...
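The body of `_MatrixIdentityProcessingCRv2A` is elided in the listings, so the following is only a guess at the shape of source that could lead to such code: a zero-fill stage written either as a plain loop (comparable to the scalar loops in the Borland and Watcom listings) or as a `memset` call (as in the MinGW listing), followed by a diagonal pass. The function names, the `float` element type and the two-stage split are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stage 1, variant A: element-by-element zero fill.
// A compiler may keep this as a scalar loop or turn it into a block store.
inline void zero_fill_loop(float* p, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        p[i] = 0.0f;
    }
}

// Stage 1, variant B: library zero fill.
// All-bits-zero is 0.0f for IEEE-754 floats, so memset gives the same result.
inline void zero_fill_memset(float* p, std::size_t count) {
    std::memset(p, 0, count * sizeof(float));
}

// Stage 2 ( assumed ): write ones on the main diagonal of an n x n
// row-major matrix.
inline void set_diagonal(float* p, std::size_t n) {
    assert(p != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        p[i * n + i] = 1.0f;
    }
}
```

Both stage 1 variants produce identical memory contents; which machine code they become is up to each compiler, which is what the listings above compare.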
SergeyKostrov
Valued Contributor II
271 Views
Matrix Identity Algorithm ( 32-bit ): 2048 x 2048

[ Tests Set 2 ( 32-bit ) - Matrix Size: 2048 x 2048 ]

[ Microsoft C++ compiler ]
Matrix Size: 2048 x 2048
Processing...
Identity - Pass 01 - Completed: 32.25000 ticks
Identity - Pass 02 - Completed: 32.18750 ticks
Identity - Pass 03 - Completed: 32.25000 ticks
Identity - Pass 04 - Completed: 32.18750 ticks
Identity - Pass 05 - Completed: 33.25000 ticks
Identity - Passed

[ Borland C++ compiler ]
Matrix Size: 2048 x 2048
Processing...
Identity - Pass 01 - Completed: 31.25000 ticks
Identity - Pass 02 - Completed: 32.12500 ticks
Identity - Pass 03 - Completed: 33.18750 ticks
Identity - Pass 04 - Completed: 32.25000 ticks
Identity - Pass 05 - Completed: 32.25000 ticks
Identity - Passed

[ Intel C++ compiler ]
Matrix Size: 2048 x 2048
Processing...
Identity - Pass 01 - Completed: 12.68750 ticks
Identity - Pass 02 - Completed: 11.75000 ticks
Identity - Pass 03 - Completed: 12.68750 ticks
Identity - Pass 04 - Completed: 12.68750 ticks
Identity - Pass 05 - Completed: 12.68750 ticks
Identity - Passed

[ MinGW C++ compiler ]
Matrix Size: 2048 x 2048
Processing...
Identity - Pass 01 - Completed: 32.25000 ticks
Identity - Pass 02 - Completed: 33.18750 ticks
Identity - Pass 03 - Completed: 32.25000 ticks
Identity - Pass 04 - Completed: 32.18750 ticks
Identity - Pass 05 - Completed: 33.25000 ticks
Identity - Passed

[ Watcom C++ compiler ]
Matrix Size: 2048 x 2048
Processing...
Identity - Pass 01 - Completed: 31.25000 ticks
Identity - Pass 02 - Completed: 31.25000 ticks
Identity - Pass 03 - Completed: 30.31250 ticks
Identity - Pass 04 - Completed: 31.25000 ticks
Identity - Pass 05 - Completed: 31.25000 ticks
Identity - Passed

Note: 1 sec = 1000 ticks
SergeyKostrov
Valued Contributor II
271 Views
Matrix Identity Algorithm ( 32-bit ): 4096 x 4096

[ Tests Set 3 ( 32-bit ) - Matrix Size: 4096 x 4096 ]

[ Microsoft C++ compiler ]
Matrix Size: 4096 x 4096
Processing...
Identity - Pass 01 - Completed: 128.93750 ticks
Identity - Pass 02 - Completed: 127.93750 ticks
Identity - Pass 03 - Completed: 127.93750 ticks
Identity - Pass 04 - Completed: 126.93750 ticks
Identity - Pass 05 - Completed: 128.93750 ticks
Identity - Passed

[ Borland C++ compiler ]
Matrix Size: 4096 x 4096
Processing...
Identity - Pass 01 - Completed: 128.93750 ticks
Identity - Pass 02 - Completed: 126.93750 ticks
Identity - Pass 03 - Completed: 125.00000 ticks
Identity - Pass 04 - Completed: 125.00000 ticks
Identity - Pass 05 - Completed: 124.00000 ticks
Identity - Passed

[ Intel C++ compiler ]
Matrix Size: 4096 x 4096
Processing...
Identity - Pass 01 - Completed: 48.81250 ticks
Identity - Pass 02 - Completed: 47.87500 ticks
Identity - Pass 03 - Completed: 48.81250 ticks
Identity - Pass 04 - Completed: 48.81250 ticks
Identity - Pass 05 - Completed: 48.87500 ticks
Identity - Passed

[ MinGW C++ compiler ]
Matrix Size: 4096 x 4096
Processing...
Identity - Pass 01 - Completed: 127.93750 ticks
Identity - Pass 02 - Completed: 126.93750 ticks
Identity - Pass 03 - Completed: 127.93750 ticks
Identity - Pass 04 - Completed: 127.93750 ticks
Identity - Pass 05 - Completed: 126.93750 ticks
Identity - Passed

[ Watcom C++ compiler ]
Matrix Size: 4096 x 4096
Processing...
Identity - Pass 01 - Completed: 124.00000 ticks
Identity - Pass 02 - Completed: 124.00000 ticks
Identity - Pass 03 - Completed: 124.06250 ticks
Identity - Pass 04 - Completed: 124.00000 ticks
Identity - Pass 05 - Completed: 125.00000 ticks
Identity - Passed

Note: 1 sec = 1000 ticks
SergeyKostrov
Valued Contributor II
271 Views
Matrix Identity Algorithm ( 32-bit ): 8192 x 8192

[ Tests Set 4 ( 32-bit ) - Matrix Size: 8192 x 8192 ]

[ Microsoft C++ compiler ]
Matrix Size: 8192 x 8192
Processing...
Identity - Pass 01 - Completed: 362.81250 ticks
Identity - Pass 02 - Completed: 363.87500 ticks
Identity - Pass 03 - Completed: 362.81250 ticks
Identity - Pass 04 - Completed: 362.87500 ticks
Identity - Pass 05 - Completed: 362.81250 ticks
Identity - Passed

[ Borland C++ compiler ]
Matrix Size: 8192 x 8192
Processing...
Identity - Pass 01 - Completed: 361.37500 ticks
Identity - Pass 02 - Completed: 361.31250 ticks
Identity - Pass 03 - Completed: 361.31250 ticks
Identity - Pass 04 - Completed: 361.31250 ticks
Identity - Pass 05 - Completed: 360.37500 ticks
Identity - Passed

[ Intel C++ compiler ]
Matrix Size: 8192 x 8192
Processing...
Identity - Pass 01 - Completed: 134.75000 ticks
Identity - Pass 02 - Completed: 133.81250 ticks
Identity - Pass 03 - Completed: 134.75000 ticks
Identity - Pass 04 - Completed: 133.81250 ticks
Identity - Pass 05 - Completed: 133.75000 ticks
Identity - Passed

[ MinGW C++ compiler ]
Matrix Size: 8192 x 8192
Processing...
Identity - Pass 01 - Completed: 356.50000 ticks
Identity - Pass 02 - Completed: 357.37500 ticks
Identity - Pass 03 - Completed: 357.43750 ticks
Identity - Pass 04 - Completed: 356.43750 ticks
Identity - Pass 05 - Completed: 356.43750 ticks
Identity - Passed

[ Watcom C++ compiler ]
Matrix Size: 8192 x 8192
Processing...
Identity - Pass 01 - Completed: 345.68750 ticks
Identity - Pass 02 - Completed: 346.68750 ticks
Identity - Pass 03 - Completed: 346.68750 ticks
Identity - Pass 04 - Completed: 345.68750 ticks
Identity - Pass 05 - Completed: 346.68750 ticks
Identity - Passed

Note: 1 sec = 1000 ticks
SergeyKostrov
Valued Contributor II
271 Views
Matrix Identity Algorithm ( 64-bit ): 16384 x 16384

[ Tests Set 5 ( 64-bit ) - Matrix Size: 16384 x 16384 ]

[ Microsoft C++ compiler ]
Matrix Size: 16384 x 16384
Processing...
Identity - Pass 01 - Completed: 62.00000 ticks
Identity - Pass 02 - Completed: 62.00000 ticks
Identity - Pass 03 - Completed: 47.00000 ticks
Identity - Pass 04 - Completed: 63.00000 ticks
Identity - Pass 05 - Completed: 46.00000 ticks
Identity - Passed

[ Intel C++ compiler ]
Matrix Size: 16384 x 16384
Processing...
Identity - Pass 01 - Completed: 63.00000 ticks
Identity - Pass 02 - Completed: 47.00000 ticks
Identity - Pass 03 - Completed: 62.00000 ticks
Identity - Pass 04 - Completed: 47.00000 ticks
Identity - Pass 05 - Completed: 62.00000 ticks
Identity - Passed

[ MinGW C++ compiler ]
Matrix Size: 16384 x 16384
Processing...
Identity - Pass 01 - Completed: 62.00000 ticks
Identity - Pass 02 - Completed: 47.00000 ticks
Identity - Pass 03 - Completed: 63.00000 ticks
Identity - Pass 04 - Completed: 46.00000 ticks
Identity - Pass 05 - Completed: 63.00000 ticks
Identity - Passed

Note: 1 sec = 1000 ticks
SergeyKostrov
Valued Contributor II
271 Views
Matrix Identity Algorithm ( 64-bit ): 32768 x 32768

[ Tests Set 6 ( 64-bit ) - Matrix Size: 32768 x 32768 ]

[ Microsoft C++ compiler ]
Matrix Size: 32768 x 32768
Processing...
Identity - Pass 01 - Completed: 218.00000 ticks
Identity - Pass 02 - Completed: 218.00000 ticks
Identity - Pass 03 - Completed: 219.00000 ticks
Identity - Pass 04 - Completed: 218.00000 ticks
Identity - Pass 05 - Completed: 219.00000 ticks
Identity - Passed

[ Intel C++ compiler ]
Matrix Size: 32768 x 32768
Processing...
Identity - Pass 01 - Completed: 219.00000 ticks
Identity - Pass 02 - Completed: 218.00000 ticks
Identity - Pass 03 - Completed: 219.00000 ticks
Identity - Pass 04 - Completed: 218.00000 ticks
Identity - Pass 05 - Completed: 218.00000 ticks
Identity - Passed

[ MinGW C++ compiler ]
Matrix Size: 32768 x 32768
Processing...
Identity - Pass 01 - Completed: 218.00000 ticks
Identity - Pass 02 - Completed: 219.00000 ticks
Identity - Pass 03 - Completed: 218.00000 ticks
Identity - Pass 04 - Completed: 219.00000 ticks
Identity - Pass 05 - Completed: 218.00000 ticks
Identity - Passed

Note: 1 sec = 1000 ticks
SergeyKostrov
Valued Contributor II
271 Views
Matrix Identity Algorithm ( 64-bit ): 65536 x 65536

[ Tests Set 7 ( 64-bit ) - Matrix Size: 65536 x 65536 ]

[ Microsoft C++ compiler ]
Matrix Size: 65536 x 65536
Processing...
Identity - Pass 01 - Completed: 873.00000 ticks
Identity - Pass 02 - Completed: 890.00000 ticks
Identity - Pass 03 - Completed: 873.00000 ticks
Identity - Pass 04 - Completed: 874.00000 ticks
Identity - Pass 05 - Completed: 874.00000 ticks
Identity - Passed

[ Intel C++ compiler ]
Matrix Size: 65536 x 65536
Processing...
Identity - Pass 01 - Completed: 874.00000 ticks
Identity - Pass 02 - Completed: 874.00000 ticks
Identity - Pass 03 - Completed: 873.00000 ticks
Identity - Pass 04 - Completed: 874.00000 ticks
Identity - Pass 05 - Completed: 873.00000 ticks
Identity - Passed

[ MinGW C++ compiler ]
Matrix Size: 65536 x 65536
Processing...
Identity - Pass 01 - Completed: 874.00000 ticks
Identity - Pass 02 - Completed: 873.00000 ticks
Identity - Pass 03 - Completed: 874.00000 ticks
Identity - Pass 04 - Completed: 873.00000 ticks
Identity - Pass 05 - Completed: 874.00000 ticks
Identity - Passed

Note: 1 sec = 1000 ticks
SergeyKostrov
Valued Contributor II
271 Views
Matrix Identity Algorithm ( 64-bit ): 81920 x 81920

[ Tests Set 8 ( 64-bit ) - Matrix Size: 81920 x 81920 ]

[ Microsoft C++ compiler ]
Not Tested

[ Intel C++ compiler ]
Not Tested

[ MinGW C++ compiler ]
Not Tested

Note: 1 sec = 1000 ticks
SergeyKostrov
Valued Contributor II
271 Views
Matrix Identity Algorithm ( 64-bit ): 131072 x 131072

[ Tests Set 9 ( 64-bit ) - Matrix Size: 131072 x 131072 ]

[ Microsoft C++ compiler ]
Not Tested

[ Intel C++ compiler ]
Not Tested

[ MinGW C++ compiler ]
Not Tested

Note: 1 sec = 1000 ticks
Bernard
Valued Contributor I
271 Views

Hi Sergey,

 

Nice set of tests.

Why do you test the Watcom compiler if it lacks support for the SIMD SSE architecture extensions?

 

SergeyKostrov
Valued Contributor II
271 Views
>>...Why do you test Watcom Compiler if it lacks support of SIMD SSE architecture extensions?

There is nothing wrong with that: I always test all major C++ compilers ( there are 6 of them ) supported on the project I've been working on.

Next, take a look at the

...
Matrix Identity Algorithm ( 32-bit ): 4096 x 4096
[ Tests Set 3 ( 32-bit ) - Matrix Size: 4096 x 4096 ]
...

test cases and you will see that the Watcom and Borland C++ compilers did a good job compared to the Microsoft and MinGW C++ compilers, but the Intel C++ compiler outperformed all of them by more than a factor of two.

Another thing is that these tests clearly demonstrated that a very good quality of binary codes generation is Not enough to be competitive in modern times and this is the case with Watcom and Borland C++ compilers.

PS: The Turbo C++ compiler ( 16-bit ) is Not used because it plays a different role, as an Overall Source Codes Verifier.
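The "more than twice" claim can be checked directly against the pass timings of Tests Set 3. The sketch below averages the five passes per compiler and divides the means; the helper names are hypothetical, and the numbers are copied from the 4096 x 4096 results posted earlier in this thread.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Mean of the per-pass timings, in ticks ( 1 sec = 1000 ticks ).
inline double mean_ticks(const std::vector<double>& passes) {
    assert(!passes.empty());
    return std::accumulate(passes.begin(), passes.end(), 0.0) /
           static_cast<double>(passes.size());
}

// How many times faster `candidate` is than `baseline`, by mean pass time.
inline double speedup(const std::vector<double>& baseline,
                      const std::vector<double>& candidate) {
    return mean_ticks(baseline) / mean_ticks(candidate);
}

// Pass timings from Tests Set 3 ( 32-bit ), Matrix Size: 4096 x 4096.
const std::vector<double> kMicrosoft4096 = {128.93750, 127.93750, 127.93750,
                                            126.93750, 128.93750};
const std::vector<double> kWatcom4096 = {124.00000, 124.00000, 124.06250,
                                         124.00000, 125.00000};
const std::vector<double> kIntel4096 = {48.81250, 47.87500, 48.81250,
                                        48.81250, 48.87500};
```

With these inputs `speedup( kMicrosoft4096, kIntel4096 )` comes out at roughly 2.6 and `speedup( kWatcom4096, kIntel4096 )` at roughly 2.55, which is consistent with the statement above.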
SergeyKostrov
Valued Contributor II
271 Views
>>Another thing is that these tests clearly demonstrated that a very good quality of binary codes generation is Not
>>enough to be competitive in modern times and this is the case with Watcom and Borland C++ compilers...

Here are a couple of notes:

- The Borland C++ compiler is No longer supported and will never support the latest Intel ISAs ( Instruction Set Architectures );

- The Watcom C++ compiler could support them, but unfortunately I don't see any progress in that direction, and it is Not clear to me what the Open Watcom C++ compiler team is currently doing. I recently integrated version 2.0; it is Not much different from version 1.9, and SSE is still Not supported.

Take a look at the 3rd post of the thread, Matrix Identity Algorithm - Code Analysis, and you will see why the Intel C++ compiler outperformed all the other C++ compilers on the 32-bit tests: it uses Non-Temporal moves ( movntps ) for the 1st stage of the algorithm and Unrolls the processing to a 1-to-32 Loop Unrolling Schema ( LPS ), that is, 32 elements ( eight 16-byte stores ) per iteration.

You can also see that on the 64-bit tests all C++ compilers, that is, Intel, Microsoft and MinGW, showed identical performance.
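The same pattern can be written by hand with SSE intrinsics, which is one way to get non-temporal stores out of a compiler that does not generate them on its own ( provided it supports SSE intrinsics at all ). This is a sketch modeled on the Intel listing, not the project's code: `zero_fill_stream` is a hypothetical name, the element type is assumed to be `float`, and the buffer is assumed to be 16-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <xmmintrin.h>  // SSE: _mm_setzero_ps, _mm_stream_ps, _mm_sfence

// Zero `count` floats using non-temporal 16-byte stores ( movntps ),
// unrolled to 32 floats ( eight stores ) per iteration.
// Assumption: `p` is 16-byte aligned, as _mm_stream_ps requires.
inline void zero_fill_stream(float* p, std::size_t count) {
    assert(reinterpret_cast<std::uintptr_t>(p) % 16 == 0);
    const __m128 zero = _mm_setzero_ps();
    std::size_t i = 0;
    for (; i + 32 <= count; i += 32) {
        _mm_stream_ps(p + i + 0, zero);
        _mm_stream_ps(p + i + 4, zero);
        _mm_stream_ps(p + i + 8, zero);
        _mm_stream_ps(p + i + 12, zero);
        _mm_stream_ps(p + i + 16, zero);
        _mm_stream_ps(p + i + 20, zero);
        _mm_stream_ps(p + i + 24, zero);
        _mm_stream_ps(p + i + 28, zero);
    }
    // Scalar tail for counts that are not a multiple of 32.
    for (; i < count; ++i) {
        p[i] = 0.0f;
    }
    // Make the streamed stores globally visible before returning.
    _mm_sfence();
}
```

Non-temporal stores bypass the caches, which tends to help when the matrix is much larger than the last-level cache and is not read back right away; for small matrices regular stores may be the better choice.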