
Performance Evaluation of MinGW v6.1.0 C++ compiler ( OpenMP Scalability )

SergeyKostrov
Valued Contributor II
1,278 Views
*** Performance Evaluation of MinGW v6.1.0 C++ compiler ( OpenMP Scalability ) ***
[ Computer System used for performance evaluations ]

** Dell Precision Mobile M4700 **

Intel Core i7-3840QM ( 2.80 GHz )
Ivy Bridge / 4 cores / 8 logical CPUs / ark.intel.com/products/70846
32GB RAM
320GB HDD
NVIDIA Quadro K1000M ( 192 CUDA cores / 2GB memory )
Windows 7 Professional 64-bit
Size of L3 Cache = 8MB ( shared between all cores for data & instructions )
Size of L2 Cache = 1MB ( 256KB per core / shared for data & instructions )
Size of L1 Cache = 256KB ( 32KB per core for data & 32KB per core for instructions )
Display resolution: 1366 x 768
[ MinGW v6.1.0 C++ compiler command line options ]

-DNDEBUG -O3 -mavx -mprfchw -mhard-float -ffast-math
-fpeel-loops -ftree-vectorizer-verbose=0 -ftree-vectorize -fvect-cost-model
-fomit-frame-pointer -fwhole-program -fopenmp -fopenmp-simd
-falign-functions -falign-jumps -falign-labels -falign-loops
-freorder-blocks -freorder-functions
--param l1-cache-line-size=64 --param l1-cache-size=262144 --param l2-cache-size=1048576
-w -Xlinker --stack=1073741824
[ Matrix Dimensions 1024 x 1024 ]
[ Number of OpenMP threads: 1 ]

Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
Tests: Start
> Test1099 Start <
Matrix A, B and C Sizes : 1024 x 1024
Loop Processing Schema ( LPS ): IKJ
Loop Blocking Divider : 1
Sub-Test 1.1 - MxMultA1 - Classic 2D
  LBOT size: N/A
  Completed: 0.25000 secs
Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
  LBOT size: 1024x1024 elements
  Completed: 0.26500 secs
Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
  LBOT size: N/A
  Completed: 0.53000 secs
Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
  LBOT size: 1024x1024 elements
  Completed: 0.53100 secs
Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
  LBOT size: N/A
  Completed: 9.39100 secs
Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
  LBOT size: 1024x1024 elements
  Completed: 9.37600 secs
Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
  LBOT size: N/A
  Completed: 9.65600 secs
Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
  LBOT size: 1024x1024 elements
  Completed: 9.64100 secs
Sub-Test 5.1 - MxMultD1 - Classic 1D
  LBOT size: N/A
  Completed: 0.29600 secs
Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
  LBOT size: 1024x1024 elements
  Completed: 0.34300 secs
> Test1099 End <
Tests: Completed
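The logs report "Loop Processing Schema ( LPS ): IKJ" but the thread does not show the source of any MxMult routine. As a rough illustration of what an IKJ loop order with an OpenMP-parallel outer loop can look like, here is a minimal sketch. The function name `multiply_ikj`, the `float` element type, and the flat row-major storage are all assumptions made for this example, not details taken from MgwTestApp.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not the author's MxMultA1): computes C = A * B for
// square n x n matrices stored row-major in flat vectors.
void multiply_ikj(const std::vector<float>& a,
                  const std::vector<float>& b,
                  std::vector<float>& c,
                  std::size_t n)
{
    c.assign(n * n, 0.0f);

    // Each row of C is written by exactly one iteration of i, so the outer
    // loop can be divided among threads. Without -fopenmp the pragma is
    // ignored and the loop runs serially.
    #pragma omp parallel for
    for (long long i = 0; i < static_cast<long long>(n); ++i) {
        const std::size_t row = static_cast<std::size_t>(i) * n;
        for (std::size_t k = 0; k < n; ++k) {
            const float aik = a[row + k];          // reused across the whole j loop
            for (std::size_t j = 0; j < n; ++j) {
                c[row + j] += aik * b[k * n + j];  // unit-stride access to B and C
            }
        }
    }
}
```

In this ordering the innermost loop walks both B and C with unit stride, which is the usual reason for preferring IKJ over IJK with row-major data. Calling `multiply_ikj(a, b, c, 1024)` on three 1024 x 1024 buffers would correspond in size to the first test case above.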
[ Matrix Dimensions 1024 x 1024 ]
[ Number of OpenMP threads: 2 ]

Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
Tests: Start
> Test1099 Start <
Matrix A, B and C Sizes : 1024 x 1024
Loop Processing Schema ( LPS ): IKJ
Loop Blocking Divider : 1
Sub-Test 1.1 - MxMultA1 - Classic 2D
  LBOT size: N/A
  Completed: 0.12500 secs
Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
  LBOT size: 1024x1024 elements
  Completed: 0.14100 secs
Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
  LBOT size: N/A
  Completed: 0.28000 secs
Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
  LBOT size: 1024x1024 elements
  Completed: 0.28100 secs
Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
  LBOT size: N/A
  Completed: 4.69600 secs
Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
  LBOT size: 1024x1024 elements
  Completed: 4.69600 secs
Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
  LBOT size: N/A
  Completed: 4.88300 secs
Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
  LBOT size: 1024x1024 elements
  Completed: 4.86700 secs
Sub-Test 5.1 - MxMultD1 - Classic 1D
  LBOT size: N/A
  Completed: 0.14100 secs
Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
  LBOT size: 1024x1024 elements
  Completed: 0.15600 secs
> Test1099 End <
Tests: Completed
[ Matrix Dimensions 1024 x 1024 ]
[ Number of OpenMP threads: 4 ]

Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
Tests: Start
> Test1099 Start <
Matrix A, B and C Sizes : 1024 x 1024
Loop Processing Schema ( LPS ): IKJ
Loop Blocking Divider : 1
Sub-Test 1.1 - MxMultA1 - Classic 2D
  LBOT size: N/A
  Completed: 0.07800 secs
Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
  LBOT size: 1024x1024 elements
  Completed: 0.06200 secs
Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
  LBOT size: N/A
  Completed: 0.15600 secs
Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
  LBOT size: 1024x1024 elements
  Completed: 0.14100 secs
Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
  LBOT size: N/A
  Completed: 2.35600 secs
Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
  LBOT size: 1024x1024 elements
  Completed: 2.88600 secs
Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
  LBOT size: N/A
  Completed: 2.57400 secs
Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
  LBOT size: 1024x1024 elements
  Completed: 2.90200 secs
Sub-Test 5.1 - MxMultD1 - Classic 1D
  LBOT size: N/A
  Completed: 0.10900 secs
Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
  LBOT size: 1024x1024 elements
  Completed: 0.14000 secs
> Test1099 End <
Tests: Completed
[ Matrix Dimensions 2048 x 2048 ]
[ Number of OpenMP threads: 1 ]

Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
Tests: Start
> Test1099 Start <
Matrix A, B and C Sizes : 2048 x 2048
Loop Processing Schema ( LPS ): IKJ
Loop Blocking Divider : 1
Sub-Test 1.1 - MxMultA1 - Classic 2D
  LBOT size: N/A
  Completed: 2.68400 secs
Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
  LBOT size: 2048x2048 elements
  Completed: 2.66700 secs
Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
  LBOT size: N/A
  Completed: 5.10100 secs
Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
  LBOT size: 2048x2048 elements
  Completed: 5.10100 secs
Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
  LBOT size: N/A
  Completed: 88.56100 secs
Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
  LBOT size: 2048x2048 elements
  Completed: 88.56200 secs
Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
  LBOT size: N/A
  Completed: 102.00900 secs
Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
  LBOT size: 2048x2048 elements
  Completed: 102.07101 secs
Sub-Test 5.1 - MxMultD1 - Classic 1D
  LBOT size: N/A
  Completed: 3.10400 secs
Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
  LBOT size: 2048x2048 elements
  Completed: 3.05800 secs
> Test1099 End <
Tests: Completed
[ Matrix Dimensions 2048 x 2048 ]
[ Number of OpenMP threads: 2 ]

Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
Tests: Start
> Test1099 Start <
Matrix A, B and C Sizes : 2048 x 2048
Loop Processing Schema ( LPS ): IKJ
Loop Blocking Divider : 1
Sub-Test 1.1 - MxMultA1 - Classic 2D
  LBOT size: N/A
  Completed: 1.68500 secs
Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
  LBOT size: 2048x2048 elements
  Completed: 1.73100 secs
Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
  LBOT size: N/A
  Completed: 3.57300 secs
Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
  LBOT size: 2048x2048 elements
  Completed: 3.57200 secs
Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
  LBOT size: N/A
  Completed: 46.89300 secs
Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
  LBOT size: 2048x2048 elements
  Completed: 46.91000 secs
Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
  LBOT size: N/A
  Completed: 50.48100 secs
Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
  LBOT size: 2048x2048 elements
  Completed: 50.42000 secs
Sub-Test 5.1 - MxMultD1 - Classic 1D
  LBOT size: N/A
  Completed: 1.62200 secs
Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
  LBOT size: 2048x2048 elements
  Completed: 1.60700 secs
> Test1099 End <
Tests: Completed
[ Matrix Dimensions 2048 x 2048 ]
[ Number of OpenMP threads: 4 ]

Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
Tests: Start
> Test1099 Start <
Matrix A, B and C Sizes : 2048 x 2048
Loop Processing Schema ( LPS ): IKJ
Loop Blocking Divider : 1
Sub-Test 1.1 - MxMultA1 - Classic 2D
  LBOT size: N/A
  Completed: 0.78000 secs
Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
  LBOT size: 2048x2048 elements
  Completed: 0.76400 secs
Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
  LBOT size: N/A
  Completed: 1.54400 secs
Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
  LBOT size: 2048x2048 elements
  Completed: 1.54400 secs
Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
  LBOT size: N/A
  Completed: 29.17200 secs
Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
  LBOT size: 2048x2048 elements
  Completed: 28.93800 secs
Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
  LBOT size: N/A
  Completed: 32.80700 secs
Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
  LBOT size: 2048x2048 elements
  Completed: 32.88500 secs
Sub-Test 5.1 - MxMultD1 - Classic 1D
  LBOT size: N/A
  Completed: 1.03000 secs
Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
  LBOT size: 2048x2048 elements
  Completed: 0.92000 secs
> Test1099 End <
Tests: Completed
[ Matrix Dimensions 4096 x 4096 ]
[ Number of OpenMP threads: 1 ]

Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
Tests: Start
> Test1099 Start <
Matrix A, B and C Sizes : 4096 x 4096
Loop Processing Schema ( LPS ): IKJ
Loop Blocking Divider : 1
Sub-Test 1.1 - MxMultA1 - Classic 2D
  LBOT size: N/A
  Completed: 21.91800 secs
Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
  LBOT size: 4096x4096 elements
  Completed: 21.88700 secs
Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
  LBOT size: N/A
  Completed: 44.11700 secs
Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
  LBOT size: 4096x4096 elements
  Completed: 44.02400 secs
Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
  LBOT size: N/A
  Completed: 844.13702 secs
Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
  LBOT size: 4096x4096 elements
  Completed: 844.07501 secs
Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
  LBOT size: N/A
  Completed: 1027.29810 secs
Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
  LBOT size: 4096x4096 elements
  Completed: 1027.62500 secs
Sub-Test 5.1 - MxMultD1 - Classic 1D
  LBOT size: N/A
  Completed: 25.77100 secs
Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
  LBOT size: 4096x4096 elements
  Completed: 24.86600 secs
> Test1099 End <
Tests: Completed
[ Matrix Dimensions 4096 x 4096 ]
[ Number of OpenMP threads: 2 ]

Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
Tests: Start
> Test1099 Start <
Matrix A, B and C Sizes : 4096 x 4096
Loop Processing Schema ( LPS ): IKJ
Loop Blocking Divider : 1
Sub-Test 1.1 - MxMultA1 - Classic 2D
  LBOT size: N/A
  Completed: 14.04000 secs
Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
  LBOT size: 4096x4096 elements
  Completed: 14.78900 secs
Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
  LBOT size: N/A
  Completed: 28.37700 secs
Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
  LBOT size: 4096x4096 elements
  Completed: 27.56500 secs
Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
  LBOT size: N/A
  Completed: 430.07901 secs
Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
  LBOT size: 4096x4096 elements
  Completed: 431.88901 secs
Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
  LBOT size: N/A
  Completed: 502.87003 secs
Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
  LBOT size: 4096x4096 elements
  Completed: 502.76001 secs
Sub-Test 5.1 - MxMultD1 - Classic 1D
  LBOT size: N/A
  Completed: 14.27400 secs
Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
  LBOT size: 4096x4096 elements
  Completed: 14.22700 secs
> Test1099 End <
Tests: Completed
[ Matrix Dimensions 4096 x 4096 ]
[ Number of OpenMP threads: 4 ]

Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
Tests: Start
> Test1099 Start <
Matrix A, B and C Sizes : 4096 x 4096
Loop Processing Schema ( LPS ): IKJ
Loop Blocking Divider : 1
Sub-Test 1.1 - MxMultA1 - Classic 2D
  LBOT size: N/A
  Completed: 6.24000 secs
Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
  LBOT size: 4096x4096 elements
  Completed: 6.22400 secs
Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
  LBOT size: N/A
  Completed: 12.69800 secs
Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
  LBOT size: 4096x4096 elements
  Completed: 12.82400 secs
Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
  LBOT size: N/A
  Completed: 312.65802 secs
Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
  LBOT size: 4096x4096 elements
  Completed: 315.32501 secs
Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
  LBOT size: N/A
  Completed: 336.08902 secs
Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
  LBOT size: 4096x4096 elements
  Completed: 333.76401 secs
Sub-Test 5.1 - MxMultD1 - Classic 1D
  LBOT size: N/A
  Completed: 9.70300 secs
Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
  LBOT size: 4096x4096 elements
  Completed: 9.36000 secs
> Test1099 End <
Tests: Completed
[ Matrix Dimensions 8192 x 8192 ]
[ Number of OpenMP threads: 1 ]

Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
Tests: Start
> Test1099 Start <
Matrix A, B and C Sizes : 8192 x 8192
Loop Processing Schema ( LPS ): IKJ
Loop Blocking Divider : 1
Sub-Test 1.1 - MxMultA1 - Classic 2D
  LBOT size: N/A
  Completed: 208.41701 secs
Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
  LBOT size: 8192x8192 elements
  Completed: 208.35501 secs
Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
  LBOT size: N/A
  Completed: 352.32800 secs
Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
  LBOT size: 8192x8192 elements
  Completed: 352.21902 secs
Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
  LBOT size: N/A
  Completed: 8496.01855 secs
Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
  LBOT size: 8192x8192 elements
  Completed: 8495.95508 secs
Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
  LBOT size: N/A
  Completed: 8963.55273 secs
Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
  LBOT size: 8192x8192 elements
  Completed: 8964.23828 secs
Sub-Test 5.1 - MxMultD1 - Classic 1D
  LBOT size: N/A
  Completed: 242.39500 secs
Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
  LBOT size: 8192x8192 elements
  Completed: 230.89801 secs
> Test1099 End <
Tests: Completed
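With single-thread timings now available for four matrix sizes, one way to read them is to estimate the exponent p in t ~ N^p between two sizes. A classic triple-loop multiplication performs on the order of N^3 operations, so p = 3 would be the arithmetic-only expectation. The helper below is a small sketch for that calculation. `scaling_exponent` is a name invented here and is not part of the test application.

```cpp
#include <cmath>

// Hypothetical helper: estimates p in t ~ N^p from two (size, time) pairs.
double scaling_exponent(double n_small, double t_small,
                        double n_large, double t_large)
{
    return std::log(t_large / t_small) / std::log(n_large / n_small);
}
```

For Sub-Test 1.1 with one thread, `scaling_exponent(4096, 21.91800, 8192, 208.41701)` gives roughly 3.25, that is, the time grew by about 9.5x rather than 8x when N doubled. The thread does not state why, so any explanation (for example, cache or memory-bandwidth effects) remains a guess.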
[ Matrix Dimensions 8192 x 8192 ]
[ Number of OpenMP threads: 2 ]

Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
Tests: Start
> Test1099 Start <
Matrix A, B and C Sizes : 8192 x 8192
Loop Processing Schema ( LPS ): IKJ
Loop Blocking Divider : 1
Sub-Test 1.1 - MxMultA1 - Classic 2D
  LBOT size: N/A
  Completed: 114.75500 secs
Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
  LBOT size: 8192x8192 elements
  Completed: 114.13000 secs
Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
  LBOT size: N/A
  Completed: 238.68102 secs
Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
  LBOT size: 8192x8192 elements
  Completed: 238.10400 secs
Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
  LBOT size: N/A
  Completed: 4389.88428 secs
Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
  LBOT size: 8192x8192 elements
  Completed: 4391.36621 secs
Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
  LBOT size: N/A
  Completed: 4776.03320 secs
Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
  LBOT size: 8192x8192 elements
  Completed: 4763.30322 secs
Sub-Test 5.1 - MxMultD1 - Classic 1D
  LBOT size: N/A
  Completed: 124.28600 secs
Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
  LBOT size: 8192x8192 elements
  Completed: 119.43401 secs
> Test1099 End <
Tests: Completed
[ Matrix Dimensions 8192 x 8192 ]
[ Number of OpenMP threads: 4 ]

Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
Tests: Start
> Test1099 Start <
Matrix A, B and C Sizes : 8192 x 8192
Loop Processing Schema ( LPS ): IKJ
Loop Blocking Divider : 1
Sub-Test 1.1 - MxMultA1 - Classic 2D
  LBOT size: N/A
  Completed: 55.88000 secs
Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
  LBOT size: 8192x8192 elements
  Completed: 56.81500 secs
Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
  LBOT size: N/A
  Completed: 105.02000 secs
Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
  LBOT size: 8192x8192 elements
  Completed: 107.12601 secs
Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
  LBOT size: N/A
  Completed: 2753.40210 secs
Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
  LBOT size: 8192x8192 elements
  Completed: 2728.22314 secs
Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
  LBOT size: N/A
  Completed: 2736.28809 secs
Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
  LBOT size: 8192x8192 elements
  Completed: 2733.94922 secs
Sub-Test 5.1 - MxMultD1 - Classic 1D
  LBOT size: N/A
  Completed: 65.91000 secs
Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
  LBOT size: 8192x8192 elements
  Completed: 64.41200 secs
> Test1099 End <
Tests: Completed