Software Archive
Read-only legacy content
17061 Discussions

Oversubscription of OpenMP threads for processing small data sets

SergeyKostrov
Valued Contributor II
3,530 Views
*** Oversubscription of OpenMP threads for processing small data sets ***
0 Kudos
87 Replies
SergeyKostrov
Valued Contributor II
264 Views
[ Test Case 78 - 64-bit / Cores: 4 / CPUs: 8 - Number of OpenMP threads - 4 ] [ Matrix Size: 2048 x 2048 ] Application - IccTestApp - WIN64_ICC ( 64-bit ) - Release Tests: Start > Test1099 Start < Matrix A, B and C Sizes : 2048 x 2048 Loop Processing Schema ( LPS ): IKJ Loop Blocking Divider : 1 Sub-Test 1.1 - MxMultA1 - Classic 2D LBOT size: N/A Completed: 0.22131 secs Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT LBOT size: 2048x2048 elements Completed: 0.21012 secs Sub-Test 1.3 - MxMultA3 - Classic 2D Fused LBOT size: N/A Completed: 0.76294 secs Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT LBOT size: 2048x2048 elements Completed: 0.76247 secs Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed LBOT size: N/A Completed: 0.25250 secs Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT LBOT size: 2048x2048 elements Completed: 0.25253 secs Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed LBOT size: N/A Completed: 0.43875 secs Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT LBOT size: 2048x2048 elements Completed: 0.43922 secs Sub-Test 5.1 - MxMultD1 - Classic 1D LBOT size: N/A Completed: 2.35756 secs Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT LBOT size: 2048x2048 elements Completed: 0.73563 secs > Test1099 End < Tests: Completed
0 Kudos
SergeyKostrov
Valued Contributor II
264 Views
[ Test Case 79 - 64-bit / Cores: 4 / CPUs: 8 - Number of OpenMP threads - 8 ] [ Matrix Size: 2048 x 2048 ] Application - IccTestApp - WIN64_ICC ( 64-bit ) - Release Tests: Start > Test1099 Start < Matrix A, B and C Sizes : 2048 x 2048 Loop Processing Schema ( LPS ): IKJ Loop Blocking Divider : 1 Sub-Test 1.1 - MxMultA1 - Classic 2D LBOT size: N/A Completed: 0.33150 secs Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT LBOT size: 2048x2048 elements Completed: 0.32612 secs Sub-Test 1.3 - MxMultA3 - Classic 2D Fused LBOT size: N/A Completed: 1.20853 secs Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT LBOT size: 2048x2048 elements Completed: 1.18172 secs Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed LBOT size: N/A Completed: 0.40413 secs Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT LBOT size: 2048x2048 elements Completed: 0.40366 secs Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed LBOT size: N/A Completed: 0.77316 secs Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT LBOT size: 2048x2048 elements Completed: 0.76394 secs Sub-Test 5.1 - MxMultD1 - Classic 1D LBOT size: N/A Completed: 3.34672 secs Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT LBOT size: 2048x2048 elements Completed: 0.82778 secs > Test1099 End < Tests: Completed
0 Kudos
SergeyKostrov
Valued Contributor II
264 Views
[ Test Case 80 - 64-bit / Cores: 4 / CPUs: 8 - Number of OpenMP threads - 16 ] [ Matrix Size: 2048 x 2048 ] Application - IccTestApp - WIN64_ICC ( 64-bit ) - Release Tests: Start > Test1099 Start < Matrix A, B and C Sizes : 2048 x 2048 Loop Processing Schema ( LPS ): IKJ Loop Blocking Divider : 1 Sub-Test 1.1 - MxMultA1 - Classic 2D LBOT size: N/A Completed: 0.27250 secs Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT LBOT size: 2048x2048 elements Completed: 0.27250 secs Sub-Test 1.3 - MxMultA3 - Classic 2D Fused LBOT size: N/A Completed: 1.00425 secs Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT LBOT size: 2048x2048 elements Completed: 0.99791 secs Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed LBOT size: N/A Completed: 0.32075 secs Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT LBOT size: 2048x2048 elements Completed: 0.33444 secs Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed LBOT size: N/A Completed: 0.59572 secs Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT LBOT size: 2048x2048 elements Completed: 0.57916 secs Sub-Test 5.1 - MxMultD1 - Classic 1D LBOT size: N/A Completed: 2.67250 secs Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT LBOT size: 2048x2048 elements Completed: 0.65469 secs > Test1099 End < Tests: Completed
0 Kudos
SergeyKostrov
Valued Contributor II
264 Views
If somebody is interested in performance analysis than a data mining needs to be done ( manually ).
0 Kudos
Reply