<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic [ MinGW C++ compiler v6.1.0 in Software Archive</title>
    <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088158#M64296</link>
    <description>&lt;STRONG&gt;[ MinGW C++ compiler v6.1.0 64-bit ]&lt;/STRONG&gt;

		...
		Data Set Size       : 1048576 elements ( 1024 x 1024 )
		Number of Tests     : 5
		Number of Threads   : 4
		...

		&lt;STRONG&gt;ALGORITHM_MULTIPLYCLASSIC - Transposed&lt;/STRONG&gt;

		_MatrixMulProcessingCTUnRv1A - Pass 01 - Completed:     0.29600 secs
		_MatrixMulProcessingCTUnRv1A - Pass 02 - Completed:     0.28100 secs
		_MatrixMulProcessingCTUnRv1A - Pass 03 - Completed:     0.28100 secs
		_MatrixMulProcessingCTUnRv1A - Pass 04 - Completed:     0.26500 secs
		_MatrixMulProcessingCTUnRv1A - Pass 05 - Completed:     0.28100 secs

		_MatrixMulProcessingCTv1B    - Pass 01 - Completed:     0.07800 secs
		_MatrixMulProcessingCTv1B    - Pass 02 - Completed:     0.09400 secs
		_MatrixMulProcessingCTv1B    - Pass 03 - Completed:     0.07800 secs
		_MatrixMulProcessingCTv1B    - Pass 04 - Completed:     0.09300 secs
		_MatrixMulProcessingCTv1B    - Pass 05 - Completed:     0.07800 secs

		_MatrixMulProcessingCTv1C    - Pass 01 - Completed:     0.06300 secs
		_MatrixMulProcessingCTv1C    - Pass 02 - Completed:     0.07800 secs
		_MatrixMulProcessingCTv1C    - Pass 03 - Completed:     0.06200 secs
		_MatrixMulProcessingCTv1C    - Pass 04 - Completed:     0.07800 secs
		_MatrixMulProcessingCTv1C    - Pass 05 - Completed:     0.06200 secs

		_MatrixMulProcessingCTv1D    - Pass 01 - Completed:     0.06300 secs
		_MatrixMulProcessingCTv1D    - Pass 02 - Completed:     0.06200 secs
		_MatrixMulProcessingCTv1D    - Pass 03 - Completed:     0.09400 secs
		_MatrixMulProcessingCTv1D    - Pass 04 - Completed:     0.06200 secs
		_MatrixMulProcessingCTv1D    - Pass 05 - Completed:     0.07800 secs

		_MatrixMulProcessingCTv1E    - Pass 01 - Completed:     0.06300 secs
		_MatrixMulProcessingCTv1E    - Pass 02 - Completed:     0.07800 secs
		_MatrixMulProcessingCTv1E    - Pass 03 - Completed:     0.10900 secs
		_MatrixMulProcessingCTv1E    - Pass 04 - Completed:     0.09400 secs
		_MatrixMulProcessingCTv1E    - Pass 05 - Completed:     0.09400 secs

		_MatrixMulProcessingCTUnRv2A - Pass 01 - Completed:     0.29600 secs
		_MatrixMulProcessingCTUnRv2A - Pass 02 - Completed:     0.26500 secs
		_MatrixMulProcessingCTUnRv2A - Pass 03 - Completed:     0.26500 secs
		_MatrixMulProcessingCTUnRv2A - Pass 04 - Completed:     0.28100 secs
		_MatrixMulProcessingCTUnRv2A - Pass 05 - Completed:     0.26500 secs

		_MatrixMulProcessingCTv2B    - Pass 01 - Completed:     0.06300 secs
		_MatrixMulProcessingCTv2B    - Pass 02 - Completed:     0.06200 secs
		_MatrixMulProcessingCTv2B    - Pass 03 - Completed:     0.06300 secs
		_MatrixMulProcessingCTv2B    - Pass 04 - Completed:     0.10900 secs
		_MatrixMulProcessingCTv2B    - Pass 05 - Completed:     0.09300 secs

		_MatrixMulProcessingCTv2C    - Pass 01 - Completed:     0.09400 secs
		_MatrixMulProcessingCTv2C    - Pass 02 - Completed:     0.07800 secs
		_MatrixMulProcessingCTv2C    - Pass 03 - Completed:     0.09400 secs
		_MatrixMulProcessingCTv2C    - Pass 04 - Completed:     0.10900 secs
		_MatrixMulProcessingCTv2C    - Pass 05 - Completed:     0.09300 secs</description>
    <pubDate>Fri, 16 Sep 2016 15:16:50 GMT</pubDate>
    <dc:creator>SergeyKostrov</dc:creator>
    <dc:date>2016-09-16T15:16:50Z</dc:date>
    <item>
      <title>Performance Evaluation of Classic Matrix Multiplication algorithms</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088022#M64160</link>
      <description>*** Performance Evaluation of Classic Matrix Multiplication algorithms ***

&lt;STRONG&gt;[ Abstract ]&lt;/STRONG&gt;

This is one of the most detailed analyses of the performance of the Classic Matrix Multiplication algorithm on different software and hardware platforms.</description>
      <pubDate>Thu, 04 Aug 2016 16:25:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088022#M64160</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:25:52Z</dc:date>
    </item>
    <item>
      <title>This is one of the most</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088023#M64161</link>
      <description>This is one of the most detailed analysis of performance of Classic Matrix Multiplication algorithm. The list of different versions of the algorithm is as follows:

			Classic 2D
			Classic 2D LBOT
			Classic 2D Fused
			Classic 2D Fused LBOT
			Classic 2D Transposed
			Classic 2D Transposed LBOT
			Classic 2D Fused Transposed
			Classic 2D Fused Transposed LBOT
			Classic 2D SSE2 Transposed v1
			Classic 2D SSE2 Transposed v1 LBOT
			Classic 2D SSE2 Transposed v2
			Classic 2D SSE2 Transposed v2 LBOT
			Classic 1D
			Classic 1D LBOT
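
To make two of the names in the list above concrete, here is a minimal sketch of what "Transposed" and "LBOT" ( loop blocking ) usually mean for classic matrix multiplication. It is an illustration only: the function names and the tiny test matrices are hypothetical, the thread does not show the author's C++ sources, and the Fused variants are left out because the thread does not define them. Python is used for brevity, so it shows the access pattern but not the cache effects measured in the tests.

```python
def transpose(m, n):
    # Build the transpose so that former columns of m become rows.
    return [[m[r][c] for r in range(n)] for c in range(n)]


def matmul_transposed(a, b, n):
    # "Transposed": multiply against bt so both inner-loop operands are read row by row.
    bt = transpose(b, n)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(n):
                total += a[i][k] * bt[j][k]
            c[i][j] = total
    return c


def matmul_blocked(a, b, n, block):
    # Loop blocking: work on block x block tiles so that a tile can stay in cache.
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, n, block):
            for k0 in range(0, n, block):
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, n)):
                        total = c[i][j]
                        for k in range(k0, min(k0 + block, n)):
                            total += a[i][k] * b[k][j]
                        c[i][j] = total
    return c


# Hypothetical 4 x 4 inputs with small whole-number values, so all sums are exact.
n = 4
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i * j + 1) for j in range(n)] for i in range(n)]
print(matmul_transposed(a, b, n) == matmul_blocked(a, b, n, 2))  # prints True
```

Both functions compute the same product; only the order in which memory is touched differs, which is what the timings in this thread compare.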

Two sub-versions of each version of the algorithm are evaluated with:

			- Loop Processing Schema IJK
			- Loop Processing Schema IKJ ( aka Loop Interchange technique )
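
The two loop processing schemas can be sketched as follows. This is a hedged illustration, not the author's code: the function names and the small test matrices are hypothetical, and Python is used only to show the loop order. In a row-major C or C++ array, the IJK inner loop steps through a column of B, while the IKJ inner loop steps along rows of B and C.

```python
def matmul_ijk(a, b, n):
    # IJK order: the innermost loop over k reads b[k][j], i.e. walks down a column of b.
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(n):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def matmul_ikj(a, b, n):
    # IKJ order ( loop interchange ): the innermost loop over j walks along rows of b and c.
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            a_ik = a[i][k]
            for j in range(n):
                c[i][j] += a_ik * b[k][j]
    return c


# Hypothetical 4 x 4 inputs with small whole-number values, so all sums are exact.
n = 4
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i * j + 1) for j in range(n)] for i in range(n)]
print(matmul_ijk(a, b, n) == matmul_ikj(a, b, n))  # prints True
```

The arithmetic is identical in both schemas; only the memory access order changes.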

Performance evaluations are done:

		(1) On four computer systems:

			Dell Precision Mobile M4700
			Dell Dimension 4400
			Dell Latitude CPi D300XT
			Acer Aspire One ( netbook )

		(2) On four Operating Systems:

			Windows 95 Pan European 32-bit
			Windows 2000 Professional 32-bit SP4
			Windows XP Professional 32-bit SP3
			Windows 7 Professional 64-bit SP1

		(3) With four IDEs:

			Visual Studio 98 Professional Edition
			Visual Studio 2005 Professional Edition
			Visual Studio 2008 Professional Edition
			Visual Studio 2008 Express Edition

		(4) With twenty-two C++ compilers:

			Borland C++ compiler v5.5.1 32-bit
			MinGW C++ compiler v3.4.2 32-bit
			MinGW C++ compiler v4.8.1 32-bit
			MinGW C++ compiler v4.9.2 32-bit
			MinGW C++ compiler v4.9.2 64-bit
			MinGW C++ compiler v5.1.0 32-bit
			MinGW C++ compiler v5.1.0 64-bit
			MinGW C++ compiler v6.1.0 32-bit
			MinGW C++ compiler v6.1.0 64-bit
			Microsoft C++ compiler ( VS98 PE   ) 32-bit
			Microsoft C++ compiler ( VS2005 PE ) 32-bit
			Microsoft C++ compiler ( VS2008 PE ) 32-bit
			Microsoft C++ compiler ( VS2008 PE ) 64-bit
			Microsoft C++ compiler ( VS2008 EE ) 32-bit
			Intel C++ compiler v7.1.0 ( u029 ) 32-bit
			Intel C++ compiler v8.1.0 ( u038 ) 32-bit
			Intel C++ compiler v12.1.7 ( u371 ) 32-bit
			Intel C++ compiler v13.1.0 ( u149 ) 32-bit
			Intel C++ compiler v13.1.0 ( u149 ) 64-bit
			Watcom C++ compiler v1.9.0 32-bit
			Watcom C++ compiler v2.0.0 32-bit
			Watcom C++ compiler v2.0.0 64-bit</description>
      <pubDate>Thu, 04 Aug 2016 16:29:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088023#M64161</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:29:04Z</dc:date>
    </item>
    <item>
      <title>[ Watcom C++ compiler v2.0.0</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088024#M64162</link>
      <description>&lt;STRONG&gt;[ Watcom C++ compiler v2.0.0 64-bit ]&lt;/STRONG&gt;

Even though the compiler and linker have been ported to 64-bit platforms, the generated binaries are still 32-bit!</description>
      <pubDate>Thu, 04 Aug 2016 16:31:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088024#M64162</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:31:07Z</dc:date>
    </item>
    <item>
      <title>[ List of Abbreviations ]</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088025#M64163</link>
      <description>&lt;STRONG&gt;[ List of Abbreviations ]&lt;/STRONG&gt;

		MM   - Matrix Multiplication
		C    - Classic
		LPS  - Loop Processing Schema
		1D   - One Dimensional Input Matrices
		2D   - Two Dimensional Input Matrices
		LB   - Loop Blocking			( OT )
		LBOT - Loop Blocking Optimization Technique
		F    - Fused				( OT )
		T    - Transposed			( OT )
		SSE2 - Streaming SIMD Extensions v2
		OT   - Optimization Technique
		PE   - Professional Edition			( of Visual Studio )
		EE   - Express Edition			( of Visual Studio )
		P2   - Intel Pentium II
		P4   - Intel Pentium 4
		IB   - Intel Ivy Bridge
		AN   - Intel Atom N270</description>
      <pubDate>Thu, 04 Aug 2016 16:32:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088025#M64163</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:32:28Z</dc:date>
    </item>
    <item>
      <title>[ Computer Systems used for</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088026#M64164</link>
      <description>&lt;STRONG&gt;[ Computer Systems used for performance evaluations ]&lt;/STRONG&gt;

&lt;STRONG&gt;** Dell Precision Mobile M4700 **&lt;/STRONG&gt;

			Intel Core i7-3840QM ( 2.80 GHz )
			Ivy Bridge / 4 cores / 8 logical CPUs / ark.intel.com/products/70846
			32GB RAM
			320GB HDD
			NVIDIA Quadro K1000M ( 192 CUDA cores / 2GB memory )
			Windows 7 Professional 64-bit SP1
			Size of L3 Cache =   8MB ( shared between all cores for data &amp;amp; instructions )
			Size of L2 Cache =   1MB ( 256KB per core / shared for data &amp;amp; instructions )
			Size of L1 Cache = 256KB ( 32KB per core for data &amp;amp; 32KB per core for instructions )
			Display resolution: 1366 x 768

&lt;STRONG&gt;** Dell Dimension 4400 **&lt;/STRONG&gt;

			Intel Pentium 4 ( 1.60 GHz / 1 core )
			1GB RAM
			Seagate 20GB HDD						( *  )
			Seagate  3TB HDD						( ** )
			EVGA GeForce 6200 512MB DDR2 AGP 8x Video Card
			Windows XP Professional 32-bit SP3
			Size of L2 Cache = 256KB
			Size of L1 Cache =   8KB
			Display resolution: 1440 x 990

			( *  )	Seagate Barracuda 20GB IDE Hard Disk Drive
			ST320011A
			3.5" 7200 Rpm  2MB Cache IDE Ultra ATA100 / ATA-iV/6
			Average Rotational Latency	: 4.17 ms
			Average Seek Times Read		: 9.0ms
			Average Seek Times Write	: 10.0ms
			Maximum Internal Transfer Rate	: 69.4MB/sec
			Average External Transfer Rate	: 100MB/sec ( Read and Write )
			Maximum External Transfer Rate	: 150MB/sec ( Read           )
			Note: Barracuda ATA IV Family

			( ** )	Seagate Barracuda  3TB SATA Hard Disk Drive
			ST3000DM001
			3.5" 7200 Rpm 64MB Cache SATA III ( 6Gb/sec )
			Average Rotational Latency	: 4.16 ms
			Average Seek Times Read		: 8.5ms
			Average Seek Times Write	: 9.5ms
			Maximum Internal Transfer Rate	: 268MB/sec
			Average External Transfer Rate	: 156MB/sec ( Read and Write )
			Maximum External Transfer Rate	: 210MB/sec ( Read           )

&lt;STRONG&gt;** Dell Latitude CPi D300XT **&lt;/STRONG&gt;

			Intel Pentium II ( 300 MHz / 1 core )
			128MB RAM ( 2x64MB / MT8LDT864HG-6X 144-pin EDO SODIMM 60ns )
			6GB HDD
			Windows 2000 Professional 32-bit SP4
			Size of L2 Cache = 512KB
			Size of L1 Cache =  16KB
			Display resolution: 1024 x 768

&lt;STRONG&gt;** Acer Aspire One **&lt;/STRONG&gt;

			Intel Atom N270 ( 1.60 GHz / 1 core / 2 logical CPUs )
			1.5GB RAM
			CF to ZIF 1.8" HDD SSD IDE Adapter
			2GB Compact Flash ( CF ) Card
			Windows 95 Pan European 32-bit
			Size of L2 Cache = 512KB
			Size of L1 Cache =  24KB
			Display resolution: 800 x 600
			// Memory Settings in System.ini
			...
			[386Enh]
			;
			; MaxPhysPage value	; Amount of physical RAM Windows 95 can access
			;
			MaxPhysPage=32768	; 823336 KB	= 804 MB ( Currently Used )
			...</description>
      <pubDate>Thu, 04 Aug 2016 16:34:23 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088026#M64164</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:34:23Z</dc:date>
    </item>
    <item>
      <title>[ OSs used for performance</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088027#M64165</link>
      <description>&lt;STRONG&gt;[ OSs used for performance evaluations ]&lt;/STRONG&gt;

		Windows 95 Pan European 32-bit
		Windows 2000 Professional 32-bit SP4
		Windows XP Professional 32-bit SP3
		Windows 7 Professional 64-bit SP1</description>
      <pubDate>Thu, 04 Aug 2016 16:35:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088027#M64165</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:35:27Z</dc:date>
    </item>
    <item>
      <title>[ IDEs used for performance</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088028#M64166</link>
      <description>&lt;STRONG&gt;[ IDEs used for performance evaluations ]&lt;/STRONG&gt;

		Visual Studio 98 Professional Edition
		Visual Studio 2005 Professional Edition
		Visual Studio 2008 Professional Edition
		Visual Studio 2008 Express Edition</description>
      <pubDate>Thu, 04 Aug 2016 16:36:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088028#M64166</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:36:38Z</dc:date>
    </item>
    <item>
      <title>[ C++ compilers used for</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088029#M64167</link>
      <description>&lt;STRONG&gt;[ C++ compilers used for performance evaluations ]&lt;/STRONG&gt;

		Borland C++ compiler v5.5.1 32-bit

		MinGW C++ compiler v3.4.2 32-bit
		MinGW C++ compiler v4.8.1 32-bit
		MinGW C++ compiler v4.9.2 32-bit
		MinGW C++ compiler v4.9.2 64-bit
		MinGW C++ compiler v5.1.0 32-bit
		MinGW C++ compiler v5.1.0 64-bit
		MinGW C++ compiler v6.1.0 32-bit
		MinGW C++ compiler v6.1.0 64-bit

		Microsoft C++ compiler ( VS98 PE   ) 32-bit
		Microsoft C++ compiler ( VS2005 PE ) 32-bit
		Microsoft C++ compiler ( VS2008 PE ) 32-bit
		Microsoft C++ compiler ( VS2008 PE ) 64-bit
		Microsoft C++ compiler ( VS2008 EE ) 32-bit

		Intel C++ compiler v7.1.0 ( u029 ) 32-bit
		Intel C++ compiler v8.1.0 ( u038 ) 32-bit
		Intel C++ compiler v12.1.7 ( u371 ) 32-bit
		Intel C++ compiler v13.1.0 ( u149 ) 32-bit
		Intel C++ compiler v13.1.0 ( u149 ) 64-bit

		Watcom C++ compiler v1.9.0 32-bit
		Watcom C++ compiler v2.0.0 32-bit
		Watcom C++ compiler v2.0.0 64-bit</description>
      <pubDate>Thu, 04 Aug 2016 16:37:31 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088029#M64167</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:37:31Z</dc:date>
    </item>
    <item>
      <title>[ Base Performance</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088030#M64168</link>
      <description>&lt;STRONG&gt;[ Base Performance Evaluations with MKL SGEMM function - CPU AN 32-bit Windows 95 ]&lt;/STRONG&gt;

Not completed because an MKL library installation for this platform is no longer available.</description>
      <pubDate>Thu, 04 Aug 2016 16:38:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088030#M64168</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:38:52Z</dc:date>
    </item>
    <item>
      <title>[ Base Performance</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088031#M64169</link>
      <description>&lt;STRONG&gt;[ Base Performance Evaluations with MKL SGEMM function - CPU P2 32-bit Windows 2000 ]&lt;/STRONG&gt;

Not completed because an MKL library installation for this platform is no longer available.</description>
      <pubDate>Thu, 04 Aug 2016 16:39:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088031#M64169</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:39:29Z</dc:date>
    </item>
    <item>
      <title>[ Base Performance</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088032#M64170</link>
      <description>&lt;STRONG&gt;[ Base Performance Evaluations with MKL SGEMM function - CPU P4 32-bit Windows XP ]&lt;/STRONG&gt;

		Application - ScaLibTestApp - WIN32_MSC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1153 Start &amp;lt;
		Sub-Test 1.1 - Runtime Binding of MKL functions
		Dynamic Library mkl_rt.dll Loaded
		Initialization Done
		Sub-Test 3.2 - MKL Matrix Multiplication
		Matrix Multiplication C[ 1024x1024 ] = A[ 1024x1024 ] * B[ 1024x1024 ]
		Allocating Memory for Matrices ( 16-byte alignment )
		Intializing Matrix Data - Started
		Intializing Matrix Data - Completed
		Cblas xGEMM
		Matrix Size           :  1024 x  1024
		Matrix Size Threshold : N/A
		Matrix Partitions     : N/A
		Degree of Recursion   : N/A
		Result Sets Reflection: N/A
		Calculating...
		Cblas SGEMM  - Pass 01 - Completed:     0.53100 secs
		Cblas SGEMM  - Pass 02 - Completed:     0.51600 secs
		Cblas SGEMM  - Pass 03 - Completed:     0.51600 secs
		Cblas SGEMM  - Pass 04 - Completed:     0.51600 secs
		Cblas SGEMM  - Pass 05 - Completed:     0.51500 secs
		Cblas SGEMM - Passed
		Deallocating Memory
		Dynamic Library mkl_rt.dll Unloaded
		&amp;gt; Test1153 End &amp;lt;
		Tests: Completed

		Application - BccTestApp - WIN32_BCC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1153 Start &amp;lt;
		Sub-Test 1.1 - Runtime Binding of MKL functions
		Dynamic Library mkl_rt.dll Loaded
		Initialization Done
		Sub-Test 3.2 - MKL Matrix Multiplication
		Matrix Multiplication C[ 1024x1024 ] = A[ 1024x1024 ] * B[ 1024x1024 ]
		Allocating Memory for Matrices ( 16-byte alignment )
		Intializing Matrix Data - Started
		Intializing Matrix Data - Completed
		Cblas xGEMM
		Matrix Size           :  1024 x  1024
		Matrix Size Threshold : N/A
		Matrix Partitions     : N/A
		Degree of Recursion   : N/A
		Result Sets Reflection: N/A
		Calculating...
		Cblas SGEMM  - Pass 01 - Completed:     0.51500 secs
		Cblas SGEMM  - Pass 02 - Completed:     0.51500 secs
		Cblas SGEMM  - Pass 03 - Completed:     0.51600 secs
		Cblas SGEMM  - Pass 04 - Completed:     0.51600 secs
		Cblas SGEMM  - Pass 05 - Completed:     0.51600 secs
		Cblas SGEMM - Passed
		Deallocating Memory
		Dynamic Library mkl_rt.dll Unloaded
		&amp;gt; Test1153 End &amp;lt;
		Tests: Completed

		Application - IccTestApp - WIN32_ICC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1153 Start &amp;lt;
		Sub-Test 1.1 - Runtime Binding of MKL functions
		Dynamic Library mkl_rt.dll Loaded
		Initialization Done
		Sub-Test 3.2 - MKL Matrix Multiplication
		Matrix Multiplication C[ 1024x1024 ] = A[ 1024x1024 ] * B[ 1024x1024 ]
		Allocating Memory for Matrices ( 16-byte alignment )
		Intializing Matrix Data - Started
		Intializing Matrix Data - Completed
		Cblas xGEMM
		Matrix Size           :  1024 x  1024
		Matrix Size Threshold : N/A
		Matrix Partitions     : N/A
		Degree of Recursion   : N/A
		Result Sets Reflection: N/A
		Calculating...
		Cblas SGEMM  - Pass 01 - Completed:     0.53200 secs
		Cblas SGEMM  - Pass 02 - Completed:     0.51500 secs
		Cblas SGEMM  - Pass 03 - Completed:     0.51600 secs
		Cblas SGEMM  - Pass 04 - Completed:     0.51600 secs
		Cblas SGEMM  - Pass 05 - Completed:     0.51500 secs
		Cblas SGEMM - Passed
		Deallocating Memory
		Dynamic Library mkl_rt.dll Unloaded
		&amp;gt; Test1153 End &amp;lt;
		Tests: Completed

		Application - MgwTestApp - WIN32_MGW ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1153 Start &amp;lt;
		Sub-Test 1.1 - Runtime Binding of MKL functions
		Dynamic Library mkl_rt.dll Loaded
		Initialization Done
		Sub-Test 3.2 - MKL Matrix Multiplication
		Matrix Multiplication C[ 1024x1024 ] = A[ 1024x1024 ] * B[ 1024x1024 ]
		Allocating Memory for Matrices ( 16-byte alignment )
		Intializing Matrix Data - Started
		Intializing Matrix Data - Completed
		Cblas xGEMM
		Matrix Size           :  1024 x  1024
		Matrix Size Threshold : N/A
		Matrix Partitions     : N/A
		Degree of Recursion   : N/A
		Result Sets Reflection: N/A
		Calculating...
		Cblas SGEMM  - Pass 01 - Completed:     0.54700 secs
		Cblas SGEMM  - Pass 02 - Completed:     0.51500 secs
		Cblas SGEMM  - Pass 03 - Completed:     0.51600 secs
		Cblas SGEMM  - Pass 04 - Completed:     0.51500 secs
		Cblas SGEMM  - Pass 05 - Completed:     0.51500 secs
		Cblas SGEMM - Passed
		Deallocating Memory
		Dynamic Library mkl_rt.dll Unloaded
		&amp;gt; Test1153 End &amp;lt;
		Tests: Completed

		Application - WccTestApp - WIN32_WCC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1153 Start &amp;lt;
		Sub-Test 1.1 - Runtime Binding of MKL functions
		Dynamic Library mkl_rt.dll Loaded
		Initialization Done
		Sub-Test 3.2 - MKL Matrix Multiplication
		Matrix Multiplication C[ 1024x1024 ] = A[ 1024x1024 ] * B[ 1024x1024 ]
		Allocating Memory for Matrices ( 16-byte alignment )
		Intializing Matrix Data - Started
		Intializing Matrix Data - Completed
		Cblas xGEMM
		Matrix Size           :  1024 x  1024
		Matrix Size Threshold : N/A
		Matrix Partitions     : N/A
		Degree of Recursion   : N/A
		Result Sets Reflection: N/A
		Calculating...
		Cblas SGEMM  - Pass 01 - Completed:     0.54900 secs
		Cblas SGEMM  - Pass 02 - Completed:     0.51600 secs
		Cblas SGEMM  - Pass 03 - Completed:     0.51500 secs
		Cblas SGEMM  - Pass 04 - Completed:     0.51500 secs
		Cblas SGEMM  - Pass 05 - Completed:     0.51600 secs
		Cblas SGEMM - Passed
		Deallocating Memory
		Dynamic Library mkl_rt.dll Unloaded
		&amp;gt; Test1153 End &amp;lt;
		Tests: Completed</description>
      <pubDate>Thu, 04 Aug 2016 16:40:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088032#M64170</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:40:37Z</dc:date>
    </item>
    <item>
      <title>[ Base Performance</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088033#M64171</link>
      <description>&lt;STRONG&gt;[ Base Performance Evaluations with MKL SGEMM function - CPU IB 64-bit Windows 7 ]&lt;/STRONG&gt;

		Application - ScaLibTestApp - WIN64_MSC ( 64-bit ) - Release
		Tests: Start
		&amp;gt; Test1153 Start &amp;lt;
		Sub-Test 1.1 - Runtime Binding of MKL functions
		Dynamic Library mkl_rt.dll Loaded
		Initialization Done
		Sub-Test 3.2 - MKL Matrix Multiplication
		Matrix Multiplication C[ 1024x1024 ] = A[ 1024x1024 ] * B[ 1024x1024 ]
		Allocating Memory for Matrices ( 16-byte alignment )
		Intializing Matrix Data - Started
		Intializing Matrix Data - Completed
		Cblas xGEMM
		Matrix Size           :  1024 x  1024
		Matrix Size Threshold : N/A
		Matrix Partitions     : N/A
		Degree of Recursion   : N/A
		Result Sets Reflection: N/A
		Calculating...
		Cblas SGEMM  - Pass 01 - Completed:     0.06100 secs
		Cblas SGEMM  - Pass 02 - Completed:     0.06600 secs
		Cblas SGEMM  - Pass 03 - Completed:     0.06600 secs
		Cblas SGEMM  - Pass 04 - Completed:     0.06600 secs
		Cblas SGEMM  - Pass 05 - Completed:     0.06500 secs
		Cblas SGEMM - Passed
		Deallocating Memory
		Dynamic Library mkl_rt.dll Unloaded
		&amp;gt; Test1153 End &amp;lt;
		Tests: Completed

		Application - BccTestApp - WIN32_BCC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1153 Start &amp;lt;
		Sub-Test 1.1 - Runtime Binding of MKL functions
		Dynamic Library mkl_rt.dll Loaded
		Initialization Done
		Sub-Test 3.2 - MKL Matrix Multiplication
		Matrix Multiplication C[ 1024x1024 ] = A[ 1024x1024 ] * B[ 1024x1024 ]
		Allocating Memory for Matrices ( 16-byte alignment )
		Intializing Matrix Data - Started
		Intializing Matrix Data - Completed
		Cblas xGEMM
		Matrix Size           :  1024 x  1024
		Matrix Size Threshold : N/A
		Matrix Partitions     : N/A
		Degree of Recursion   : N/A
		Result Sets Reflection: N/A
		Calculating...
		Cblas SGEMM  - Pass 01 - Completed:     0.06500 secs
		Cblas SGEMM  - Pass 02 - Completed:     0.06500 secs
		Cblas SGEMM  - Pass 03 - Completed:     0.06600 secs
		Cblas SGEMM  - Pass 04 - Completed:     0.06600 secs
		Cblas SGEMM  - Pass 05 - Completed:     0.06600 secs
		Cblas SGEMM - Passed
		Deallocating Memory
		Dynamic Library mkl_rt.dll Unloaded
		&amp;gt; Test1153 End &amp;lt;
		Tests: Completed

		Application - IccTestApp - WIN64_ICC ( 64-bit ) - Release
		Tests: Start
		&amp;gt; Test1153 Start &amp;lt;
		Sub-Test 1.1 - Runtime Binding of MKL functions
		Dynamic Library mkl_rt.dll Loaded
		Initialization Done
		Sub-Test 3.2 - MKL Matrix Multiplication
		Matrix Multiplication C[ 1024x1024 ] = A[ 1024x1024 ] * B[ 1024x1024 ]
		Allocating Memory for Matrices ( 16-byte alignment )
		Intializing Matrix Data - Started
		Intializing Matrix Data - Completed
		Cblas xGEMM
		Matrix Size           :  1024 x  1024
		Matrix Size Threshold : N/A
		Matrix Partitions     : N/A
		Degree of Recursion   : N/A
		Result Sets Reflection: N/A
		Calculating...
		Cblas SGEMM  - Pass 01 - Completed:     0.06200 secs
		Cblas SGEMM  - Pass 02 - Completed:     0.06500 secs
		Cblas SGEMM  - Pass 03 - Completed:     0.06600 secs
		Cblas SGEMM  - Pass 04 - Completed:     0.06600 secs
		Cblas SGEMM  - Pass 05 - Completed:     0.06500 secs
		Cblas SGEMM - Passed
		Deallocating Memory
		Dynamic Library mkl_rt.dll Unloaded
		&amp;gt; Test1153 End &amp;lt;
		Tests: Completed

		Application - MgwTestApp - WIN64_MGW ( 64-bit ) - Release
		Tests: Start
		&amp;gt; Test1153 Start &amp;lt;
		Sub-Test 1.1 - Runtime Binding of MKL functions
		Dynamic Library mkl_rt.dll Loaded
		Initialization Done
		Sub-Test 3.2 - MKL Matrix Multiplication
		Matrix Multiplication C[ 1024x1024 ] = A[ 1024x1024 ] * B[ 1024x1024 ]
		Allocating Memory for Matrices ( 16-byte alignment )
		Intializing Matrix Data - Started
		Intializing Matrix Data - Completed
		Cblas xGEMM
		Matrix Size           :  1024 x  1024
		Matrix Size Threshold : N/A
		Matrix Partitions     : N/A
		Degree of Recursion   : N/A
		Result Sets Reflection: N/A
		Calculating...
		Cblas SGEMM  - Pass 01 - Completed:     0.06700 secs
		Cblas SGEMM  - Pass 02 - Completed:     0.06500 secs
		Cblas SGEMM  - Pass 03 - Completed:     0.06600 secs
		Cblas SGEMM  - Pass 04 - Completed:     0.06500 secs
		Cblas SGEMM  - Pass 05 - Completed:     0.06500 secs
		Cblas SGEMM - Passed
		Deallocating Memory
		Dynamic Library mkl_rt.dll Unloaded
		&amp;gt; Test1153 End &amp;lt;
		Tests: Completed

		Application - WccTestApp - WIN32_WCC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1153 Start &amp;lt;
		Sub-Test 1.1 - Runtime Binding of MKL functions
		Dynamic Library mkl_rt.dll Loaded
		Initialization Done
		Sub-Test 3.2 - MKL Matrix Multiplication
		Matrix Multiplication C[ 1024x1024 ] = A[ 1024x1024 ] * B[ 1024x1024 ]
		Allocating Memory for Matrices ( 16-byte alignment )
		Intializing Matrix Data - Started
		Intializing Matrix Data - Completed
		Cblas xGEMM
		Matrix Size           :  1024 x  1024
		Matrix Size Threshold : N/A
		Matrix Partitions     : N/A
		Degree of Recursion   : N/A
		Result Sets Reflection: N/A
		Calculating...
		Cblas SGEMM  - Pass 01 - Completed:     0.06900 secs
		Cblas SGEMM  - Pass 02 - Completed:     0.06600 secs
		Cblas SGEMM  - Pass 03 - Completed:     0.06500 secs
		Cblas SGEMM  - Pass 04 - Completed:     0.06500 secs
		Cblas SGEMM  - Pass 05 - Completed:     0.06600 secs
		Cblas SGEMM - Passed
		Deallocating Memory
		Dynamic Library mkl_rt.dll Unloaded
		&amp;gt; Test1153 End &amp;lt;
		Tests: Completed</description>
      <pubDate>Thu, 04 Aug 2016 16:41:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088033#M64171</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:41:35Z</dc:date>
    </item>
    <item>
      <title>[ Microsoft C++ compiler (</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088034#M64172</link>
      <description>&lt;STRONG&gt;[ Microsoft C++ compiler ( VS98 PE ) - Release - 32-bit ( LPS: IJK ) - CPU AN 32-bit Windows 95 ]&lt;/STRONG&gt;

		Application - ScaLibTestApp - WIN32_MSC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1099 Start &amp;lt;
		Matrix A, B and C Sizes       :  1024 x  1024
		Loop Processing Schema ( LPS ): IJK
		Loop Blocking Divider         : 1
		Sub-Test 1.1 - MxMultA1 - Classic 2D
			LBOT size: N/A
			Completed:   140.56801 secs
		Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
			LBOT size: 1024x1024 elements
			Completed:   136.45601 secs
		Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
			LBOT size: N/A
			Completed:   145.31301 secs
		Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
			LBOT size: 1024x1024 elements
			Completed:   142.82801 secs
		Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
			LBOT size: N/A
			Completed:     5.08100 secs
		Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:     5.31400 secs
		Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
			LBOT size: N/A
			Completed:     5.61700 secs
		Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:     5.94600 secs
		Sub-Test 5.1 - MxMultD1 - Classic 1D
			LBOT size: N/A
			Completed:   136.55101 secs
		Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
			LBOT size: 1024x1024 elements
			Completed:   136.57901 secs
		&amp;gt; Test1099 End &amp;lt;
		Tests: Completed</description>
      <pubDate>Thu, 04 Aug 2016 16:42:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088034#M64172</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:42:39Z</dc:date>
    </item>
    <item>
      <title>[ Microsoft C++ compiler (</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088035#M64173</link>
      <description>&lt;STRONG&gt;[ Microsoft C++ compiler ( VS98 PE ) - Release - 32-bit ( LPS: IKJ ) - CPU AN 32-bit Windows 95 ]&lt;/STRONG&gt;

		Application - ScaLibTestApp - WIN32_MSC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1099 Start &amp;lt;
		Matrix A, B and C Sizes       :  1024 x  1024
		Loop Processing Schema ( LPS ): IKJ
		Loop Blocking Divider         : 1
		Sub-Test 1.1 - MxMultA1 - Classic 2D
			LBOT size: N/A
			Completed:     9.87500 secs
		Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
			LBOT size: 1024x1024 elements
			Completed:     9.44900 secs
		Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
			LBOT size: N/A
			Completed:     9.73700 secs
		Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
			LBOT size: 1024x1024 elements
			Completed:     9.75100 secs
		Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
			LBOT size: N/A
			Completed:   147.64801 secs
		Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:   147.68901 secs
		Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
			LBOT size: N/A
			Completed:   146.48101 secs
		Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:   154.74801 secs
		Sub-Test 5.1 - MxMultD1 - Classic 1D
			LBOT size: N/A
			Completed:     9.44800 secs
		Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
			LBOT size: 1024x1024 elements
			Completed:     9.46300 secs
		&amp;gt; Test1099 End &amp;lt;
		Tests: Completed</description>
      <pubDate>Thu, 04 Aug 2016 16:43:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088035#M64173</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:43:26Z</dc:date>
    </item>
    <item>
      <title>[ Microsoft C++ compiler (</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088036#M64174</link>
      <description>&lt;STRONG&gt;[ Microsoft C++ compiler ( VS98 PE ) - Release - 32-bit ( LPS: IJK ) - CPU P2 32-bit Windows 2000 ]&lt;/STRONG&gt;

		Application - ScaLibTestApp - WIN32_MSC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1099 Start &amp;lt;
		Matrix A, B and C Sizes       :  1024 x  1024
		Loop Processing Schema ( LPS ): IJK
		Loop Blocking Divider         : 1
		Sub-Test 1.1 - MxMultA1 - Classic 2D
			LBOT size: N/A
			Completed:   253.86501 secs
		Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
			LBOT size: 1024x1024 elements
			Completed:   253.85501 secs
		Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
			LBOT size: N/A
			Completed:   256.85901 secs
		Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
			LBOT size: 1024x1024 elements
			Completed:   257.74001 secs
		Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
			LBOT size: N/A
			Completed:    48.61000 secs
		Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:    59.95600 secs
		Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
			LBOT size: N/A
			Completed:    72.07300 secs
		Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:    72.43400 secs
		Sub-Test 5.1 - MxMultD1 - Classic 1D
			LBOT size: N/A
			Completed:   258.42101 secs
		Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
			LBOT size: 1024x1024 elements
			Completed:   258.35201 secs
		&amp;gt; Test1099 End &amp;lt;
		Tests: Completed</description>
      <pubDate>Thu, 04 Aug 2016 16:54:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088036#M64174</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:54:39Z</dc:date>
    </item>
    <item>
      <title>[ Intel C++ compiler v7</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088037#M64175</link>
      <description>&lt;STRONG&gt;[ Intel C++ compiler v7.1.0 ( u029 ) - Release - 32-bit ( LPS: IJK ) - CPU P2 32-bit Windows 2000 ]&lt;/STRONG&gt;

		Application - IccTestApp - WIN32_ICC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1099 Start &amp;lt;
		Matrix A, B and C Sizes       :  1024 x  1024
		Loop Processing Schema ( LPS ): IJK
		Loop Blocking Divider         : 1
		Sub-Test 1.1 - MxMultA1 - Classic 2D
			LBOT size: N/A
			Completed:   254.23501 secs
		Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
			LBOT size: 1024x1024 elements
			Completed:   281.93501 secs
		Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
			LBOT size: N/A
			Completed:   254.79601 secs
		Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
			LBOT size: 1024x1024 elements
			Completed:   255.33701 secs
		Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
			LBOT size: N/A
			Completed:    47.97900 secs
		Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:    60.25600 secs
		Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
			LBOT size: N/A
			Completed:    72.31400 secs
		Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:    72.74500 secs
		Sub-Test 5.1 - MxMultD1 - Classic 1D
			LBOT size: N/A
			Completed:   272.31201 secs
		Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
			LBOT size: 1024x1024 elements
			Completed:   273.65301 secs
		&amp;gt; Test1099 End &amp;lt;
		Tests: Completed</description>
      <pubDate>Thu, 04 Aug 2016 16:55:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088037#M64175</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:55:00Z</dc:date>
    </item>
    <item>
      <title>[ Microsoft C++ compiler (</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088038#M64176</link>
      <description>&lt;STRONG&gt;[ Microsoft C++ compiler ( VS98 PE ) - Release - 32-bit ( LPS: IKJ ) - CPU P2 32-bit Windows 2000 ]&lt;/STRONG&gt;

		Application - ScaLibTestApp - WIN32_MSC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1099 Start &amp;lt;
		Matrix A, B and C Sizes       :  1024 x  1024
		Loop Processing Schema ( LPS ): IKJ
		Loop Blocking Divider         : 1
		Sub-Test 1.1 - MxMultA1 - Classic 2D
			LBOT size: N/A
			Completed:    59.51500 secs
		Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
			LBOT size: 1024x1024 elements
			Completed:    59.54500 secs
		Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
			LBOT size: N/A
			Completed:    98.13100 secs
		Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
			LBOT size: 1024x1024 elements
			Completed:    98.14100 secs
		Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
			LBOT size: N/A
			Completed:   254.30601 secs
		Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:   254.62601 secs
		Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
			LBOT size: N/A
			Completed:   256.21801 secs
		Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:   255.96901 secs
		Sub-Test 5.1 - MxMultD1 - Classic 1D
			LBOT size: N/A
			Completed:    59.69600 secs
		Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
			LBOT size: 1024x1024 elements
			Completed:    59.68600 secs
		&amp;gt; Test1099 End &amp;lt;
		Tests: Completed</description>
      <pubDate>Thu, 04 Aug 2016 16:55:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088038#M64176</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:55:21Z</dc:date>
    </item>
    <item>
      <title>[ Intel C++ compiler v7</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088039#M64177</link>
      <description>&lt;STRONG&gt;[ Intel C++ compiler v7.1.0 ( u029 ) - Release - 32-bit ( LPS: IKJ ) - CPU P2 32-bit Windows 2000 ]&lt;/STRONG&gt;

		Application - IccTestApp - WIN32_ICC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1099 Start &amp;lt;
		Matrix A, B and C Sizes       :  1024 x  1024
		Loop Processing Schema ( LPS ): IKJ
		Loop Blocking Divider         : 1
		Sub-Test 1.1 - MxMultA1 - Classic 2D
			LBOT size: N/A
			Completed:    60.21600 secs
		Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
			LBOT size: 1024x1024 elements
			Completed:    59.84600 secs
		Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
			LBOT size: N/A
			Completed:    72.53500 secs
		Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
			LBOT size: 1024x1024 elements
			Completed:    72.52500 secs
		Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
			LBOT size: N/A
			Completed:   254.90701 secs
		Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:   254.93701 secs
		Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
			LBOT size: N/A
			Completed:   256.24901 secs
		Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:   256.48901 secs
		Sub-Test 5.1 - MxMultD1 - Classic 1D
			LBOT size: N/A
			Completed:    59.45600 secs
		Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
			LBOT size: 1024x1024 elements
			Completed:    59.48500 secs
		&amp;gt; Test1099 End &amp;lt;
		Tests: Completed</description>
      <pubDate>Thu, 04 Aug 2016 16:56:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088039#M64177</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:56:00Z</dc:date>
    </item>
    <item>
      <title>[ Intel C++ compiler v8</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088040#M64178</link>
      <description>&lt;STRONG&gt;[ Intel C++ compiler v8.1.0 ( u038 ) - Release - 32-bit ( LPS: IJK ) - CPU P2 32-bit Windows 2000 ]&lt;/STRONG&gt;

		Application - IccTestApp - WIN32_ICC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1099 Start &amp;lt;
		Matrix A, B and C Sizes       :  1024 x  1024
		Loop Processing Schema ( LPS ): IJK
		Loop Blocking Divider         : 1
		Sub-Test 1.1 - MxMultA1 - Classic 2D
			LBOT size: N/A
			Completed:   253.37400 secs
		Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
			LBOT size: 1024x1024 elements
			Completed:   253.12400 secs
		Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
			LBOT size: N/A
			Completed:   254.65600 secs
		Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
			LBOT size: 1024x1024 elements
			Completed:   255.29700 secs
		Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
			LBOT size: N/A
			Completed:    47.44800 secs
		Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:    48.89000 secs
		Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
			LBOT size: N/A
			Completed:    72.33400 secs
		Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
			LBOT size: 1024x1024 elements
			Completed:    72.35400 secs
		Sub-Test 5.1 - MxMultD1 - Classic 1D
			LBOT size: N/A
			Completed:   249.90900 secs
		Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
			LBOT size: 1024x1024 elements
			Completed:   249.90900 secs
		&amp;gt; Test1099 End &amp;lt;
		Tests: Completed</description>
      <pubDate>Thu, 04 Aug 2016 16:57:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088040#M64178</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T16:57:00Z</dc:date>
    </item>
    <item>
      <title>[ Intel C++ compiler v8.1.0 (</title>
      <link>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088041#M64179</link>
      <description>&lt;STRONG&gt;[ Intel C++ compiler v8.1.0 ( u038 ) - Release - 32-bit ( LPS: IKJ ) - CPU P2 32-bit Windows 2000 ]&lt;/STRONG&gt;

		Application - IccTestApp - WIN32_ICC ( 32-bit ) - Release
		Tests: Start
		&amp;gt; Test1099 Start &amp;lt;
		Matrix A, B and C Sizes       :  1024 x  1024
		Loop Processing Schema ( LPS ): IKJ
		Loop Blocking Divider         : 1
		Sub-Test 1.1 - MxMultA1 - Classic 2D
		        LBOT size: N/A
		        Completed:    60.24600 secs
		Sub-Test 1.2 - MxMultA2 - Classic 2D LBOT
		        LBOT size: 1024x1024 elements
		        Completed:    59.78600 secs
		Sub-Test 1.3 - MxMultA3 - Classic 2D Fused
		        LBOT size: N/A
		        Completed:    79.53400 secs
		Sub-Test 1.4 - MxMultA4 - Classic 2D Fused LBOT
		        LBOT size: 1024x1024 elements
		        Completed:    79.54400 secs
		Sub-Test 2.1 - MxMultB1 - Classic 2D Transposed
		        LBOT size: N/A
		        Completed:   253.84500 secs
		Sub-Test 2.2 - MxMultB2 - Classic 2D Transposed LBOT
		        LBOT size: 1024x1024 elements
		        Completed:   254.01600 secs
		Sub-Test 2.3 - MxMultB3 - Classic 2D Fused Transposed
		        LBOT size: N/A
		        Completed:   255.91800 secs
		Sub-Test 2.4 - MxMultB4 - Classic 2D Fused Transposed LBOT
		        LBOT size: 1024x1024 elements
		        Completed:   255.87800 secs
		Sub-Test 5.1 - MxMultD1 - Classic 1D
		        LBOT size: N/A
		        Completed:    59.30500 secs
		Sub-Test 5.2 - MxMultD2 - Classic 1D LBOT
		        LBOT size: 1024x1024 elements
		        Completed:    59.29500 secs
		&amp;gt; Test1099 End &amp;lt;
		Tests: Completed</description>
      <pubDate>Thu, 04 Aug 2016 17:03:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Performance-Evaluation-of-Classic-Matrix-Multiplication/m-p/1088041#M64179</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2016-08-04T17:03:04Z</dc:date>
    </item>
  </channel>
</rss>

