<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic why is cblas_sgemm 5 times slower than cblas_dgemm in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/why-is-cblas-sgemm-5-times-slower-than-cblas-dgemm/m-p/879410#M9345</link>
    <description>Hi all,&lt;BR /&gt;&lt;BR /&gt;I am seeing a strange performance problem with MKL 10.0.2 under Visual Studio 2005/2008 Express Edition.&lt;BR /&gt;&lt;BR /&gt;I am using cblas_sgemm/cblas_dgemm to do a matrix multiplication as follows:&lt;BR /&gt;&lt;BR /&gt;Matrix A (m*n), where m is around 50000 and n is around 50.&lt;BR /&gt;Matrix B (m*n).&lt;BR /&gt;Matrix C (n*n).&lt;BR /&gt;&lt;BR /&gt;I need to compute C = transpose(A) * B,&lt;BR /&gt;&lt;BR /&gt;so I wrote&lt;BR /&gt; cblas_sgemm(CblasColMajor, CblasTrans, CblasNoTrans, n, n, m, 1.0f, A, m, B, m, 0.0f, C, n);&lt;BR /&gt;&lt;BR /&gt;and the same in double precision:&lt;BR /&gt;&lt;BR /&gt; cblas_dgemm(CblasColMajor, CblasTrans, CblasNoTrans, n, n, m, 1.0f, A, m, B, m, 0.0f, C, n);&lt;BR /&gt;&lt;BR /&gt;Both calls produce the correct output. However, sgemm with A, B, and C as float* runs 5 times slower than dgemm with A, B, and C as double*.&lt;BR /&gt;&lt;BR /&gt;Could anybody help check this out? Thank you very much!</description>
    <pubDate>Wed, 12 Dec 2007 15:01:10 GMT</pubDate>
    <dc:creator>missspicyfood</dc:creator>
    <dc:date>2007-12-12T15:01:10Z</dc:date>
    <item>
      <title>why is cblas_sgemm 5 times slower than cblas_dgemm</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/why-is-cblas-sgemm-5-times-slower-than-cblas-dgemm/m-p/879410#M9345</link>
      <description>Hi all,&lt;BR /&gt;&lt;BR /&gt;I am seeing a strange performance problem with MKL 10.0.2 under Visual Studio 2005/2008 Express Edition.&lt;BR /&gt;&lt;BR /&gt;I am using cblas_sgemm/cblas_dgemm to do a matrix multiplication as follows:&lt;BR /&gt;&lt;BR /&gt;Matrix A (m*n), where m is around 50000 and n is around 50.&lt;BR /&gt;Matrix B (m*n).&lt;BR /&gt;Matrix C (n*n).&lt;BR /&gt;&lt;BR /&gt;I need to compute C = transpose(A) * B,&lt;BR /&gt;&lt;BR /&gt;so I wrote&lt;BR /&gt; cblas_sgemm(CblasColMajor, CblasTrans, CblasNoTrans, n, n, m, 1.0f, A, m, B, m, 0.0f, C, n);&lt;BR /&gt;&lt;BR /&gt;and the same in double precision:&lt;BR /&gt;&lt;BR /&gt; cblas_dgemm(CblasColMajor, CblasTrans, CblasNoTrans, n, n, m, 1.0f, A, m, B, m, 0.0f, C, n);&lt;BR /&gt;&lt;BR /&gt;Both calls produce the correct output. However, sgemm with A, B, and C as float* runs 5 times slower than dgemm with A, B, and C as double*.&lt;BR /&gt;&lt;BR /&gt;Could anybody help check this out? Thank you very much!</description>
      <pubDate>Wed, 12 Dec 2007 15:01:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/why-is-cblas-sgemm-5-times-slower-than-cblas-dgemm/m-p/879410#M9345</guid>
      <dc:creator>missspicyfood</dc:creator>
      <dc:date>2007-12-12T15:01:10Z</dc:date>
    </item>
  </channel>
</rss>

