<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Problem with hpcc, intel mpi and intel mkl: PTRANS failed in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Problem-with-hpcc-intel-mpi-and-intel-mkl-PTRANS-failed/m-p/813820#M4138</link>
    <description>Hi,&lt;BR /&gt;&lt;BR /&gt;We have a little cluster (4 nodes; each node 12 cores). I'm trying to test hpcc on it. So I have read:&lt;BR /&gt;&lt;A href="http://origin-software.intel.com/en-us/articles/performance-tools-for-software-developers-use-of-intel-mkl-in-hpcc-benchmark/" target="_blank"&gt;http://origin-software.intel.com/en-us/articles/performance-tools-for-software-developers-use-of-intel-mkl-in-hpcc-benchmark/&lt;/A&gt;&lt;BR /&gt;and have done step by step the things. My make.arch is:&lt;BR /&gt; &lt;BR /&gt;# &lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - shell --------------------------------------------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;SHELL = /bin/sh&lt;BR /&gt;#&lt;BR /&gt;CD = cd&lt;BR /&gt;CP = cp&lt;BR /&gt;LN_S = ln -s&lt;BR /&gt;MKDIR = mkdir&lt;BR /&gt;RM = /bin/rm -f&lt;BR /&gt;TOUCH = touch&lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - Platform identifier ------------------------------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;ARCH = $(arch)&lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - HPL Directory Structure / HPL library ------------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;TOPdir = ../../..&lt;BR /&gt;INCdir = $(TOPdir)/include&lt;BR /&gt;BINdir = $(TOPdir)/bin/$(ARCH)&lt;BR /&gt;LIBdir = $(TOPdir)/lib/$(ARCH)&lt;BR /&gt;#&lt;BR /&gt;HPLlib = $(LIBdir)/libhpl.a &lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - Message Passing library (MPI) --------------------------------------&lt;BR /&gt;# 
----------------------------------------------------------------------&lt;BR /&gt;# MPinc tells the C compiler where to find the Message Passing library&lt;BR /&gt;# header files, MPlib is defined to be the name of the library to be &lt;BR /&gt;# used. The variable MPdir is only used for defining MPinc and MPlib.&lt;BR /&gt;#&lt;BR /&gt;MPdir = /opt/intel/impi/4.0.0&lt;BR /&gt;MPinc = -I$(MPdir)/include64&lt;BR /&gt;MPlib = -L$(MPdir)/lib64&lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - Linear Algebra library (BLAS or VSIPL) -----------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# LAinc tells the C compiler where to find the Linear Algebra library&lt;BR /&gt;# header files, LAlib is defined to be the name of the library to be &lt;BR /&gt;# used. The variable LAdir is only used for defining LAinc and LAlib.&lt;BR /&gt;#&lt;BR /&gt;LAdir = /opt/intel/mkl/10.2.5.035/lib/em64t&lt;BR /&gt;LAdir_local = ~/tmp/flops_test/mkl_local/lib/em64t&lt;BR /&gt;LAinc = -I/opt/intel/mkl/10.2.5.035/include -I/opt/intel/mkl/10.2.5.035/include/fftw&lt;BR /&gt;LAlib = $(LAdir)/libmkl_solver_ilp64.a -Wl,--start-group $(LAdir)/libmkl_intel_ilp64.a $(LAdir)/libmkl_intel_thread.a $(LAdir)/libmkl_core.a $(LAdir)/libmkl_blacs_intelmpi_ilp64.a $(LAdir_local)/libfftw2x_cdft_DOUBLE.a $(LAdir_local)/libfftw2xc_intel.a $(LAdir)/libmkl_cdft_core.a -Wl,--end-group -openmp -lpthread&lt;BR /&gt;&lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - F77 / C interface --------------------------------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;F2CDEFS = -DAdd_ -DF77_INTEGER=int -DStringSunStyle &lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - HPL includes / libraries / 
specifics -------------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)&lt;BR /&gt;HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib) -lm&lt;BR /&gt;#&lt;BR /&gt;# - Compile time options -----------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;HPL_OPTS = -DUSING_FFTW -DMKL_INT=long -DLONG_IS_64BITS&lt;BR /&gt;# &lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;HPL_DEFS = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES) &lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - Compilers / linkers - Optimization flags ---------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;CC = mpicc&lt;BR /&gt;CCNOOPT = $(HPL_DEFS) &lt;BR /&gt;CCFLAGS = $(HPL_DEFS) -O2 -xSSE4.2 -ansi-alias -ip&lt;BR /&gt;#&lt;BR /&gt;LINKER = mpicc&lt;BR /&gt;LINKFLAGS = &lt;BR /&gt;#&lt;BR /&gt;ARCHIVER = ar&lt;BR /&gt;ARFLAGS = r&lt;BR /&gt;RANLIB = echo&lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;It compiles without errors. Then I took the input file and started the program with mpirun -np 8 hpcc on our master. I get:&lt;BR /&gt;mpirun -np 8 hpcc&lt;BR /&gt;&lt;BR /&gt;WARNING: Unable to read mpd.hosts or list of hosts isn't provided. 
MPI job will be run on the current machine only.&lt;BR /&gt;rank 6 in job 1 master_34154 caused collective abort of all ranks&lt;BR /&gt; exit status of rank 6: killed by signal 9&lt;BR /&gt;rank 0 in job 1 master_34154 caused collective abort of all ranks&lt;BR /&gt; exit status of rank 0: killed by signal 11&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;and in the output file is:&lt;BR /&gt;########################################################################&lt;BR /&gt;This is the DARPA/DOE HPC Challenge Benchmark version 1.4.1 October 2003&lt;BR /&gt;Produced by Jack Dongarra and Piotr Luszczek&lt;BR /&gt;Innovative Computing Laboratory&lt;BR /&gt;University of Tennessee Knoxville and Oak Ridge National Laboratory&lt;BR /&gt;&lt;BR /&gt;See the source files for authors of specific codes.&lt;BR /&gt;Compiled on Oct 1 2010 at 09:16:38&lt;BR /&gt;Current time (1285919752) is Fri Oct 1 09:55:52 2010&lt;BR /&gt;&lt;BR /&gt;Hostname: 'master'&lt;BR /&gt;########################################################################&lt;BR /&gt;================================================================================&lt;BR /&gt;HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008&lt;BR /&gt;Written by A. Petitet and R. 
Clint Whaley, Innovative Computing Laboratory, UTK&lt;BR /&gt;Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK&lt;BR /&gt;Modified by Julien Langou, University of Colorado Denver&lt;BR /&gt;================================================================================&lt;BR /&gt;&lt;BR /&gt;An explanation of the input/output parameters follows:&lt;BR /&gt;T/V : Wall time / encoded variant.&lt;BR /&gt;N : The order of the coefficient matrix A.&lt;BR /&gt;NB : The partitioning blocking factor.&lt;BR /&gt;P : The number of process rows.&lt;BR /&gt;Q : The number of process columns.&lt;BR /&gt;Time : Time in seconds to solve the linear system.&lt;BR /&gt;Gflops : Rate of execution for solving the linear system.&lt;BR /&gt;&lt;BR /&gt;The following parameter values will be used:&lt;BR /&gt;&lt;BR /&gt;N : 10240 &lt;BR /&gt;NB : 128 &lt;BR /&gt;PMAP : Row-major process mapping&lt;BR /&gt;P : 2 &lt;BR /&gt;Q : 4 &lt;BR /&gt;PFACT : Right &lt;BR /&gt;NBMIN : 4 &lt;BR /&gt;NDIV : 2 &lt;BR /&gt;RFACT : Crout &lt;BR /&gt;BCAST : 1ringM &lt;BR /&gt;DEPTH : 1 &lt;BR /&gt;SWAP : Mix (threshold = 64)&lt;BR /&gt;L1 : transposed form&lt;BR /&gt;U : transposed form&lt;BR /&gt;EQUIL : yes&lt;BR /&gt;ALIGN : 8 double precision words&lt;BR /&gt;&lt;BR /&gt;--------------------------------------------------------------------------------&lt;BR /&gt;&lt;BR /&gt;- The matrix A is randomly generated for each test.&lt;BR /&gt;- The following scaled residual check will be computed:&lt;BR /&gt; ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )&lt;BR /&gt;- The relative machine precision (eps) is taken to be 2.220446e-16&lt;BR /&gt;- Computational tests pass if scaled residuals are less than 16.0&lt;BR /&gt;&lt;BR /&gt;Begin of MPIRandomAccess section.&lt;BR /&gt;Running on 8 processors (PowerofTwo)&lt;BR /&gt;Total Main table size = 2^26 = 67108864 words&lt;BR /&gt;PE Main table size = 2^23 = 8388608 words/PE&lt;BR /&gt;Default number of updates 
(RECOMMENDED) = 268435456&lt;BR /&gt;CPU time used = 10.461410 seconds&lt;BR /&gt;Real time used = 18.035589 seconds&lt;BR /&gt;0.014883653 Billion(10^9) Updates per second [GUP/s]&lt;BR /&gt;0.001860457 Billion(10^9) Updates/PE per second [GUP/s]&lt;BR /&gt;Verification: CPU time used = 1.407786 seconds&lt;BR /&gt;Verification: Real time used = 1.413994 seconds&lt;BR /&gt;Found 0 errors in 67108864 locations (passed).&lt;BR /&gt;Current time (1285919771) is Fri Oct 1 09:56:11 2010&lt;BR /&gt;&lt;BR /&gt;End of MPIRandomAccess section.&lt;BR /&gt;Begin of StarRandomAccess section.&lt;BR /&gt;Main table size = 2^23 = 8388608 words&lt;BR /&gt;Number of updates = 33554432&lt;BR /&gt;CPU time used = 1.022845 seconds&lt;BR /&gt;Real time used = 1.023517 seconds&lt;BR /&gt;0.032783467 Billion(10^9) Updates per second [GUP/s]&lt;BR /&gt;Found 0 errors in 8388608 locations (passed).&lt;BR /&gt;Node(s) with error 0&lt;BR /&gt;Minimum GUP/s 0.032325&lt;BR /&gt;Average GUP/s 0.032776&lt;BR /&gt;Maximum GUP/s 0.033038&lt;BR /&gt;Current time (1285919773) is Fri Oct 1 09:56:13 2010&lt;BR /&gt;&lt;BR /&gt;End of StarRandomAccess section.&lt;BR /&gt;Begin of SingleRandomAccess section.&lt;BR /&gt;Node(s) with error 0&lt;BR /&gt;Node selected 2&lt;BR /&gt;Single GUP/s 0.050258&lt;BR /&gt;Current time (1285919775) is Fri Oct 1 09:56:15 2010&lt;BR /&gt;&lt;BR /&gt;End of SingleRandomAccess section.&lt;BR /&gt;Begin of MPIRandomAccess_LCG section.&lt;BR /&gt;Running on 8 processors (PowerofTwo)&lt;BR /&gt;Total Main table size = 2^26 = 67108864 words&lt;BR /&gt;PE Main table size = 2^23 = 8388608 words/PE&lt;BR /&gt;Default number of updates (RECOMMENDED) = 268435456&lt;BR /&gt;CPU time used = 11.008327 seconds&lt;BR /&gt;Real time used = 18.597349 seconds&lt;BR /&gt;0.014434071 Billion(10^9) Updates per second [GUP/s]&lt;BR /&gt;0.001804259 Billion(10^9) Updates/PE per second [GUP/s]&lt;BR /&gt;Verification: CPU time used = 1.382789 seconds&lt;BR /&gt;Verification: Real time used = 
1.386738 seconds&lt;BR /&gt;Found 0 errors in 67108864 locations (passed).&lt;BR /&gt;Current time (1285919795) is Fri Oct 1 09:56:35 2010&lt;BR /&gt;&lt;BR /&gt;End of MPIRandomAccess_LCG section.&lt;BR /&gt;Begin of StarRandomAccess_LCG section.&lt;BR /&gt;Main table size = 2^23 = 8388608 words&lt;BR /&gt;Number of updates = 33554432&lt;BR /&gt;CPU time used = 1.036842 seconds&lt;BR /&gt;Real time used = 1.037567 seconds&lt;BR /&gt;0.032339536 Billion(10^9) Updates per second [GUP/s]&lt;BR /&gt;Found 0 errors in 8388608 locations (passed).&lt;BR /&gt;Node(s) with error 0&lt;BR /&gt;Minimum GUP/s 0.032164&lt;BR /&gt;Average GUP/s 0.032349&lt;BR /&gt;Maximum GUP/s 0.032528&lt;BR /&gt;Current time (1285919797) is Fri Oct 1 09:56:37 2010&lt;BR /&gt;&lt;BR /&gt;End of StarRandomAccess_LCG section.&lt;BR /&gt;Begin of SingleRandomAccess_LCG section.&lt;BR /&gt;Node(s) with error 0&lt;BR /&gt;Node selected 7&lt;BR /&gt;Single GUP/s 0.048609&lt;BR /&gt;Current time (1285919798) is Fri Oct 1 09:56:38 2010&lt;BR /&gt;&lt;BR /&gt;End of SingleRandomAccess_LCG section.&lt;BR /&gt;Begin of PTRANS section.&lt;BR /&gt;M: 5120&lt;BR /&gt;N: 5120&lt;BR /&gt;MB: 128&lt;BR /&gt;NB: 128&lt;BR /&gt;P: 2&lt;BR /&gt;Q: 4&lt;BR /&gt;TIME M N MB NB P Q TIME CHECK GB/s RESID&lt;BR /&gt;---- ----- ----- --- --- --- --- -------- ------ -------- -----&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;So the run seems to crash in PTRANS: the output stops right after the table header. How can I solve this problem?&lt;BR /&gt;Thanks a lot,&lt;BR /&gt;Regards</description>
    <pubDate>Fri, 01 Oct 2010 08:10:53 GMT</pubDate>
    <dc:creator>Guillaume_De_Nayer</dc:creator>
    <dc:date>2010-10-01T08:10:53Z</dc:date>
    <item>
      <title>Problem with hpcc, intel mpi and intel mkl: PTRANS failed</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Problem-with-hpcc-intel-mpi-and-intel-mkl-PTRANS-failed/m-p/813820#M4138</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;We have a little cluster (4 nodes; each node 12 cores). I'm trying to test hpcc on it. So I have read:&lt;BR /&gt;&lt;A href="http://origin-software.intel.com/en-us/articles/performance-tools-for-software-developers-use-of-intel-mkl-in-hpcc-benchmark/" target="_blank"&gt;http://origin-software.intel.com/en-us/articles/performance-tools-for-software-developers-use-of-intel-mkl-in-hpcc-benchmark/&lt;/A&gt;&lt;BR /&gt;and have done step by step the things. My make.arch is:&lt;BR /&gt; &lt;BR /&gt;# &lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - shell --------------------------------------------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;SHELL = /bin/sh&lt;BR /&gt;#&lt;BR /&gt;CD = cd&lt;BR /&gt;CP = cp&lt;BR /&gt;LN_S = ln -s&lt;BR /&gt;MKDIR = mkdir&lt;BR /&gt;RM = /bin/rm -f&lt;BR /&gt;TOUCH = touch&lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - Platform identifier ------------------------------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;ARCH = $(arch)&lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - HPL Directory Structure / HPL library ------------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;TOPdir = ../../..&lt;BR /&gt;INCdir = $(TOPdir)/include&lt;BR /&gt;BINdir = $(TOPdir)/bin/$(ARCH)&lt;BR /&gt;LIBdir = $(TOPdir)/lib/$(ARCH)&lt;BR /&gt;#&lt;BR /&gt;HPLlib = $(LIBdir)/libhpl.a &lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - Message Passing library (MPI) --------------------------------------&lt;BR /&gt;# 
----------------------------------------------------------------------&lt;BR /&gt;# MPinc tells the C compiler where to find the Message Passing library&lt;BR /&gt;# header files, MPlib is defined to be the name of the library to be &lt;BR /&gt;# used. The variable MPdir is only used for defining MPinc and MPlib.&lt;BR /&gt;#&lt;BR /&gt;MPdir = /opt/intel/impi/4.0.0&lt;BR /&gt;MPinc = -I$(MPdir)/include64&lt;BR /&gt;MPlib = -L$(MPdir)/lib64&lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - Linear Algebra library (BLAS or VSIPL) -----------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# LAinc tells the C compiler where to find the Linear Algebra library&lt;BR /&gt;# header files, LAlib is defined to be the name of the library to be &lt;BR /&gt;# used. The variable LAdir is only used for defining LAinc and LAlib.&lt;BR /&gt;#&lt;BR /&gt;LAdir = /opt/intel/mkl/10.2.5.035/lib/em64t&lt;BR /&gt;LAdir_local = ~/tmp/flops_test/mkl_local/lib/em64t&lt;BR /&gt;LAinc = -I/opt/intel/mkl/10.2.5.035/include -I/opt/intel/mkl/10.2.5.035/include/fftw&lt;BR /&gt;LAlib = $(LAdir)/libmkl_solver_ilp64.a -Wl,--start-group $(LAdir)/libmkl_intel_ilp64.a $(LAdir)/libmkl_intel_thread.a $(LAdir)/libmkl_core.a $(LAdir)/libmkl_blacs_intelmpi_ilp64.a $(LAdir_local)/libfftw2x_cdft_DOUBLE.a $(LAdir_local)/libfftw2xc_intel.a $(LAdir)/libmkl_cdft_core.a -Wl,--end-group -openmp -lpthread&lt;BR /&gt;&lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - F77 / C interface --------------------------------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;F2CDEFS = -DAdd_ -DF77_INTEGER=int -DStringSunStyle &lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - HPL includes / libraries / 
specifics -------------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)&lt;BR /&gt;HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib) -lm&lt;BR /&gt;#&lt;BR /&gt;# - Compile time options -----------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;HPL_OPTS = -DUSING_FFTW -DMKL_INT=long -DLONG_IS_64BITS&lt;BR /&gt;# &lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;HPL_DEFS = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES) &lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;# - Compilers / linkers - Optimization flags ---------------------------&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;#&lt;BR /&gt;CC = mpicc&lt;BR /&gt;CCNOOPT = $(HPL_DEFS) &lt;BR /&gt;CCFLAGS = $(HPL_DEFS) -O2 -xSSE4.2 -ansi-alias -ip&lt;BR /&gt;#&lt;BR /&gt;LINKER = mpicc&lt;BR /&gt;LINKFLAGS = &lt;BR /&gt;#&lt;BR /&gt;ARCHIVER = ar&lt;BR /&gt;ARFLAGS = r&lt;BR /&gt;RANLIB = echo&lt;BR /&gt;#&lt;BR /&gt;# ----------------------------------------------------------------------&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;It compiles without errors. Then I took the input file and started the program with mpirun -np 8 hpcc on our master. I get:&lt;BR /&gt;mpirun -np 8 hpcc&lt;BR /&gt;&lt;BR /&gt;WARNING: Unable to read mpd.hosts or list of hosts isn't provided. 
MPI job will be run on the current machine only.&lt;BR /&gt;rank 6 in job 1 master_34154 caused collective abort of all ranks&lt;BR /&gt; exit status of rank 6: killed by signal 9&lt;BR /&gt;rank 0 in job 1 master_34154 caused collective abort of all ranks&lt;BR /&gt; exit status of rank 0: killed by signal 11&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;and in the output file is:&lt;BR /&gt;########################################################################&lt;BR /&gt;This is the DARPA/DOE HPC Challenge Benchmark version 1.4.1 October 2003&lt;BR /&gt;Produced by Jack Dongarra and Piotr Luszczek&lt;BR /&gt;Innovative Computing Laboratory&lt;BR /&gt;University of Tennessee Knoxville and Oak Ridge National Laboratory&lt;BR /&gt;&lt;BR /&gt;See the source files for authors of specific codes.&lt;BR /&gt;Compiled on Oct 1 2010 at 09:16:38&lt;BR /&gt;Current time (1285919752) is Fri Oct 1 09:55:52 2010&lt;BR /&gt;&lt;BR /&gt;Hostname: 'master'&lt;BR /&gt;########################################################################&lt;BR /&gt;================================================================================&lt;BR /&gt;HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008&lt;BR /&gt;Written by A. Petitet and R. 
Clint Whaley, Innovative Computing Laboratory, UTK&lt;BR /&gt;Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK&lt;BR /&gt;Modified by Julien Langou, University of Colorado Denver&lt;BR /&gt;================================================================================&lt;BR /&gt;&lt;BR /&gt;An explanation of the input/output parameters follows:&lt;BR /&gt;T/V : Wall time / encoded variant.&lt;BR /&gt;N : The order of the coefficient matrix A.&lt;BR /&gt;NB : The partitioning blocking factor.&lt;BR /&gt;P : The number of process rows.&lt;BR /&gt;Q : The number of process columns.&lt;BR /&gt;Time : Time in seconds to solve the linear system.&lt;BR /&gt;Gflops : Rate of execution for solving the linear system.&lt;BR /&gt;&lt;BR /&gt;The following parameter values will be used:&lt;BR /&gt;&lt;BR /&gt;N : 10240 &lt;BR /&gt;NB : 128 &lt;BR /&gt;PMAP : Row-major process mapping&lt;BR /&gt;P : 2 &lt;BR /&gt;Q : 4 &lt;BR /&gt;PFACT : Right &lt;BR /&gt;NBMIN : 4 &lt;BR /&gt;NDIV : 2 &lt;BR /&gt;RFACT : Crout &lt;BR /&gt;BCAST : 1ringM &lt;BR /&gt;DEPTH : 1 &lt;BR /&gt;SWAP : Mix (threshold = 64)&lt;BR /&gt;L1 : transposed form&lt;BR /&gt;U : transposed form&lt;BR /&gt;EQUIL : yes&lt;BR /&gt;ALIGN : 8 double precision words&lt;BR /&gt;&lt;BR /&gt;--------------------------------------------------------------------------------&lt;BR /&gt;&lt;BR /&gt;- The matrix A is randomly generated for each test.&lt;BR /&gt;- The following scaled residual check will be computed:&lt;BR /&gt; ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )&lt;BR /&gt;- The relative machine precision (eps) is taken to be 2.220446e-16&lt;BR /&gt;- Computational tests pass if scaled residuals are less than 16.0&lt;BR /&gt;&lt;BR /&gt;Begin of MPIRandomAccess section.&lt;BR /&gt;Running on 8 processors (PowerofTwo)&lt;BR /&gt;Total Main table size = 2^26 = 67108864 words&lt;BR /&gt;PE Main table size = 2^23 = 8388608 words/PE&lt;BR /&gt;Default number of updates 
(RECOMMENDED) = 268435456&lt;BR /&gt;CPU time used = 10.461410 seconds&lt;BR /&gt;Real time used = 18.035589 seconds&lt;BR /&gt;0.014883653 Billion(10^9) Updates per second [GUP/s]&lt;BR /&gt;0.001860457 Billion(10^9) Updates/PE per second [GUP/s]&lt;BR /&gt;Verification: CPU time used = 1.407786 seconds&lt;BR /&gt;Verification: Real time used = 1.413994 seconds&lt;BR /&gt;Found 0 errors in 67108864 locations (passed).&lt;BR /&gt;Current time (1285919771) is Fri Oct 1 09:56:11 2010&lt;BR /&gt;&lt;BR /&gt;End of MPIRandomAccess section.&lt;BR /&gt;Begin of StarRandomAccess section.&lt;BR /&gt;Main table size = 2^23 = 8388608 words&lt;BR /&gt;Number of updates = 33554432&lt;BR /&gt;CPU time used = 1.022845 seconds&lt;BR /&gt;Real time used = 1.023517 seconds&lt;BR /&gt;0.032783467 Billion(10^9) Updates per second [GUP/s]&lt;BR /&gt;Found 0 errors in 8388608 locations (passed).&lt;BR /&gt;Node(s) with error 0&lt;BR /&gt;Minimum GUP/s 0.032325&lt;BR /&gt;Average GUP/s 0.032776&lt;BR /&gt;Maximum GUP/s 0.033038&lt;BR /&gt;Current time (1285919773) is Fri Oct 1 09:56:13 2010&lt;BR /&gt;&lt;BR /&gt;End of StarRandomAccess section.&lt;BR /&gt;Begin of SingleRandomAccess section.&lt;BR /&gt;Node(s) with error 0&lt;BR /&gt;Node selected 2&lt;BR /&gt;Single GUP/s 0.050258&lt;BR /&gt;Current time (1285919775) is Fri Oct 1 09:56:15 2010&lt;BR /&gt;&lt;BR /&gt;End of SingleRandomAccess section.&lt;BR /&gt;Begin of MPIRandomAccess_LCG section.&lt;BR /&gt;Running on 8 processors (PowerofTwo)&lt;BR /&gt;Total Main table size = 2^26 = 67108864 words&lt;BR /&gt;PE Main table size = 2^23 = 8388608 words/PE&lt;BR /&gt;Default number of updates (RECOMMENDED) = 268435456&lt;BR /&gt;CPU time used = 11.008327 seconds&lt;BR /&gt;Real time used = 18.597349 seconds&lt;BR /&gt;0.014434071 Billion(10^9) Updates per second [GUP/s]&lt;BR /&gt;0.001804259 Billion(10^9) Updates/PE per second [GUP/s]&lt;BR /&gt;Verification: CPU time used = 1.382789 seconds&lt;BR /&gt;Verification: Real time used = 
1.386738 seconds&lt;BR /&gt;Found 0 errors in 67108864 locations (passed).&lt;BR /&gt;Current time (1285919795) is Fri Oct 1 09:56:35 2010&lt;BR /&gt;&lt;BR /&gt;End of MPIRandomAccess_LCG section.&lt;BR /&gt;Begin of StarRandomAccess_LCG section.&lt;BR /&gt;Main table size = 2^23 = 8388608 words&lt;BR /&gt;Number of updates = 33554432&lt;BR /&gt;CPU time used = 1.036842 seconds&lt;BR /&gt;Real time used = 1.037567 seconds&lt;BR /&gt;0.032339536 Billion(10^9) Updates per second [GUP/s]&lt;BR /&gt;Found 0 errors in 8388608 locations (passed).&lt;BR /&gt;Node(s) with error 0&lt;BR /&gt;Minimum GUP/s 0.032164&lt;BR /&gt;Average GUP/s 0.032349&lt;BR /&gt;Maximum GUP/s 0.032528&lt;BR /&gt;Current time (1285919797) is Fri Oct 1 09:56:37 2010&lt;BR /&gt;&lt;BR /&gt;End of StarRandomAccess_LCG section.&lt;BR /&gt;Begin of SingleRandomAccess_LCG section.&lt;BR /&gt;Node(s) with error 0&lt;BR /&gt;Node selected 7&lt;BR /&gt;Single GUP/s 0.048609&lt;BR /&gt;Current time (1285919798) is Fri Oct 1 09:56:38 2010&lt;BR /&gt;&lt;BR /&gt;End of SingleRandomAccess_LCG section.&lt;BR /&gt;Begin of PTRANS section.&lt;BR /&gt;M: 5120&lt;BR /&gt;N: 5120&lt;BR /&gt;MB: 128&lt;BR /&gt;NB: 128&lt;BR /&gt;P: 2&lt;BR /&gt;Q: 4&lt;BR /&gt;TIME M N MB NB P Q TIME CHECK GB/s RESID&lt;BR /&gt;---- ----- ----- --- --- --- --- -------- ------ -------- -----&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;So the run seems to crash in PTRANS: the output stops right after the table header. How can I solve this problem?&lt;BR /&gt;Thanks a lot,&lt;BR /&gt;Regards</description>
      <pubDate>Fri, 01 Oct 2010 08:10:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Problem-with-hpcc-intel-mpi-and-intel-mkl-PTRANS-failed/m-p/813820#M4138</guid>
      <dc:creator>Guillaume_De_Nayer</dc:creator>
      <dc:date>2010-10-01T08:10:53Z</dc:date>
    </item>
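    <!--
The mpirun output in the post above reports ranks "killed by signal 9" and "killed by signal 11". A minimal sketch, assuming a POSIX system, that maps those numbers to their symbolic names; the helper name describe_signal is made up for this illustration. Signal 11 on rank 0 is a segmentation fault, while signal 9 on rank 6 is most likely the launcher terminating the remaining ranks once the job aborted, although the log alone does not prove that.

```python
import signal


def describe_signal(number):
    """Return the symbolic name of a signal number, or a fallback string."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return "unknown signal {}".format(number)


# (rank, signal number) pairs copied from the mpirun output in the post.
for rank, signum in [(6, 9), (0, 11)]:
    print("rank {}: signal {} is {}".format(rank, signum, describe_signal(signum)))
```

Reading the crash this way points at an invalid memory access in rank 0 as the place to start looking.
    -->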
    <item>
      <title>Problem with hpcc, intel mpi and intel mkl: PTRANS failed</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Problem-with-hpcc-intel-mpi-and-intel-mkl-PTRANS-failed/m-p/813821#M4139</link>
      <description>I made a mistake:&lt;BR /&gt;I had just copied the make.UNKNOWN from the hpl setup. If I change -DF77_INTEGER=int to -DF77_INTEGER=long, the PTRANS test runs without problems.&lt;BR /&gt;&lt;BR /&gt;Now I get "*** glibc detected *** hpcc: free(): invalid pointer: 0x0000003647952a88 ***" at the beginning of the StarFFT section.&lt;BR /&gt;Any ideas?&lt;BR /&gt;&lt;BR /&gt;Regards,</description>
      <pubDate>Fri, 01 Oct 2010 08:51:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Problem-with-hpcc-intel-mpi-and-intel-mkl-PTRANS-failed/m-p/813821#M4139</guid>
      <dc:creator>Guillaume_De_Nayer</dc:creator>
      <dc:date>2010-10-01T08:51:26Z</dc:date>
    </item>
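    <!--
The fix in the post above replaces -DF77_INTEGER=int with -DF77_INTEGER=long while the make.arch links the ILP64 variants of MKL (libmkl_intel_ilp64.a), which work with 64-bit integers. The thread does not spell out the mechanism, so the following is only a sketch of the probable cause: when the caller stores 32-bit integers and the library reads 64 bits per value, neighbouring values fuse into one meaningless number. The sample values are taken from the PTRANS header (M, MB, P, Q) purely for illustration.

```python
import struct

# Four 32-bit integers laid out back to back, as a caller built with
# F77_INTEGER=int would store them (little-endian, as on x86-64).
as_int32 = struct.pack("<4i", 5120, 128, 2, 4)

# An ILP64 library reads 8 bytes per integer, so it sees only two values,
# each made of two fused 32-bit neighbours.
misread = struct.unpack("<2q", as_int32)
print(misread)

# With 64-bit integers on both sides the values survive the round trip.
as_int64 = struct.pack("<4q", 5120, 128, 2, 4)
print(struct.unpack("<4q", as_int64))
```

A matrix dimension misread in this way can send the library far outside the allocated buffers, which would be consistent with the segmentation fault seen in PTRANS.
    -->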
    <item>
      <title>Problem with hpcc, intel mpi and intel mkl: PTRANS failed</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Problem-with-hpcc-intel-mpi-and-intel-mkl-PTRANS-failed/m-p/813822#M4140</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;I suggest that you change LAlib so that the fftw2x wrappers come before the MKL interface library, like this:&lt;BR /&gt;&lt;BR /&gt;LAlib = $(LAdir_local)/libfftw2x_cdft_DOUBLE.a $(LAdir_local)/libfftw2xc_intel.a $(LAdir)/libmkl_solver_ilp64.a -Wl,--start-group $(LAdir)/libmkl_intel_ilp64.a $(LAdir)/libmkl_intel_thread.a $(LAdir)/libmkl_core.a $(LAdir)/libmkl_blacs_intelmpi_ilp64.a $(LAdir)/libmkl_cdft_core.a -Wl,--end-group -openmp -lpthread&lt;BR /&gt;&lt;BR /&gt;The reason is that MKL's interface layer (i.e. libmkl_intel_ilp64.a) contains pre-built &lt;B&gt;FFTW3&lt;/B&gt; wrappers, and their implementation of fftw_free() is not compatible with the &lt;B&gt;FFTW2&lt;/B&gt; interface used by HPCC.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;-Vladimir&lt;BR /&gt;</description>
      <pubDate>Fri, 01 Oct 2010 10:29:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Problem-with-hpcc-intel-mpi-and-intel-mkl-PTRANS-failed/m-p/813822#M4140</guid>
      <dc:creator>Vladimir_Petrov__Int</dc:creator>
      <dc:date>2010-10-01T10:29:20Z</dc:date>
    </item>
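    <!--
The reordering suggested in the post above relies on how a static linker picks a definition when several archives provide the same symbol. A deliberately simplified model, assuming that each undefined symbol binds to the first archive on the command line that defines it; real linkers work member by member and treat archive groups specially. The symbol sets below are hypothetical stand-ins, not the real contents of the MKL archives.

```python
def resolve(undefined, archives):
    """Bind each undefined symbol to the first archive, left to right, that defines it."""
    bound = {}
    for symbol in undefined:
        for name, symbols in archives:
            if symbol in symbols:
                bound[symbol] = name
                break
    return bound


# Hypothetical symbol tables; the real archives export many more symbols.
mkl_interface = ("libmkl_intel_ilp64.a", {"fftw_free", "dgemm_"})
fftw2_wrappers = ("libfftw2xc_intel.a", {"fftw_free", "fftw_create_plan"})

# Symbols the benchmark objects still need when the archives are scanned.
needed = ["fftw_create_plan", "fftw_free"]

original_order = resolve(needed, [mkl_interface, fftw2_wrappers])
suggested_order = resolve(needed, [fftw2_wrappers, mkl_interface])

print("original :", original_order)
print("suggested:", suggested_order)
```

In the original order the plan comes from the FFTW2 wrappers while fftw_free() comes from the FFTW3 wrappers in the interface library, which matches the free(): invalid pointer report; in the suggested order both calls bind to the FFTW2 wrappers.
    -->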
    <item>
      <title>Problem with hpcc, intel mpi and intel mkl: PTRANS failed</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Problem-with-hpcc-intel-mpi-and-intel-mkl-PTRANS-failed/m-p/813823#M4141</link>
      <description>Perfect! It works! Thanks a lot!</description>
      <pubDate>Fri, 01 Oct 2010 10:50:43 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Problem-with-hpcc-intel-mpi-and-intel-mkl-PTRANS-failed/m-p/813823#M4141</guid>
      <dc:creator>Guillaume_De_Nayer</dc:creator>
      <dc:date>2010-10-01T10:50:43Z</dc:date>
    </item>
  </channel>
</rss>

