Hi,
Siesta 2.0.2 (a Fortran 90 application) is compiled with Intel Fortran 10, MKL 10, and OpenMPI 1.3.3. The application is linked against the MKL LAPACK, BLAS, ScaLAPACK, and BLACS libraries. The link line is as follows:
MKLPATH=/opt/intel/mkl/10.0.5.025/lib/em64t
MKLPRL= -L$(MKLPATH) -lmkl $(MKLPATH)/libmkl_scalapack_lp64.a $(MKLPATH)/libmkl_solver_lp64.a -Wl,--start-group $(MKLPATH)/libmkl_intel_lp64.a $(MKLPATH)/libmkl_intel_thread.a $(MKLPATH)/libmkl_core.a $(MKLPATH)/libmkl_blacs_openmpi_lp64.a -Wl,--end-group -openmp -lpthread
Compilation is successful.
The Siesta jobs run well on a single node, with good performance scaling from 4 to 8 cores. But if the same job is run on multiple nodes it runs very slowly, taking 3-4 times as long as the single-node job. The same job run over gigabit Ethernet on three nodes also took about 4 times as long as the single-node run.
To cross-check this, we compiled Siesta 2.0.2 with MVAPICH2 using the following link line:
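One thing we still need to rule out is the interconnect itself, i.e. whether Open MPI is really using InfiniBand on the multi-node runs or silently falling back to TCP. Assuming the primary interconnect is InfiniBand (we use MVAPICH2 on the same nodes), a launch along these lines (the hostfile and input file names are just placeholders) should force the openib transport and error out if it is not usable:
mpirun -np 16 --hostfile hosts --mca btl openib,sm,self ./siesta < input.fdf > siesta.out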
MKLP=-L/opt/intel/mkl/10.0.5.025/lib/em64t /opt/intel/mkl/10.0.5.025/lib/em64t/libmkl_scalapack_lp64.a /opt/intel/mkl/10.0.5.025/lib/em64t/libmkl_solver_lp64.a -Wl,--start-group -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_lp64 -Wl,--end-group -lguide -lpthread
In this case the jobs scale up to 3 nodes; beyond that, with more cores, the job ends in a segmentation fault. We suspect this problem is caused by MVAPICH2. Jobs with larger input parameters also fail to run on multiple nodes with an MPI communication error, which we likewise suspect is an MVAPICH2 issue.
With Open MPI, however, we suspect the MKL BLACS and ScaLAPACK libraries. Can you please help us resolve this issue?
Thanks
7 Replies
Quoting - sangamesh
Hi,
Siesta 2.0.2 (a Fortran 90 application) is compiled with Intel Fortran 10, MKL 10, and OpenMPI 1.3.3. The application is linked against the MKL LAPACK, BLAS, ScaLAPACK, and BLACS libraries. The link line is as follows:
MKLPATH=/opt/intel/mkl/10.0.5.025/lib/em64t
MKLPRL= -L$(MKLPATH) -lmkl $(MKLPATH)/libmkl_scalapack_lp64.a $(MKLPATH)/libmkl_solver_lp64.a -Wl,--start-group $(MKLPATH)/libmkl_intel_lp64.a $(MKLPATH)/libmkl_intel_thread.a $(MKLPATH)/libmkl_core.a $(MKLPATH)/libmkl_blacs_openmpi_lp64.a -Wl,--end-group -openmp -lpthread
Compilation is successful.
The Siesta jobs run well on a single node, with good performance scaling from 4 to 8 cores. But if the same job is run on multiple nodes it runs very slowly, taking 3-4 times as long as the single-node job. The same job run over gigabit Ethernet on three nodes also took about 4 times as long as the single-node run.
To cross-check this, we compiled Siesta 2.0.2 with MVAPICH2 using the following link line:
MKLP=-L/opt/intel/mkl/10.0.5.025/lib/em64t /opt/intel/mkl/10.0.5.025/lib/em64t/libmkl_scalapack_lp64.a /opt/intel/mkl/10.0.5.025/lib/em64t/libmkl_solver_lp64.a -Wl,--start-group -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_lp64 -Wl,--end-group -lguide -lpthread
In this case the jobs scale up to 3 nodes; beyond that, with more cores, the job ends in a segmentation fault. We suspect this problem is caused by MVAPICH2. Jobs with larger input parameters also fail to run on multiple nodes with an MPI communication error, which we likewise suspect is an MVAPICH2 issue.
With Open MPI, however, we suspect the MKL BLACS and ScaLAPACK libraries. Can you please help us resolve this issue?
Thanks
For MKL 10.0, which you are using, only the following versions of Open MPI are supported:
* Open MPI 1.1.2, 1.1.5, and 1.2, found at http://www.open-mpi.org
and for the latest MKL version (10.2 Update 2):
* Open MPI 1.2.x (http://www.open-mpi.org)
Please check the problem with these versions and, if any problem remains, let us know.
--Gennady
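The Open MPI version that is actually installed on the compute nodes can be confirmed with, for example:
ompi_info | grep "Open MPI:"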
Quoting - Gennady Fedorov (Intel)
For MKL 10.0, which you are using, only the following versions of Open MPI are supported:
* Open MPI 1.1.2, 1.1.5, and 1.2, found at http://www.open-mpi.org
and for the latest MKL version (10.2 Update 2):
* Open MPI 1.2.x (http://www.open-mpi.org)
Please check the problem with these versions and, if any problem remains, let us know.
--Gennady
As you mentioned, I tried to install Open MPI 1.2, but it failed with an openib error. So I have now switched to the other option: the latest MKL + Open MPI 1.2.x. In this case, Siesta fails with an MKL library linking error:
/opt/mpi/openmpi/1.2.8/intel/bin/mpif90 -o siesta
automatic_cell.o arw.o atomlwf.o bands.o bessph.o cgwf.o chkdim.o chkgmx.o chempot.o coceri.o constr.o coxmol.o cross.o denmat.o denmatlomem.o detover.o dfscf.o dhscf.o diagon.o digcel.o fft3d.o diagg.o diagk.o diagkp.o diag2g.o diag2k.o diagpol.o diagsprl.o dipole.o dismin.o dnaefs.o dot.o efield.o egandd.o ener3.o ener3lomem.o extrapol.o extrapolon.o fixed.o fsiesta.o gradient.o gradientlomem.o grdsam.o hsparse.o idiag.o initatom.o initdm.o inver.o iodm.o iohs.o iolwf.o iozm.o ipack.o iopipes.o kgrid.o kgridinit.o kinefsm.o ksv.o ksvinit.o kpoint_grid.o find_kgrid.o linpack.o madelung.o matel.o meshmatrix.o memory.o meshsubs.o m_check_supercell.o mulliken.o minvec.o naefs.o neighb.o m_non_collinear.o ordern.o outcell.o outcoor.o paste.o pdos.o pdosg.o pdosk.o pdoskp.o phirphi.o pixmol.o plcharge.o propor.o pulayx.o ranger.o ran3.o reclat.o redcel.o reinit.o reord.o rhoofd.o rhoofdsp.o rhooda.o savepsi.o shaper.o timer.o vmb.o vmat.o vmatsp.o volcel.o xc.o xijorb.o cellxc.o cdiag.o rdiag.o cgvc.o cgvc_zmatrix.o iocg.o ioeig.o iofa.o iokp.o iomd.o repol.o typecell.o ofc.o poison.o readsp.o radfft.o siesta.o io.o spin_init.o coor.o atm_transfer.o broadcast_basis.o eggbox.o dsyevds.o zheevds.o optical.o phirphi_opt.o reoptical.o transition_rate.o initparallel.o show_distribution.o setspatial.o setatomnodes.o uncell.o cart2frac.o obc.o precision.o sys.o m_cell.o recipes.o files.o spatial.o parallel.o parallelsubs.o parsing.o chemical.o xcmod.o atom.o atmparams.o m_mpi_utils.o m_fdf_global.o m_history.o m_iorho.o atmfuncs.o listsc.o memoryinfo.o m_memory.o numbvect.o sorting.o atomlist.o atm_types.o old_atmfuncs.o radial.o m_smearing.o alloc.o phonon.o spher_harm.o periodic_table.o version.o timestamp.o basis_types.o xml.o pseudopotential.o basis_specs.o basis_io.o onmod.o densematrix.o writewave.o on_subs.o fermid.o m_broyddj.o electrostatic.o mneighb.o globalise.o siesta_cmlsubs.o siesta_cml.o units.o zmatrix.o m_broyden_mixing.o forhar.o m_walltime.o m_wallclock.o m_iostruct.o nlefsm.o overfsm.o conjgr.o conjgr_old.o redata.o m_broyddj_nocomm.o broyden_optim.o ioxv.o dynamics.o md_out.o nag.o pxf.o libfdf.a libwxml.a libxmlparser.a libmpi_f90.a
/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_scalapack_ilp64.a /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_solver_ilp64.a -Wl,--start-group /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_intel_ilp64.a /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_intel_thread.a /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_core.a /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_blacs_openmpi_ilp64.a -Wl,--end-group -openmp -lpthread
/opt/intel/fce/10.1.008/lib/libimf.so: warning: warning: feupdateenv is not implemented and will always fail
/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_core.a(zunmtr.o): In function `mkl_lapack_zunmtr':
__tmp_zunmtr.f:(.text+0x57f): undefined reference to `mkl_lapack_zunmql'
__tmp_zunmtr.f:(.text+0x669): undefined reference to `mkl_lapack_zunmqr'
/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_core.a(zstedc.o): In function `mkl_lapack_zstedc':
__tmp_zstedc.f:(.text+0x8ae): undefined reference to `mkl_lapack_zlaed0'
/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_core.a(dsteqr.o): In function `mkl_lapack_dsteqr':
__tmp_dsteqr.f:(.text+0xe44): undefined reference to `mkl_lapack_dlasr3'
__tmp_dsteqr.f:(.text+0x1556): undefined reference to `mkl_lapack_dlasr3'
I got this link line from the Intel MKL Link Line Advisor. Please help me resolve this error.
Thanks,
Sangamesh
Hi,
I got it to compile successfully by manually adding two more (shared) libraries. The link line is now as follows:
-L/opt/intel/mkl/10.2.2.025/lib/em64t -lmkl_lapack -lmkl_core /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_scalapack_ilp64.a /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_solver_ilp64.a -Wl,--start-group /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_intel_ilp64.a /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_intel_thread.a /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_core.a /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_blacs_openmpi_ilp64.a -Wl,--end-group -openmp -lpthread
Even though the static library libmkl_core.a is present, is it still required to link the dynamic version of the same library, i.e. -lmkl_core?
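For what it is worth, one way to check whether an archive really defines the symbols the linker complained about is nm, for example:
# defined symbols are listed with 'T', unresolved references with 'U'
nm /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_core.a | grep -i zunmql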
Thanks
It compiled successfully, but it gives an error at runtime:
*** libmkl_mc.so *** failed with error : /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_mc.so: undefined symbol: mkl_dft_commit_descriptor_s_c2c_md_omp
*** libmkl_def.so *** failed with error : /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_def.so: undefined symbol: mkl_dft_commit_descriptor_s_c2c_md_omp
MKL FATAL ERROR: Cannot load neither libmkl_mc.so nor libmkl_def.so
*** libmkl_mc.so *** failed with error : /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_mc.so: undefined symbol: mkl_dft_commit_descriptor_s_c2c_md_omp
*** libmkl_def.so *** failed with error : /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_def.so: undefined symbol: mkl_dft_commit_descriptor_s_c2c_md_omp
MKL FATAL ERROR: Cannot load neither libmkl_mc.so nor libmkl_def.so
sta: 3.50369 21.66402 28.34401 2 3
siesta: 1.18878 22.99900 28.34777 2 4
siesta: 1.18768 41.05924 28.34921 1 14
initatomlists: Number of atoms, orbitals, and projectors: 14 166 210
*** libmkl_mc.so *** failed with error : /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_mc.so: undefined symbol: mkl_dft_commit_descriptor_s_c2c_md_omp
*** libmkl_def.so *** failed with error : /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_def.so: undefined symbol: mkl_dft_commit_descriptor_s_c2c_md_omp
MKL FATAL ERROR: Cannot load neither libmkl_mc.so nor libmkl_def.so
This looks like a problem within MKL itself. How can I resolve it?
How does it behave if you use only the .so libraries? As a first step, how about keeping the MKL threaded (OpenMP) libraries out, in case they conflict with the way you built Open MPI, particularly if you have a reason for not specifying libiomp? If you think you have a reason for mixing static and dynamic libraries, or multiple OpenMP libraries, you will have to take on the problem diagnosis yourself.
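As a sketch only (assuming your MKL 10.2 installation ships dynamic ScaLAPACK and BLACS libraries in the em64t directory, and picking the lp64 interface; ilp64 is appropriate only if the application is built with 8-byte default integers), a purely dynamic, sequential link would look roughly like:
MKLPATH=/opt/intel/mkl/10.2.2.025/lib/em64t
MKLPRL= -L$(MKLPATH) -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_openmpi_lp64 -lpthread
As far as I remember, the solver library is only provided as a static archive, so it is left out here, and $(MKLPATH) has to be on LD_LIBRARY_PATH at run time.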
Quoting - tim18
How does it behave if you use only the .so libraries? As a first step, how about keeping the MKL threaded (OpenMP) libraries out, in case they conflict with the way you built Open MPI, particularly if you have a reason for not specifying libiomp? If you think you have a reason for mixing static and dynamic libraries, or multiple OpenMP libraries, you will have to take on the problem diagnosis yourself.
If MKL threading (OpenMP) is not used, it does not compile:
/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_scalapack_ilp64.a /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_solver_ilp64_sequential.a -Wl,--start-group /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_intel_ilp64.a /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_sequential.a /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_core.a /opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_blacs_openmpi_ilp64.a -Wl,--end-group -lpthread
/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_core.a(xzhetrd.o): In function `mkl_lapack_xzhetrd':
__tmp_xzhetrd.f:(.text+0x447): undefined reference to `mkl_lapack_zlatrd'
__tmp_xzhetrd.f:(.text+0x903): undefined reference to `mkl_lapack_zlatrd'
/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_core.a(xdsytrd.o): In function `mkl_lapack_xdsytrd':
__tmp_xdsytrd.f:(.text+0x3ff): undefined reference to `mkl_lapack_dlatrd'
__tmp_xdsytrd.f:(.text+0x833): undefined reference to `mkl_lapack_dlatrd'
/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_core.a(zunmtr.o): In function `mkl_lapack_zunmtr':
__tmp_zunmtr.f:(.text+0x57f): undefined reference to `mkl_lapack_zunmql'
__tmp_zunmtr.f:(.text+0x669): undefined reference to `mkl_lapack_zunmqr'
/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_core.a(zstedc.o): In function `mkl_lapack_zstedc':
If I modify the link line by adding the two shared libraries,
MKLPRL=-L$(MKLPATH) -lmkl_lapack -lmkl_core $(MKLPATH)/libmkl_scalapack_ilp64.a $(MKLPATH)/libmkl_solver_ilp64_sequential.a $(MKLPATH)/libmkl_intel_ilp64.a $(MKLPATH)/libmkl_sequential.a $(MKLPATH)/libmkl_blacs_openmpi_ilp64.a -lpthread
it compiles fine, but at runtime it fails with the MKL FATAL ERROR shown in the previous post.
Regarding the libiomp5 library, did you mean that it should be linked explicitly, i.e. -liomp5?
I don't have a reason to mix static and dynamic libraries. Since it didn't compile with the link line provided by the MKL Link Line Advisor, as a trial I tried linking the static LAPACK library, i.e. libmkl_blas95_ilp64.a, which didn't work. Then I used the dynamic libraries, with which it compiled but failed at runtime.
Yes, -liomp5 makes a dynamic link to the OpenMP run-time support, which is required for the MKL threaded layer. If not all of the required functions are available in dynamic libraries for Open MPI, I suppose you should try using entirely static MKL libraries. Dynamic libiomp5 should be OK; never use a static libiomp5. If there are circular dependencies among static libraries that you didn't include inside the --start-group ... --end-group section, you could move them inside the group.
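For example, moving all of the MKL archives inside the group and linking the dynamic OpenMP runtime would look roughly like this (using the ilp64 names from your last post; substitute lp64 if that is what the rest of the build actually expects):
MKLPATH=/opt/intel/mkl/10.2.2.025/lib/em64t
MKLPRL= -Wl,--start-group $(MKLPATH)/libmkl_scalapack_ilp64.a $(MKLPATH)/libmkl_solver_ilp64.a $(MKLPATH)/libmkl_intel_ilp64.a $(MKLPATH)/libmkl_intel_thread.a $(MKLPATH)/libmkl_core.a $(MKLPATH)/libmkl_blacs_openmpi_ilp64.a -Wl,--end-group -liomp5 -lpthread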
I've been asking several organizations how they test Open MPI; their testing seems somewhat lacking, sometimes limited to a single node and possibly two nodes with a GbE connection. So I've run into problems myself in this area.
