Shakeri__Ali
Beginner

Fatal error in PMPI_Type_size: Invalid datatype, error stack:

I am trying to use the -trace flag to get an .stf output file for Trace Analyzer. I run my job using this script:

#!/bin/bash -l
#PBS -l nodes=2:ppn=40,walltime=00:10:00
#PBS -N GranularGas
#PBS -o granularjob.out -e granularjob.err

export MPIRUN=/apps/intel/ComposerXE2018/compilers_and_libraries_2018.2.199/linux/mpi/intel64/bin/mpirun
export CODEPATH=${HOME}/GranularGas/1.1_parallel_GranularGas/build
source /apps/intel/ComposerXE2018/itac/2018.2.020/intel64/bin/itacvars.sh

cd ${CODEPATH}
${MPIRUN} -trace ${CODEPATH}/GranularGas

After submitting my job, I get the following error:

Fatal error in PMPI_Type_size: Invalid datatype, error stack:
PMPI_Type_size(131): MPI_Type_size(INVALID DATATYPE) failed
PMPI_Type_size(76).: Invalid datatype

and I only get a ".prot" file. Where does this error come from? How can I fix it?

For more information: I am using the Intel compiler 18.0.2 and Intel MPI 20180125.

 

UPDATE:

I checked a simplified version of my code (with all unnecessary features turned off). However, I get the same error again.

Then I tried an oversimplified version of my code (with all MPI derived datatypes removed). Now I can generate the .stf file and open it with Trace Analyzer. Could the problem be with the derived datatypes (MPI_Type_create_struct and MPI_Type_create_subarray) that I use extensively?

For more information: my code runs correctly with the gcc and Intel compilers and with different MPI implementations.

3 Replies
Shakeri__Ali
Beginner

Update:

I tested mpirun -trace with a simplified C++ code that still contains the derived-datatype creation calls. However, I get the same error as before.

Then I did a test with an oversimplified code (which actually does nothing), and now I get the .stf file and can use the Trace Analyzer software.

It seems that there is a problem with my derived datatypes (MPI_Type_contiguous and MPI_Type_create_subarray).

James_T_Intel
Moderator

I apologize that this thread was missed.  This error should be resolved in Intel® Trace Analyzer and Collector 2019.

Shakeri__Ali
Beginner

The tool works with 4 processes but fails with 8 processes or more
I am now using Intel® Trace Analyzer and Collector 2019. I can use the tool when I have 4 processes. However, I still get the same error when I use 8 processes or more. My new jobfile is:

#!/bin/bash -l
#PBS -l nodes=1:ppn=40,walltime=00:10:00
#PBS -N Multi_optimization
#PBS -o granularjob.out -e granularjob.err
module load intelmpi/2019up02-intel
module load itac/2019up02
# This will run itacvars.sh and other scripts
source /apps/intel/ComposerXE2019/parallel_studio_xe_2019.1.053/bin/psxevars.sh
# jobs always start in $HOME
export CODEPATH=${HOME}/GranularGas/Base/Multi_core/build
cd ${CODEPATH}
# run
mpirun -trace -n 4 ${CODEPATH}/GranularGas

 

The above jobfile produces the .stf files, and I can open them with the Trace Analyzer software. However, when I use 8 processes, i.e., mpirun -trace -n 8, I get the same error as before:

Abort(671757827) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Type_size: Invalid datatype, error stack:
PMPI_Type_size(120): MPI_Type_size(INVALID DATATYPE) failed
PMPI_Type_size(69).: Invalid datatype
[cli_0]: readline failed
Abort(671757827) on node 7 (rank 7 in comm 0): Fatal error in PMPI_Type_size: Invalid datatype, error stack:
PMPI_Type_size(120): MPI_Type_size(INVALID DATATYPE) failed
PMPI_Type_size(69).: Invalid datatype

 

For more information, I am using the Intel compiler 2019, Intel MPI 2019, and Intel® Trace Analyzer and Collector 2019.

Question
How can I use the ITAC tool with 8 or more processes?
