Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Problem running code using MPI: ncpu is always 1 (CentOS 7)

Michael_B1
Beginner

I'm running Parallel Studio XE 2019 Update 3, and when I run a program compiled with MPI, the number of CPUs is always set to 1.  It runs, but only on 1 CPU.  (mpirun also starts separate instances of the same code on the other CPUs, but those crash because they don't get the input file.)  This is a mixed Fortran and C++ code, but the MPI part is called from the C++ portion, so I think this is the correct forum.  I run the code on the same computer I compiled it on, with the command

mpirun -n 4 mycode < sphere.inp > sphere.out

This is a large code that has worked correctly on a different Linux (maybe Unix) system, when compiled using Intel Composer XE 2013.  The run command there had some additional flags that I don't think are relevant, but I give it below:

mpirun --bind-to-core --cpus-per-proc 1 -np 8 mycode -lmpi -lmpi < sphere.inp > sphere.out

Adding -lmpi once when I run it on the new system doesn't change the behavior.  If I add it twice, the code doesn't run at all.  The --bind-to-core and --cpus-per-proc flags aren't recognized on the new system.

Running which -a mpirun gives four identical lines (I guess because of the duplicated entries in my PATH, see below), and no other version of mpirun:

/opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/bin/mpirun
/opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/bin/mpirun
/opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/bin/mpirun
/opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/bin/mpirun

 

I've searched, and the answers given online always seem to be that there's another version of mpirun on the system, but I don't see how that could be the case here.  This is a brand new install of CentOS 7 (build 1810, so I think that's 7.6), made just so I can install Parallel Studio XE and compile this code.  I don't think CentOS comes with another MPI version by default, and the which command doesn't show any other version.
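A mismatch between launcher and library does not have to show up in PATH. If the executable was linked against a different MPI library than the one whose mpirun starts it, each process typically initializes as an independent job of size 1, which matches the symptom described here. As far as I know, the --bind-to-core and --cpus-per-proc options in the older command belong to Open MPI's mpirun, so the older build may have used a different MPI. The sketch below is one way to check, not a confirmed diagnosis. The helper name mpi_lines is hypothetical, and the library-name patterns are assumptions about common MPI builds rather than a complete list.

```shell
# Filter `ldd` output (read on stdin) down to the MPI-related libraries.
# Assumed name patterns: libmpi* (Intel MPI, MPICH, Open MPI) and
# libopen-rte / libopen-pal (Open MPI support libraries).
mpi_lines() {
    grep -i -E 'libmpi|libopen-rte|libopen-pal'
}

# Intended use (commented out because it needs the real binary and launcher):
#   ldd ./mycode | mpi_lines   # which MPI libraries does the binary load?
#   mpirun -V                  # which MPI does the launcher belong to?
#   mpirun -n 4 hostname       # does the launcher start four processes?
```

If the libraries listed for mycode do not resolve under the same /opt/intel/compilers_and_libraries_2019.3.199/linux/mpi tree as mpirun, rebuilding with that installation's compiler wrappers would be the next thing to try.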

Anyone got any ideas that I can try?

My .bashrc is given below, followed by my PATH.

# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

# User specific environment and startup programs
#
#  Add my personal bin directory.
#
PATH=$PATH:$HOME/bin

export PATH

source /opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/bin/mpivars.sh
source /opt/intel/compilers_and_libraries_2019/linux/bin/compilervars.sh intel64

Here's my PATH, with line breaks added for clarity. (Yes, some directories are listed two to four times.  I don't know why that is, but I don't think it will matter.)

/opt/intel/compilers_and_libraries_2019.3.199/linux/bin/intel64:
/opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/libfabric/bin:
/opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/bin:
/opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/libfabric/bin:
/opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/bin:
/opt/intel/compilers_and_libraries_2019.3.199/linux/bin/intel64:
/opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/libfabric/bin:
/opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/bin:
/opt/intel/debugger_2019/gdb/intel64/bin:
/opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/libfabric/bin:
/opt/intel/compilers_and_libraries_2019.3.199/linux/mpi/intel64/bin:
/usr/lib64/qt-3.3/bin:
/usr/local/bin:
/usr/local/sbin:
/usr/bin:
/usr/sbin:
/bin:
/sbin:
/home/zenbeam/bin:
/home/zenbeam/bin
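The repeated entries are harmless for command lookup, because the shell uses the first match. They most likely come from the two scripts sourced in .bashrc each prepending their directories, possibly more than once when shells are nested; that is an assumption about this setup, not something the listing proves. If a clean PATH is wanted, a minimal sketch follows. The helper name dedupe_path is hypothetical.

```shell
# dedupe_path LIST: print the colon-separated LIST with repeated entries
# removed, keeping the first occurrence of each so lookup order is unchanged.
# The body runs in a subshell, so the IFS change does not leak to the caller.
dedupe_path() (
    set -f          # no filename expansion while splitting
    IFS=:
    result=
    for dir in $1; do
        case ":$result:" in
            *":$dir:"*) ;;                          # already listed: skip
            *) result=${result:+$result:}$dir ;;    # first occurrence: keep
        esac
    done
    printf '%s\n' "$result"
)

# Intended use at the end of .bashrc, after the Intel scripts are sourced:
#   PATH=$(dedupe_path "$PATH"); export PATH
```

This only tidies the listing. It would not change which mpirun is found, since all four hits already point at the same file.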

11 Replies
Viet_H_Intel
Moderator

Would you be able to submit a ticket at http://www.intel.com/supporttickets so that an MPI expert can look at your case?

jimdempseyatthecove
Honored Contributor III

>> mpirun -n 4 mycode < sphere.inp > sphere.out

When you redirect an input file with <, only one process (normally rank 0) should read it.

The same goes for the output file redirected with >.

If (and only if) you require all processes to read/write these files, please consider using

mpirun -n 4 mycode sphere.inp sphere.out

Then use input args 1 and 2 as the filenames for the input and output files. However, as written without path qualification, the files would be resolved relative to each process's current directory (and current drive on Windows). Additionally, your code would need some means of coordinating the writes (e.g. a file lock during each write).

Jim Dempsey
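The point about unqualified file names can be handled on the command line by resolving them to absolute paths before they reach mpirun, so that every rank opens the same file whatever its working directory. A minimal sketch, assuming mycode were changed to take the two file names as arguments (the code in this thread reads standard input instead). The helper name abs_path is hypothetical.

```shell
# abs_path FILE: print FILE as an absolute path. The directory part must
# already exist; the file itself need not (useful for an output file).
# The body runs in a subshell, so the cd does not affect the caller.
abs_path() (
    dir=$(dirname "$1")
    base=$(basename "$1")
    cd "$dir" >/dev/null || exit 1
    printf '%s/%s\n' "$(pwd -P)" "$base"
)

# Intended use (hypothetical: requires mycode to accept file-name arguments):
#   mpirun -n 4 mycode "$(abs_path sphere.inp)" "$(abs_path sphere.out)"
```

This only settles where the files are found. Coordinating concurrent writes to the output file would still have to happen inside the program.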

Michael_B1
Beginner

Viet Hoang (Intel) wrote:

Would you be able to submit a ticket at http://www.intel.com/supporttickets so that an MPI expert can look at your case?

When I go to that link and sign in, or if I am already signed in and try to go that link, I get to a page that says 403: Access Forbidden.

Michael_B1
Beginner

jimdempseyatthecove (Blackbelt) wrote:

>> mpirun -n 4 mycode < sphere.inp > sphere.out

When you redirect an input file with <, only one process (normally rank 0) should read it.

The same goes for the output file redirected with >.

If (and only if) you require all processes to read/write these files, please consider using

mpirun -n 4 mycode sphere.inp sphere.out

Then use input args 1 and 2 as the filenames for the input and output files. However, as written without path qualification, the files would be resolved relative to each process's current directory (and current drive on Windows). Additionally, your code would need some means of coordinating the writes (e.g. a file lock during each write).

Jim Dempsey

Sending the input file to only one process is the correct behavior.  The MPI portion is called later (for a block matrix factorization).  This is a code I've compiled and run successfully on a different system, using the similar command I gave in my original post.  The main program is Fortran, not C++, although the MPI portion of the code is C++.

I did just now try running the code without the "<" and ">" redirects, and it doesn't run at all.  All four processes crash due to not having an input file, instead of only three crashing.

Viet_H_Intel
Moderator

I am not sure what happened at your end, but I don't have any issue signing in.

Michael_B1
Beginner

Viet Hoang (Intel) wrote:

I am not sure what happened at your end, but I don't have any issue signing in.

I don't know either.  I've tried both Firefox and Chrome on my Windows laptop, and Firefox on a different computer running Linux, and all have the same problem.

Actually, now that I look carefully at the URL of the 403: Access Forbidden page, it's complaining about my "nickname":

https://www.intel.com/content/www/us/en/forms/support-registration-form.html?ErrorCode=41&ErrorDescription=Execution+error&ErrorDetails=DML%3AInsert+failed.+First+exception+on+row+0%3B+first+error%3A+DUPLICATE_COMM_NICKNAME%2C+Duplicate+Nickname.%3Cbr%3EAnother+user+has+already+selected+this+nickname.%3Cbr%3EPlease+select+another.%3A+%5BCommunityNi...

 

I'm not sure if that's the "B, Michael" that's being displayed here, or the username they had me pick when I signed up, and then don't seem to use...  I'll try changing my usernames in my profile until I get it to work.  But putting the error description at the end of a URL, where we can't see it unless we explicitly look, is a weird way to notify us of what the problem is...

Michael_B1
Beginner

Well, I've gone to my profile and tried changing my name and also my Display Name several times each, but I still get the 403: Access Forbidden page.  I can't change the username that I log in with, as far as I know.  I'll try making another completely new account, to see if I can submit a ticket.

Michael_B_0520
Beginner

I made an entirely new account.  New username, different email address, different display name, and I still get the 403: Access Forbidden page.  It has the same stupid URL:

https://www.intel.com/content/www/us/en/forms/support-registration-form.html?ErrorCode=41&ErrorDescription=Execution+error&ErrorDetails=DML%3AInsert+failed.+First+exception+on+row+0%3B+first+error%3A+DUPLICATE_COMM_NICKNAME%2C+Duplicate+Nickname.%3Cbr%3EAnother+user+has+already+selected+this+nickname.%3Cbr%3EPlease+select+another.%3A+%5BCommunityNi...

 

How are we supposed to get any help if the Support Ticket page won't let us sign in???

Viet_H_Intel
Moderator

Can you try this?

- In the upper right corner, click Sign In.
- Click Create an Account.
- Another screen will appear where you fill out the registration form.
- Click Next Steps and agree to the legal terms; the customer account will then be created.

Viet_H_Intel
Moderator

Please follow the attached file to create an account.

Michael_B1
Beginner

Viet Hoang (Intel) wrote:

Can you try this?

- In the upper right corner, click Sign In.
- Click Create an Account.
- Another screen will appear where you fill out the registration form.
- Click Next Steps and agree to the legal terms; the customer account will then be created.

I made a third account, and that one was able to submit a ticket, which was promptly closed because that account doesn't have Parallel Studio XE:  "Our records indicate that you do not have a supported product associated with your account. You need a supported product in order to qualify for Priority Support. As such, this ticket is being closed."

For this account, it still dumps me to the 403: Access Forbidden page with the error message hidden in the URL, and doesn't give me any clue about what a nickname actually is, or how to change it.
