Hi,
We have a small cluster (a head node plus 4 compute nodes with 16 cores each) using Intel InfiniBand. The cluster runs CentOS 6.6 (with the CentOS 6.5 kernel).
Intel Parallel Studio XE 2015 is installed on this cluster, and I_MPI_FABRICS is set by default to "tmi" only.
When I start a job (via Torque + Maui) on several nodes, for example this one:
#!/bin/bash
#PBS -N IMB-MPI1_intelmpi
#PBS -l walltime=2:00:00
#PBS -l nodes=3:ppn=4
cd $PBS_O_WORKDIR
export I_MPI_FABRICS=tmi
mpirun IMB-MPI1
The job runs fine without any problem.
Now I start a job on a single node:
#!/bin/bash
#PBS -N IMB-MPI1_intelmpi
#PBS -l walltime=2:00:00
#PBS -l nodes=1:ppn=16
cd $PBS_O_WORKDIR
export I_MPI_FABRICS=tmi
mpirun IMB-MPI1
This job does not start and I get this message:
can't open /dev/ipath, network down
tmi fabric is not available and fallback fabric is not enabled
Is this normal?
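To narrow this down, I suppose I could check whether the PSM device is even present on the node, and raise the Intel MPI debug level to see which fabric actually gets selected (just a sketch; I_MPI_DEBUG is the standard Intel MPI debug variable):

ls -l /dev/ipath*     # the device the tmi/PSM provider tries to open
export I_MPI_DEBUG=5  # prints the selected fabric at startup
mpirun IMB-MPI1 PingPong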
If I instead set I_MPI_FABRICS=dapl as the default, I don't have this problem at all.
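As a stopgap, I could presumably also enable the fallback mechanism so the job runs even when tmi cannot initialize (a sketch, assuming Intel MPI 5.0's I_MPI_FALLBACK variable; I have not tried it yet):

export I_MPI_FABRICS=tmi
export I_MPI_FALLBACK=1   # fall back to the first available fabric if tmi fails
mpirun IMB-MPI1

But that feels like masking the issue rather than fixing it.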
How can I solve that?
Best regards,
Guillaume
If I set export I_MPI_FABRICS=tmi:tmi, the problem appears too.
If I set export I_MPI_FABRICS=shm:tmi, the problem also appears.
If I set export I_MPI_FABRICS=dapl:tmi, the problem does not appear.
Any ideas?