I'm attempting to use mpitune to get optimized IMPI environment settings for an application.
I ran it with the following command:
mpitune -d -hf $nodelist -od $pwd -avd min -pm hydra -a \"mpirun -ppn 16 -np 1001 ./myapplication\"
During the tuning, I got the following critical errors:
27'Dec'12 16:00:24 | MTWTN_0 : Starting range cycle...
27'Dec'12 16:00:24 | MTWTN_0 : Starting iteration cycle...
27'Dec'12 16:07:32 | MTWTN_0 : Complete.
27'Dec'12 16:07:32 | MTWTN_0 : No one iteration was completed successfully.
27'Dec'12 16:07:32 | MTWTN_0 : Ranges' loop finished.
27'Dec'12 16:07:32 | MTWTN_0 : List of threshold times is:
{
}
27'Dec'12 16:07:32 ERR | Error in thread 'MTWTN_0': list index out of range
27'Dec'12 16:07:32 CER | A critical error has occurred!
--
27'Dec'12 16:07:32 | MTWTN_1 : Starting range cycle...
27'Dec'12 16:07:32 | MTWTN_1 : Starting iteration cycle...
27'Dec'12 16:13:59 | MTWTN_1 : Complete.
27'Dec'12 16:13:59 | MTWTN_1 : No one iteration was completed successfully.
27'Dec'12 16:13:59 | MTWTN_1 : Ranges' loop finished.
27'Dec'12 16:13:59 | MTWTN_1 : List of threshold times is:
{
}
27'Dec'12 16:13:59 ERR | Error in thread 'MTWTN_1': list index out of range
27'Dec'12 16:13:59 CER | A critical error has occurred!
--
27'Dec'12 16:14:00 | MTWTN_2 : Starting range cycle...
27'Dec'12 16:14:00 | MTWTN_2 : Starting iteration cycle...
27'Dec'12 16:20:26 | MTWTN_2 : Complete.
27'Dec'12 16:20:26 | MTWTN_2 : No one iteration was completed successfully.
27'Dec'12 16:20:26 | MTWTN_2 : Ranges' loop finished.
27'Dec'12 16:20:26 | MTWTN_2 : List of threshold times is:
{
}
27'Dec'12 16:20:26 ERR | Error in thread 'MTWTN_2': list index out of range
27'Dec'12 16:20:26 CER | A critical error has occurred!
27'Dec'12 17:11:34 | Attention! No results have been obtained during current tuning process. It may be caused by:
- Tuning process has not been completed at all due to one of follow reasons:
* Time limitations
* Critical errors in process
* Abort of the process by user
* Other
Does anyone know what these errors refer to, or how to fix them? I believe they are the reason I was unable to get any usable output from mpitune.
Thanks!
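In case it helps anyone sift through a long -d log like the one above, here is a minimal sketch that pulls out only the WRN/ERR/CER lines. The line format is inferred from the excerpts in this thread, not from any mpitune documentation, and find_problems is a hypothetical name:

```python
import re

# Assumed layout, taken from the log excerpts in this thread:
#   27'Dec'12 16:07:32 ERR | message
# The three-letter level (DBG, WRN, ERR, CER, INF) is optional.
LINE_RE = re.compile(
    r"^(?P<stamp>\d{2}'[A-Za-z]{3}'\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]{3})?\s*\|\s*(?P<msg>.*)$"
)


def find_problems(log_text, levels=("WRN", "ERR", "CER")):
    """Return (timestamp, level, message) for each line at one of the given levels."""
    found = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("level") in levels:
            found.append(
                (match.group("stamp"), match.group("level"), match.group("msg"))
            )
    return found
```

Feeding it the full log text (for example, the contents of the log file read with open(...).read()) gives a short list that is easier to paste into a reply than the whole debug output.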
Hey Bryan,
Unfortunately, I haven't seen these errors before so I'll try to reproduce this from my side. First off, what version of the Intel MPI Library are you using? I would recommend upgrading to the latest Intel MPI 4.1 (you can grab it from the Intel Registration Center).
I see you're using application-specific tuning. Does this also happen when you do cluster-specific tuning? That will help narrow down whether the issue is in the mpitune script itself or in how mpitune calls your application. To do cluster-specific tuning, omit the -a flag and everything after it ("mpitune -d -hf $nodelist -od $pwd -avd min -pm hydra"). Let me know how that goes.
Also, what does your $nodelist file look like?
Looking forward to hearing back soon.
Regards,
~Gergana
Gergana,
Have you had a chance to look into this any further?
Thanks!
Hello,
I seem to be facing a similar problem. I tried using mpitune for the first time, and the same errors show up:
ERROR:root:code for hash md5 was not found.
Traceback (most recent call last):
File "/p/pdsd/Intel_MPI/Software/Python/python-2.7.2-linux-intel64-rhel5.7/lib/python2.7/hashlib.py", line 139, in <module>
File "/p/pdsd/Intel_MPI/Software/Python/python-2.7.2-linux-intel64-rhel5.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor
ValueError: unsupported hash type md5
ERROR:root:code for hash sha1 was not found.
Traceback (most recent call last):
File "/p/pdsd/Intel_MPI/Software/Python/python-2.7.2-linux-intel64-rhel5.7/lib/python2.7/hashlib.py", line 139, in <module>
File "/p/pdsd/Intel_MPI/Software/Python/python-2.7.2-linux-intel64-rhel5.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor
ValueError: unsupported hash type sha1
ERROR:root:code for hash sha224 was not found.
Traceback (most recent call last):
File "/p/pdsd/Intel_MPI/Software/Python/python-2.7.2-linux-intel64-rhel5.7/lib/python2.7/hashlib.py", line 139, in <module>
File "/p/pdsd/Intel_MPI/Software/Python/python-2.7.2-linux-intel64-rhel5.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor
ValueError: unsupported hash type sha224
ERROR:root:code for hash sha256 was not found.
Traceback (most recent call last):
File "/p/pdsd/Intel_MPI/Software/Python/python-2.7.2-linux-intel64-rhel5.7/lib/python2.7/hashlib.py", line 139, in <module>
File "/p/pdsd/Intel_MPI/Software/Python/python-2.7.2-linux-intel64-rhel5.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor
ValueError: unsupported hash type sha256
ERROR:root:code for hash sha384 was not found.
Traceback (most recent call last):
File "/p/pdsd/Intel_MPI/Software/Python/python-2.7.2-linux-intel64-rhel5.7/lib/python2.7/hashlib.py", line 139, in <module>
File "/p/pdsd/Intel_MPI/Software/Python/python-2.7.2-linux-intel64-rhel5.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor
ValueError: unsupported hash type sha384
ERROR:root:code for hash sha512 was not found.
Traceback (most recent call last):
File "/p/pdsd/Intel_MPI/Software/Python/python-2.7.2-linux-intel64-rhel5.7/lib/python2.7/hashlib.py", line 139, in <module>
File "/p/pdsd/Intel_MPI/Software/Python/python-2.7.2-linux-intel64-rhel5.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor
ValueError: unsupported hash type sha512
TERM environment variable not set.
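For what it's worth, hashlib messages like the ones above usually mean that this Python build cannot load its hash extension modules (commonly _hashlib, which links against the OpenSSL libraries), rather than anything specific to mpitune. A minimal sketch to confirm that, assuming it is run with the same interpreter that mpitune uses (the one under the path in the traceback); check_hashes is a hypothetical helper name, not part of mpitune:

```python
import hashlib


def check_hashes(names=("md5", "sha1", "sha224", "sha256", "sha384", "sha512")):
    """Return {algorithm name: True/False} for whether hashlib can construct it."""
    results = {}
    for name in names:
        try:
            # hashlib.new raises ValueError ("unsupported hash type ...")
            # when no implementation of the algorithm is available.
            hashlib.new(name, b"probe").hexdigest()
            results[name] = True
        except ValueError:
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in sorted(check_hashes().items()):
        print("%-7s %s" % (name, "ok" if ok else "MISSING"))
```

If every algorithm is reported as missing, the interpreter installation itself is the likely place to look (for example, extension modules that were not built or cannot find their shared libraries); that is an assumption based on how hashlib falls back, not something the log confirms.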
Here is the mpitune log. Note that I ran it much as was suggested here:
27'Nov'13 11:01:34 | MPITune started at 27 November'13 (Wednesday) 09:01:34
27'Nov'13 11:01:34 | MPITune has been started by: ariel
27'Nov'13 11:01:34 | Preparing tuner's components...
27'Nov'13 11:01:34 DBG | Session's ID is 1385542893
27'Nov'13 11:01:34 DBG | MPITuner has been executed by follow command: ' /usr/local/intel/impi/4.1.1.036/bin64/tune/mpitune -d -hf -od odr -avd min -pm hydra'
27'Nov'13 11:01:34 | Initialization of signals handlers...
27'Nov'13 11:01:34 | Start catching signal with code 15 (SIGTERM) ...
27'Nov'13 11:01:34 | Success.
27'Nov'13 11:01:34 | Start catching signal with code 2 (SIGINT) ...
27'Nov'13 11:01:34 | Success.
27'Nov'13 11:01:34 | Initialization of signals handlers completed.
27'Nov'13 11:01:34 DBG | Extracted tuner's executable part of run line: '/usr/local/intel/impi/4.1.1.036/bin64/tune/mpitune'
27'Nov'13 11:01:34 DBG | Parsed command line arguments' dictionary:
{
'avd' : 'min'
'hf' : ''
'pm' : 'hydra'
}
27'Nov'13 11:01:34 DBG | Initialization of configurator object...
27'Nov'13 11:01:34 WRN | Invalid default value ('<redacted>/config.xml') of argument ('config-file').
27'Nov'13 11:01:34 CER | Invalid default value ('<redacted>/options.xml') of argument ('options-file').
27'Nov'13 11:01:34 CER | A critical error has occurred!
Details:
--------------------------------------------------------------------------------
Type : <type 'exceptions.Exception'>
Value : Invalid default value ('<redacted>/options.xml') of argument ('options-file').
--------------------------------------------------------------------------------
27'Nov'13 11:01:34 | Time of work automatic tuning utility is 0h:0m:0s:19ms
27'Nov'13 11:01:34 CER | Error while terminating child processes. Description: 'NoneType' object has no attribute 'DestroyAllChildProcesses'
27'Nov'13 11:01:34 INF | Safe application's termination completed.
27'Nov'13 11:01:34 DBG | Deleting temp files...
27'Nov'13 11:01:34 DBG | Deleting temp files completed.
27'Nov'13 11:01:34 | Time of work automatic tuning utility is 0h:0m:0s:20ms
Seems like the error is related to creating the default config.xml and options.xml?
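One thing the log does show: the parsed arguments contain 'hf' : '', so -hf received no value, and -od does not appear in the dictionary at all. Whether that is related to the options.xml error is not clear from the log, but it is cheap to rule out. A minimal sketch that refuses to build the command when the host file or output directory is empty; build_mpitune_command is a hypothetical helper, and the flags are simply the ones used earlier in this thread:

```python
import os


def build_mpitune_command(hostfile, outdir):
    """Build the cluster-specific mpitune argument list used in this thread.

    Raises ValueError instead of silently passing an empty -hf or -od value.
    """
    if not hostfile:
        raise ValueError("host file path is empty; -hf would receive no value")
    if not os.path.isfile(hostfile):
        raise ValueError("host file not found: %r" % hostfile)
    if not outdir:
        raise ValueError("output directory is empty; -od would receive no value")
    return ["mpitune", "-d", "-hf", hostfile, "-od", outdir,
            "-avd", "min", "-pm", "hydra"]
```

The resulting list can be passed to subprocess.call. The same check is worth making by hand for shell variables such as $nodelist and $pwd in the original command, since an unset variable expands to nothing.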
I get exactly the same string of Python-related error messages as Brian C. (Quote #2, posted Fri, 12/28/2012 - 06:28) when I launch mpitune in application-specific mode.
Additionally, and perhaps related to these error messages, mpitune seems to refuse to launch the application on more than one node, irrespective of the hosts file and command line options that were specified to run on multiple nodes.
Have any fixes been identified at this point?