I am using the Intel Trace Collector. It works fine for small simulations, but for large runs I get the following error:
[20] Intel(R) Trace Collector ERROR: Failed writing buffer to flush file "/tmp/VT-flush-blueDetector_scalasca_itac_x86.rts_0020-008123.dat": No space left on device
From the Intel documentation I see that traces are written to /tmp by default and that we are supposed to set VT_LOGFILE_PREFIX. But even when I set this environment variable to a directory on the Lustre file system and pass the -x option to mpiexec, I still get the same error.
$ export VT_LOGFILE_PREFIX=/lustre/jhome7/jicg41/jicg4110/some_dir_path
$ LD_PRELOAD=/usr/local/intel/itac/8.1.2.033/itac/slib_impi4/libVT.so mpiexec -x -trace -np 48 ./app_exe
Note:
- With the above settings, only the first file, i.e. app__itac_x86.rts.prot, is written to the VT_LOGFILE_PREFIX directory.
- I am sure that the -x option exports all environment variables to all MPI processes; I have tested this.
Am I missing something?
Hey Pramod,
Thanks for posting. There are actually two sets of files written to different locations while the Intel® Trace Collector is running.
First, there are the trace files, which contain the actual trace data that you later read in the Intel® Trace Analyzer GUI. These files are controlled by the VT_LOGFILE_PREFIX environment variable, and their default location is the directory from which you started the job. They are generally written after your application's MPI_Finalize() call.
Second, there are temporary flush files. The Trace Collector writes these during execution of the application to hold intermediate data before the actual trace files are created. The flush files are controlled by the VT_FLUSH_PREFIX environment variable. In your case, you need to use this variable (and not VT_LOGFILE_PREFIX) to move them away from their default location (/tmp).
So your script will look like this:
$ export VT_FLUSH_PREFIX=/lustre/jhome7/jicg41/jicg4110/some_dir_path
$ LD_PRELOAD=/usr/local/intel/itac/8.1.2.033/itac/slib_impi4/libVT.so mpiexec -x -trace -np 48 ./app_exe
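Since the original failure was "No space left on device", it can be worth checking the flush directory before launching a long run. The following is only a minimal sketch, not part of the Trace Collector: the helper name `has_free_space` and the variables `FLUSH_DIR` and `NEED_KB` are made up for illustration, and the space threshold is something you would have to estimate for your own run.

```shell
#!/bin/sh
# Hypothetical helper (not an ITAC feature): succeed only if directory $1
# exists, is writable, and has at least $2 kilobytes available.
has_free_space() {
    dir=$1
    need_kb=$2
    [ -d "$dir" ] && [ -w "$dir" ] || return 1
    # df -P prints the stable POSIX layout; field 4 of line 2 is the
    # available space, reported in kilobytes because of -k.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$need_kb" ]
}

# Assumptions: point FLUSH_DIR at your scratch directory and set NEED_KB
# to a size that fits your run; both defaults are placeholders.
FLUSH_DIR=${FLUSH_DIR:-$PWD}
NEED_KB=${NEED_KB:-1}

if has_free_space "$FLUSH_DIR" "$NEED_KB"; then
    export VT_FLUSH_PREFIX="$FLUSH_DIR"
    echo "VT_FLUSH_PREFIX=$VT_FLUSH_PREFIX"
else
    echo "flush directory $FLUSH_DIR is missing, read-only, or too full" >&2
fi
```

If the check passes, the mpiexec line above can follow unchanged. The check runs on the launching node only, so it is most meaningful when the flush directory is on a shared file system.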
I hope this helps. Let me know how it goes.
Regards,
~Gergana
Perfect! Working fine now!
Thanks Gergana!
Glad to hear it :) Let me know how you like using the tool.
Regards,
~Gergana
Looking at small trace files (hundreds of MBs) works fine.
For traces of up to a few gigabytes, charts->event timeline took 10-20 minutes (the options were simply disabled, and there is NO indication of whether the tool is preparing the charts. It would be nice to have some indication!)
My actual simulation generates ~150 GB of traces, and the Trace Analyzer seems to take a very long time to prepare the timeline (the timeline options are disabled, again with no indication!)
I know these are very large traces, and I am already working on reducing the trace size from my simulation.
-Pramod
Hey Pramod,
Thanks for the feedback! We have actually added a progress bar in the latest Intel® Trace Analyzer and Collector 8.1 Update 3 release. I believe you have Update 2. Just look at the attached image and note the oval highlighted in purple.
Since you have a valid license, I'd urge you to upgrade.
Also, if you need any advice on applying filters to reduce the trace file size, you can take a look at the following Intel® Trace Collector filtering article.
Finally, to reduce some of your startup time, you can pre-create the cache file for your application's trace file in a separate step. The Trace Analyzer uses that cache in subsequent runs to reduce the startup cost. Here's a quick example:
traceanalyzer --cli trace.stf -c0 -w
Once complete, you'll see a trace.stf.cache file created alongside your original trace.stf. Then open the Trace Analyzer GUI as you normally would. The GUI will pick up the cache file automatically. More info is available in the CLI section of the Intel® Trace Analyzer Reference Manual.
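For batch use, the cache step can be wrapped so that it only runs when no cache file exists yet. This is just a sketch under a few assumptions: the helper name `needs_cache` and the `TRACE` variable are made up for illustration, the cache is assumed to be named `<trace>.cache` as described above, and the script does not detect a cache that is older than its trace.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when trace file $1 has no "<trace>.cache"
# file next to it yet.
needs_cache() {
    [ ! -e "$1.cache" ]
}

TRACE=${TRACE:-trace.stf}   # assumption: path to your trace file

if [ ! -f "$TRACE" ]; then
    echo "no trace file at $TRACE" >&2
elif ! needs_cache "$TRACE"; then
    echo "cache already present: $TRACE.cache"
elif command -v traceanalyzer >/dev/null 2>&1; then
    # Same invocation as quoted above, with the trace path substituted.
    traceanalyzer --cli "$TRACE" -c0 -w
else
    echo "traceanalyzer not found in PATH" >&2
fi
```

Run it from a batch job right after the traced application finishes, so the cache is ready by the time you open the GUI.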
Hope this helps.
Regards,
~Gergana