I like the APS approach, but I have a number of jobs that have large IO times up front that I would like to ignore.
VTune/Amplifier XE has the ability to delay data collection -- from the command line I can include "-start-paused -resume-after 200" to automatically start data collection after 200 seconds.
It looks like it should be possible to hack the "aps.py" script to include these options, but I don't know much about Python. Is there an easier or better way to do this?
Currently, APS support for pause/resume is very limited. Start-paused mode is not supported even if the script is hacked: the perf per-process collection that APS is based on will complain that paused mode is not available. (BTW, starting with the 2018 BU1 build we moved from the "aps.sh" launcher script to an "aps" binary to reduce APS startup overhead.)
The only thing available now is MPI_Pcontrol support for the trace-based MPI and OpenMP metrics, described here: https://software.intel.com/en-us/aps-user-guide-region-control-with-mpi-pcontrol. This will not work for the HW-event-based part of the collection, though.
We plan to enable MPI_Pcontrol and itt_notify for all APS collection types, and will make sure the start-paused case is covered as well.
Thanks & Regards, Dmitry