I have a question about the meaning of the -data-limit option.
I'm using the command-line interface to profile the application with the --data-limit=0 option.
The problem is that the total size of the result directory is too big. I have to measure metrics in different environments many times, but I can't because of limited storage. The documentation says the default data limit is enough in the normal case, but I'm not sure that applies here, because the application I'm profiling is a PyTorch Python script training complex CNN models.
It has data preprocessing, data loading, training (forward & backward), and validation (forward only) code. With the default data limit, the measurement ends too early, and I don't think the result represents the overall characteristics of my Python script. Could you give me any tips for this situation?
+) I have one more question, about hardware event-based sampling in the Threading collection mode. The elapsed time in this mode is about 20% longer than in the other modes (Hotspots, Memory Access). Why does that happen? And when I write a report about this analysis, which elapsed time do you think will be meaningful to readers?
You can start a collection "paused" and resume it once the program reaches the point you're interested in:
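For example, from the command line (a sketch; the analysis type, result-directory name, and script name here are placeholders, so check the exact flags against your VTune version with `vtune -help collect`):

```shell
# Sketch: start a collection paused so that startup and preprocessing are not
# recorded; resume/stop later via the API or the GUI.
# Assumes the VTune CLI (`vtune`) is on PATH; `train.py` is a placeholder.
RESULT_DIR=./r_paused
if command -v vtune >/dev/null 2>&1; then
  # -start-paused: collection begins in the paused state.
  vtune -collect hotspots -start-paused -result-dir "$RESULT_DIR" -- python train.py
else
  echo "vtune not found; install Intel VTune Profiler first"
fi
```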
You can do this through the APIs or through the GUI, and you can start and stop collections the same way. These options may let you profile just the region you're interested in without collecting as much data.
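From inside a Python script, the pause/resume calls of the ITT notification API can bracket just the phase you care about. A minimal sketch, assuming Python bindings such as the `ittapi` package expose `pause()`/`resume()` (verify the exact names against whatever binding you install); the preprocessing and training bodies below are trivial stand-ins for your real code, and the no-op fallback keeps the script runnable without the profiler:

```python
# Sketch of API-controlled collection from a Python training script.
# `ittapi` and its pause()/resume() names are assumptions; check your binding.
try:
    import ittapi  # assumed Python binding over the ITT notification API
    pause, resume = ittapi.pause, ittapi.resume
except ImportError:
    pause = resume = lambda: None  # no-op fallback when not profiling

def train(num_steps=3):
    pause()                                  # keep preprocessing out of the result
    data = [x * 0.5 for x in range(10)]      # stand-in for expensive preprocessing
    resume()                                 # record only the training loop
    total = 0.0
    for step in range(num_steps):
        total += sum(d * d for d in data)    # stand-in for forward/backward work
    pause()                                  # stop recording before validation
    return total

print(train())
```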
The elapsed time can increase when the collection itself adds overhead. If you're only interested in the elapsed time as a performance measurement to report, it would be better to run a lighter-weight collection, like Hotspots. You can read more about overhead here: https://software.intel.com/en-us/vtune-help-minimizing-collection-overhead
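If the number you report is wall-clock elapsed time, one option is to measure it in an un-instrumented run with a plain timer, so collection overhead doesn't distort it at all. A minimal sketch (`workload` is a placeholder for one epoch of your script):

```python
# Report wall-clock elapsed time measured outside any profiler, so collection
# overhead (e.g. the ~20% seen under the Threading analysis) doesn't inflate it.
import time

def workload():
    # Stand-in for one training epoch.
    return sum(i * i for i in range(100_000))

start = time.perf_counter()
result = workload()
elapsed = time.perf_counter() - start
print(f"elapsed: {elapsed:.3f} s")
```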