I'm running example code from the Intel MKL directory /opt/intel/composer_xe_2013_sp1.0.080/mkl/examples/spblasc with the following timing code:
double time_st = dsecnd();
// function call
double time_end = dsecnd();
double time_avg = (time_end - time_st)/LOOP_COUNT;
printf("Average time: %e secs\n", time_avg);
I get different results on every run.
Can anyone explain why this is happening and how to overcome it?
What are your problem size and loop count, and what are your OS and hardware type?
This article may help:
Current operating systems run multiple tasks almost all the time, and CPUs execute instructions out of order. Many factors therefore contribute to the measured performance of an Intel MKL subroutine, such as problem size, memory size, parallelism, cache state, branch prediction logic, and OS resource scheduling. That is why we usually use statistical numbers as performance measures. The article also gives a tip: ignore the time required by the first Intel MKL call.
As @Ying H wrote in his response, many factors can contribute to variation in function timing, both between consecutive measurements and between consecutive test runs.
For example, between two test runs, the main thread that runs the timed function could have been swapped out by a higher-priority thread for a varying amount of time.
A single measurement is not meaningful. In this case, statistics can help. For example, you can take a group of consecutive measurements, discard the maximum and the minimum, and calculate the standard deviation of the remaining values. If the standard deviation is below a chosen threshold, take the average of those values as the run time of the program.
Another thing you might need to consider is how busy the computer is when you take the measurement. For the same program, the run time may be much higher when the computer is very busy with other tasks than when it is idle. The question then is how to compare the results of the two measurements. So you need a benchmark if you want to compare the results.
Hi, thank you all for the excellent explanations.
Is there any effect on the timing if I clear the cache memory each time using
the command: sync; echo 3 | sudo tee /proc/sys/vm/drop_caches ?
If you are on Windows, you can raise the thread's execution priority, even to Real-Time, for the duration of the timing. This reduces how often your thread is context-switched, which should help you get more accurate timing measurements.