Help: libiomp5md.dll takes 46.22% of individual work as reported by Visual Studio Profiler

I am developing a C++ algorithm with Visual Studio 2008 Team right now. The algorithm requires FFT/IFFT operations and was originally implemented in MATLAB. I have now loaded the IPP library into Visual Studio to do the FFT work.

The C++ program is written following the STL standard. It works, but I feel it runs very slowly, even slower than the MATLAB program. So I used the Visual Studio Team Profiling Tools to check the CPU time used by each module and function and to locate the bottleneck. To my surprise, I find that libiomp5md.dll itself takes 46.22% of the individual work (please see the attachment for the profiler's summary report), so it consumes a lot of CPU time. I don't know whether libiomp5md.dll is related to the Visual Studio Team Profiling Tools, or why it is so time-consuming. Could anybody please help me?

BTW, in terms of speed, is it a wise choice to use standard C++ instead of ANSI C to implement the algorithm? This is my first time writing an algorithm with the C++ standard library. The main reason is to avoid pointers, which are too powerful and may cause trouble.

Thank you in advance.


Functions Causing Most Work
Samples      %
  4,253  52.91
  4,238  52.72
  4,238  52.72
  4,196  52.20
  3,856  47.97

Functions Doing Most Individual Work
Name                                      Samples      %
libiomp5md.dll                              3,715  46.22
std::valarray::operator[](unsigned int)       633   7.88
std::_Construct(double *,double const &)      578   7.19
operator new(unsigned int,void *)             432   5.37
std::_Destroy(double *)                       417   5.19

Hello Johnson,

libiomp5md.dll is an OpenMP library DLL. This library implements multi-threading within the IPP library (and can be used by other applications as well to implement multi-threading). You can read a little more about OpenMP as it relates to the Intel IPP library by going to this IPP KB article:

When OpenMP implements multiple threads of execution it will, in general, attempt to consume all the available processor resources. So the fact that you see the OpenMP DLL consuming a large number of processor cycles is not unusual or alarming. It probably means that the compute portion of your algorithm is "working hard" to finish the job.

Regarding your question about using STL C++ versus just straight C, I'm afraid I don't know the answer because I'm not an STL user (I'm assuming you're referring to using vectors in the C++ Standard Template Library containers). It is possible that the way the vectors in your STL containers are implemented is causing undue cache misses when using the IPP FFT functions, but I don't know enough about how you are doing this or how those templates work to say for sure.
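One thing that can be checked on the C++ side: the std::_Construct, placement operator new and std::_Destroy rows in the profiler summary would be consistent with containers being created and destroyed repeatedly, for example once per block inside a loop. That is only a guess, since the code was not posted. A minimal sketch of the alternative, allocating the buffer once and handing its contiguous storage to a C-style routine, might look like the following. The names process_block and sum_processed are hypothetical, and process_block is just a stand-in for a library FFT call, not an IPP function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C-style library routine (such as an FFT call)
// that reads n doubles from src and writes n doubles to dst.
// Here it only doubles each value so the example stays self-contained.
void process_block(const double* src, double* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = 2.0 * src[i];
    }
}

// Processes consecutive blocks of length n from `signal` and returns the sum
// of all processed values. The output buffer is constructed once, outside the
// loop, instead of once per block.
double sum_processed(const std::vector<double>& signal, std::size_t n) {
    double total = 0.0;
    if (n == 0) {
        return total;                      // nothing to process
    }
    std::vector<double> out(n);            // single allocation, reused below
    const std::size_t frames = signal.size() / n;  // trailing partial block is ignored
    for (std::size_t f = 0; f < frames; ++f) {
        // std::vector storage is contiguous, so &v[k] can be passed
        // wherever a plain C array pointer is expected.
        process_block(&signal[f * n], &out[0], n);
        for (std::size_t i = 0; i < n; ++i) {
            total += out[i];
        }
    }
    return total;
}
```

The same &v[0] idiom applies to std::valarray, whose elements are also stored contiguously, so the containers can stay in the code without copying data element by element into a separate C array.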



Thank you very much, Paul. Your answer is very informative.