I can't get good performance with ippsFir_32f() past the point where it starts using FFTs internally (CORRECTION: that is, past the point where FFTs start getting processed in parallel at order 13). I get about 80% wait time and it's all caused by _kmp_launch_worker threads.
I've tried
- ippsSetNumThreads(1)
- kmp_set_blocksize(200) via dll import
Yet I still see multiple kmp threads in VTune, and overall CPU usage is about 75% across 4 cores. What could I be doing wrong here?
1 Reply
Hello,
Could you attach a simple file that shows how the function is used, along with some profiling results?
To address the high CPU usage, can you add the following API call to reduce the OpenMP block time?
kmp_set_blocktime(0)
Thanks,
Chao
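If calling kmp_set_blocktime() through a DLL import is awkward, one alternative is to set the equivalent environment variables before the OpenMP runtime is loaded. The sketch below is only an illustration: KMP_BLOCKTIME and OMP_NUM_THREADS are standard Intel OpenMP runtime variables, but the helper names set_env_or_report and configure_openmp_env are made up for this example, and it assumes the threaded IPP library reads these variables when its OpenMP runtime initializes, which has not been verified against this particular setup.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: set one environment variable and report failures.
 * setenv() is POSIX; on Windows the rough equivalent is _putenv_s(). */
static int set_env_or_report(const char *name, const char *value) {
    if (setenv(name, value, 1) != 0) {
        perror(name);
        return -1;
    }
    return 0;
}

/* Hypothetical helper: ask the OpenMP runtime to put worker threads to
 * sleep immediately after a parallel region (KMP_BLOCKTIME=0) and to use
 * a single thread (OMP_NUM_THREADS=1).
 * Assumption: this runs before the OpenMP runtime initializes, i.e.
 * before the first IPP call that uses threading. Returns 0 on success. */
int configure_openmp_env(void) {
    if (set_env_or_report("KMP_BLOCKTIME", "0") != 0) {
        return -1;
    }
    if (set_env_or_report("OMP_NUM_THREADS", "1") != 0) {
        return -1;
    }
    return 0;
}
```

Call configure_openmp_env() once at program start, before any IPP function runs. If the runtime has already started, these variables will most likely be ignored, and kmp_set_blocktime(0) is the call to use instead.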