I am using the IPP libraries with dispatching and dynamic linking on a Xeon 5450 under Linux (Intel 64 package). Although the dispatcher should detect and use the appropriate library, I noticed when profiling my code (with VTune) that the libraries being used have the y8 extension. According to the getting started guide, this extension is for Intel Core 2 Duo processors. The guide mentions u8 as being for the Xeon 5100 family. However, that extension would not be appropriate for me either. As such, I was wondering which extension the dispatcher should normally choose for a Xeon 5450 (maybe there is a newer set of libraries with a new extension) and, if the dispatcher does not select it, whether I can specify it manually as a linker flag when compiling (i.e. -l...)?
I am asking because my code does not run much faster with the IPP libraries so far. When profiling my code, I notice that the functions taking the most time are the loops in the libippsy8.so library (used for ipps_crosscorr), along with a function called _kmp_wait_sleep from the libiomp5.so library. If anyone can tell me whether this function is called from any IPP functions, I would really appreciate it. I have no I/O in the main part of my program, so I don't know why this function would be called.
Any help would be very much appreciated.
3 Replies
Hello,
According to the Intel web site, the Xeon X5450 is a Core 2 processor (codenamed Harpertown), so I would expect IPP to dispatch the u8 libraries in a 64-bit environment and the v8 libraries in a 32-bit environment.
You can check which IPP libraries are dispatched on your system by running the IPP demo application (look at the Help -> About dialog, which should report the libraries that were dispatched).
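On Linux, the same check can also be made from inside the application, without the demo. The following is a minimal sketch, assuming a Linux system with /proc mounted; it lists the libipp* shared objects mapped into the running process, so the suffix (y8, u8, ...) of the loaded library is visible directly. The function names are made up for this illustration and are not part of IPP. IPP itself also offers version queries such as ippsGetLibVersion(), which is not used here so that the sketch compiles without the IPP headers.

```c
#include <stdio.h>
#include <string.h>

/* If maps_line (one line of /proc/<pid>/maps) names a file whose name starts
 * with "libipp", copy that file name into out and return 1; otherwise return 0.
 * Hypothetical helper for illustration, not an IPP function. */
int ipp_library_from_maps_line(const char *maps_line, char *out, size_t out_size)
{
    const char *slash = strrchr(maps_line, '/');
    const char *name;
    size_t len;

    if (slash == NULL || out_size == 0)
        return 0;
    name = slash + 1;
    if (strncmp(name, "libipp", 6) != 0)
        return 0;
    len = strcspn(name, " \n");   /* stop at the newline or a " (deleted)" tag */
    if (len >= out_size)
        len = out_size - 1;
    memcpy(out, name, len);
    out[len] = '\0';
    return 1;
}

/* Print each libipp* file mapped into the current process, skipping
 * consecutive repeats (one library usually has several mappings).
 * Returns the number of names printed, or -1 if /proc/self/maps
 * cannot be opened. */
int print_loaded_ipp_libraries(FILE *report)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[4096];
    char name[256];
    char last[256] = "";
    int count = 0;

    if (maps == NULL)
        return -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        if (ipp_library_from_maps_line(line, name, sizeof name) &&
            strcmp(name, last) != 0) {
            fprintf(report, "%s\n", name);
            strcpy(last, name);
            count++;
        }
    }
    fclose(maps);
    return count;
}
```

Calling print_loaded_ipp_libraries(stderr) after the first IPP call has run should show names such as libippsy8.so if that is what the dispatcher loaded.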
Regards,
Vladimir
Quoting - Vladimir Dudnik (Intel)
Hello,
According to the Intel web site, the Xeon X5450 is a Core 2 processor (codenamed Harpertown), so I would expect IPP to dispatch the u8 libraries in a 64-bit environment and the v8 libraries in a 32-bit environment.
You can check which IPP libraries are dispatched on your system by running the IPP demo application (look at the Help -> About dialog, which should report the libraries that were dispatched).
Regards,
Vladimir
Seeing as my dispatcher is choosing the y8 libraries, I forced the u8 libraries by linking with (-L /opt/intel/Compiler/11.0/083/ipp/em64t/sharedlib -lippsu8 -lippmu8 -L /opt/intel/Compiler/11.0/083/lib/intel64 -liomp5 -lm -lpthread) instead of (-L /opt/intel/Compiler/11.0/083/ipp/em64t/sharedlib -lippsem64t -lippmem64t -lippcoreem64t -L /opt/intel/Compiler/11.0/083/lib/intel64 -liomp5 -lm -lpthread). Yet I am not seeing any improvement in performance. Should I include any additional libraries? I am starting to think that something is wrong with the libiomp5 library, as my profiler shows time being spent in the _kmp_wait_sleep function, which belongs to that library. Any other suggestions?
Hello Calvin,
Try disabling multi-threading to see what happens.
Go here for some hints:
http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-threading-openmp-faq/
The simplest way to test is to add a call to ippSetNumThreads(1).
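To make the comparison concrete, the same call can be timed twice with a wall-clock timer: once as-is and once with IPP threading limited to a single thread. Below is a minimal sketch, assuming a POSIX system. The helper mean_seconds_per_call and the naive cross-correlation are made up for this illustration; the naive loop is only a stand-in workload (its lag convention is an assumption and need not match IPP's), to be swapped for the actual IPP call. Wall-clock time is used on purpose, since CPU time would add up the time of all worker threads. Time reported in the wait routine of libiomp5 is usually OpenMP worker threads idling rather than I/O, so it often shrinks or disappears in the single-thread run.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Run work(arg) `repeats` times and return the mean wall-clock seconds per
 * call, or -1.0 if repeats < 1. Hypothetical helper, not part of IPP. */
double mean_seconds_per_call(void (*work)(void *), void *arg, int repeats)
{
    struct timespec start, stop;
    int i;

    if (repeats < 1)
        return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < repeats; i++)
        work(arg);
    clock_gettime(CLOCK_MONOTONIC, &stop);
    return ((double)(stop.tv_sec - start.tv_sec) +
            (double)(stop.tv_nsec - start.tv_nsec) * 1e-9) / repeats;
}

/* Stand-in workload (assumption): out[lag] = sum over i of a[i] * b[i + lag],
 * for lag = 0 .. max_lag - 1, using only indices with i + lag < n. */
struct xcorr_job {
    const float *a;
    const float *b;
    int n;
    int max_lag;
    float *out;
};

void naive_crosscorr(void *arg)
{
    struct xcorr_job *job = (struct xcorr_job *)arg;
    int lag, i;

    for (lag = 0; lag < job->max_lag; lag++) {
        float sum = 0.0f;
        for (i = 0; i + lag < job->n; i++)
            sum += job->a[i] * job->b[i + lag];
        job->out[lag] = sum;
    }
}
```

In the real program, the callback would wrap the IPP cross-correlation call, and the measurement would be taken once before and once after the ippSetNumThreads(1) call. If the single-thread run is not slower, the threading overhead is a likely suspect.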
Paul