Hello,
I'm evaluating the performance of IPP 7.0, which is giving us great improvements over our current (Blitz) baseline.
However, I'm seeing some strange IPP vs IPP benchmarking results (see below). Win32 is performing (substantially) better than x64.
Setup: VS2010, Core i7 2.0 GHz, MSVC optimization options enabled. Times are in seconds, for 2000 iterations.
What might cause this? Are the IPP 7.0 routines not optimized for x64?
Thanks,
- Chris
================================================================================
Conv. Type      Main     Filter   Type   WIN32  x64   Routines
2D              190x190  9x9      float  0.87   1.61  ippiConvFull_32f_C1R
                                  short  0.62   1.58  ippiConvFull_16s_C1R
Separable 2x1D  190x190  2 9x1    float  0.23   0.33  ippiFilterColumn_32f_C1R
                                                      ippiFilterRow_32f_C1R
                                  short  0.18   0.48  ippiFilterColumn_16u_C1R
                                                      ippiFilterRow_16u_C1R
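
For anyone wanting to reproduce timings like these, here is a rough sketch of a wall-clock timing loop. It is not the harness used for the numbers above. The names `kernel_fn`, `wall_seconds`, `time_kernel` and `count_calls` are made up for illustration, and `timespec_get` assumes a C11 runtime (on VS2010 something like QueryPerformanceCounter would have to stand in). The kernel is a placeholder for whichever IPP call is being measured.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: wraps whatever routine is being timed. */
typedef void (*kernel_fn)(void *ctx);

/* Wall-clock seconds via C11 timespec_get(). Wall-clock time (not CPU
   time) matters here, because a threaded library spreads work over cores. */
static double wall_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Runs the kernel once untimed as a warm-up, then `iterations` times,
   and returns the total elapsed wall-clock time in seconds. */
static double time_kernel(kernel_fn kernel, void *ctx, int iterations)
{
    int i;
    double start;

    assert(kernel != NULL && iterations > 0);
    kernel(ctx);                      /* warm-up: caches, lazy init */
    start = wall_seconds();
    for (i = 0; i < iterations; ++i)
        kernel(ctx);
    return wall_seconds() - start;
}

/* Stand-in kernel for illustration: just counts how often it is called. */
static void count_calls(void *ctx)
{
    ++*(int *)ctx;
}
```

Calling `time_kernel(count_calls, &calls, 2000)` mirrors the 2000-iteration runs above. To compare Win32 and x64, the same source would be built for both targets with the same IPP linkage model.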
Well, I had a long post written detailing how the timings were 3x-4x worse on x64 than Win32, but then I found the problem.
For Win32, I was linking with "default linking method".
For x64, I was linking with "single-threaded static library".
I changed x64 to "default linking method", and now it's essentially as fast as Win32.
For your copy test, you may want to look into _aligned_malloc(N, 32) (and _aligned_free), if you're not already doing so.
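
A minimal sketch of what I mean, in case it helps. The wrapper names `alloc_aligned`, `free_aligned` and `is_aligned` are made up. The `_MSC_VER` branch uses the MSVC calls mentioned above, and the fallback to C11 `aligned_alloc` is only there so the same code builds with other compilers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _MSC_VER
#include <malloc.h>   /* _aligned_malloc, _aligned_free */
#endif

/* Returns `size` bytes aligned to `alignment` (a power of two),
   or NULL on failure. Release with free_aligned(), not free(). */
static void *alloc_aligned(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
#ifdef _MSC_VER
    return _aligned_malloc(size, alignment);
#else
    /* C11 aligned_alloc wants size to be a multiple of alignment. */
    return aligned_alloc(alignment,
                         (size + alignment - 1) / alignment * alignment);
#endif
}

/* Frees a buffer obtained from alloc_aligned(). */
static void free_aligned(void *p)
{
#ifdef _MSC_VER
    _aligned_free(p);
#else
    free(p);
#endif
}

/* Nonzero if `p` sits on an `alignment`-byte boundary. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

For a 190x190 float image that would be `alloc_aligned(190 * 190 * sizeof(float), 32)`. Note this only aligns the start of the buffer; IPP's own `ippiMalloc_32f_C1` / `ippiFree` also pad each row and return the row step.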
- Chris
Thanks for the update. The broader question of whether 32-bit or 64-bit is faster involves many factors: hardware, cache size, registers, memory, problem size, and so on. As far as IPPI performance results go, we don't assume that 64-bit is always faster than 32-bit. Sometimes one is faster and sometimes the other, but in general they are about equally fast. If there is a big difference (i.e. above 20%), then it is worth investigating further.
IPP provides binary test tools for most IPPI functions; here is a guide article for your reference:
Using the performance tool to measure Intel IPP Function performance
Best Regards,
Ying