I am seeing performance issues with some SSE4.1 intrinsics, such as _mm_dp_ps and _mm_blend_ps.
When I move from my old combination of _mm_add_ps, _mm_mul_ps and _mm_shuffle_ps to _mm_dp_ps and _mm_blend_ps, the overall performance is about the same, or slightly worse.
This is despite the total instruction count going down in my example. Before, I had 18 multiplies, 13 adds, 1 movehl and 1 shuffle. Now I have 14 dot products, 3 adds, 6 blends and 2 shuffles.
This runs in a heavily repeated loop, and the instruction counts are per iteration. My accumulation covers 6 values, so I need to use two registers.
The tests were run on Linux, on Sandy Bridge (SNB), Nehalem (NHLM) and Westmere machines. They all show the same behavior.
Any help? Thanks.
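For readers following along, here is a minimal sketch of the kind of substitution described above, reduced to a single 4-element dot product. It is not the poster's code: the function names `dot4_sse` and `dot4_dpps` are hypothetical, and the `target("sse4.1")` attribute assumes GCC or Clang. One `dpps` replaces a multiply plus a movehl/shuffle/add reduction, but it is generally decoded into several micro-ops with a longer latency than a single multiply or add, so a lower instruction count does not by itself imply a faster loop.

```c
#include <smmintrin.h>  /* SSE4.1 (_mm_dp_ps); also pulls in the older SSE headers */

/* Older style: one multiply, then a movehl/shuffle/add horizontal reduction. */
static inline float dot4_sse(const float *a, const float *b)
{
    __m128 prod = _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b));
    __m128 hi   = _mm_movehl_ps(prod, prod);      /* [p2, p3, p2, p3] */
    __m128 sum2 = _mm_add_ps(prod, hi);           /* lane 0: p0+p2, lane 1: p1+p3 */
    __m128 odd  = _mm_shuffle_ps(sum2, sum2, _MM_SHUFFLE(1, 1, 1, 1));
    return _mm_cvtss_f32(_mm_add_ss(sum2, odd));  /* lane 0 now holds the total */
}

/* SSE4.1 style: a single dot-product instruction.
 * Mask 0xF1: multiply all four lanes (high nibble 0xF),
 * write the sum to lane 0 only (low nibble 0x1).
 * The target attribute is a GCC/Clang extension (an assumption here);
 * with other compilers, enable SSE4.1 through the compiler options instead. */
__attribute__((target("sse4.1")))
static inline float dot4_dpps(const float *a, const float *b)
{
    __m128 r = _mm_dp_ps(_mm_loadu_ps(a), _mm_loadu_ps(b), 0xF1);
    return _mm_cvtss_f32(r);
}
```

Both functions take pointers to four floats and return the same dot product, so timing them inside the real loop is a fair way to compare the two instruction mixes.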
2 Replies
Examine your code to see how the ports are utilized. Intel has a tool that attempts to do this. Although your new code "uses two registers", it may work faster using more registers, as you can often do additional work during latencies.
Jim Dempsey
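The point about using more registers to do work during latencies can be sketched as follows. This is an illustration only, not the original poster's loop: `hsum_ps`, `dot_one_acc` and `dot_two_acc` are hypothetical names, and the example is an array dot product that needs only baseline SSE. With one accumulator, each add depends on the previous one; with two, the chains are independent and can overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* baseline SSE intrinsics */

/* Reduce the four lanes of v to a single float. */
static inline float hsum_ps(__m128 v)
{
    __m128 hi   = _mm_movehl_ps(v, v);
    __m128 sum2 = _mm_add_ps(v, hi);
    __m128 odd  = _mm_shuffle_ps(sum2, sum2, _MM_SHUFFLE(1, 1, 1, 1));
    return _mm_cvtss_f32(_mm_add_ss(sum2, odd));
}

/* One accumulator: every add must wait for the previous add to finish. */
static inline float dot_one_acc(const float *a, const float *b, size_t n)
{
    assert(n % 4 == 0);  /* sketch only: no tail handling */
    __m128 acc = _mm_setzero_ps();
    for (size_t i = 0; i < n; i += 4)
        acc = _mm_add_ps(acc, _mm_mul_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
    return hsum_ps(acc);
}

/* Two accumulators: two independent dependency chains per iteration,
 * combined once after the loop. */
static inline float dot_two_acc(const float *a, const float *b, size_t n)
{
    assert(n % 8 == 0);  /* sketch only: no tail handling */
    __m128 acc0 = _mm_setzero_ps();
    __m128 acc1 = _mm_setzero_ps();
    for (size_t i = 0; i < n; i += 8) {
        acc0 = _mm_add_ps(acc0, _mm_mul_ps(_mm_loadu_ps(a + i),     _mm_loadu_ps(b + i)));
        acc1 = _mm_add_ps(acc1, _mm_mul_ps(_mm_loadu_ps(a + i + 4), _mm_loadu_ps(b + i + 4)));
    }
    return hsum_ps(_mm_add_ps(acc0, acc1));
}
```

Because floating-point addition is not associative, the two versions may differ in the last bits for general inputs. Whether the second version is actually faster depends on the machine and the surrounding code, so it should be measured.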
Thanks. Could you let me know the name of this Intel tool?