The ICL vectorizer seems to be very good, which makes me wonder whether it makes sense to use IPP (Integrated Performance Primitives) for simple tasks such as
for (int i=0; i<cnt; i++) dst[i] = src1[i] * src2[i];
I plan to use SSE2 as the baseline architecture and AVX as the dispatched code path.
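For reference, here is a minimal sketch of that loop in a form an auto-vectorizer can usually handle. It assumes single-precision data and non-overlapping arrays; the function name `mul_arrays` and the `restrict` qualifiers are my additions for illustration, not part of the original snippet. The IPP counterpart for this case would be `ippsMul_32f(src1, src2, dst, cnt)`.

```c
#include <stddef.h>

/* Element-wise product: dst[i] = src1[i] * src2[i].
   restrict tells the compiler the arrays do not overlap (an assumption
   of this sketch), so it can vectorize without runtime alias checks. */
void mul_arrays(float *restrict dst, const float *restrict src1,
                const float *restrict src2, size_t cnt)
{
    for (size_t i = 0; i < cnt; i++)
        dst[i] = src1[i] * src2[i];
}
```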
Performance library functions incur more startup overhead than inlined, compiler-vectorized loops, so they can be expected to be most competitive for long vectors.
For problems large enough to benefit from threaded parallelism, a performance library may be useful when compiling the loop for parallel execution isn't practical.
Are you saying that IPP also employs multithreaded processing for these operations?
There is an install option for the threaded IPP, which should decide at run time whether to engage multiple threads.
Using IPP is different in that code written with IPP automatically takes advantage of the CPU capabilities available at run time (including the vector extensions). This can save development time and maintenance cost, since you keep one optimized code path instead of creating multiple paths for the different streaming extensions, and it leaves room for scaling as well.
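To illustrate what maintaining multiple paths by hand involves, here is a rough sketch of manual run-time dispatch. All names (`mul_baseline`, `mul_avx`, `cpu_has_avx`, `select_mul`) are hypothetical, and the AVX function is only a stand-in with the same body; in a real build each path would be compiled with its own architecture flags. The CPU check relies on the GCC/Clang builtin `__builtin_cpu_supports` and falls back to the baseline elsewhere.

```c
#include <stddef.h>

typedef void (*mul_fn)(float *, const float *, const float *, size_t);

/* Baseline path; in a real build this would be compiled for SSE2. */
static void mul_baseline(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] * b[i];
}

/* Stand-in for a path compiled with AVX enabled (same body in this sketch). */
static void mul_avx(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] * b[i];
}

/* Returns nonzero if the CPU reports AVX support.
   Compilers or targets without the builtin always get the baseline. */
static int cpu_has_avx(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx") != 0;
#else
    return 0;
#endif
}

/* Pick the implementation once; callers keep and reuse the pointer. */
mul_fn select_mul(void)
{
    return cpu_has_avx() ? mul_avx : mul_baseline;
}
```

With a library that dispatches internally, the selection step and the per-architecture variants above are what you no longer have to write and maintain yourself.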
_Kittur
