The code below performs very poorly when compiled with ICC.
I am observing a 2.3x slowdown (!) compared to GCC 8.2.
#include <stdint.h>

void demux1(const int8_t * const __restrict__ in,
            const int h,
            int8_t * const __restrict__ out)
{
    /* Copy every second byte of in (the even-indexed ones) into out. */
    for (int i = 0; i < h; ++i)
        out[i] = in[0 + 2 * i];
}
Have a look yourself on godbolt; the issue seems quite obvious,
especially when comparing the assembly produced by ICC and GCC
(flags: -march=core-avx2 -Ofast -DNDEBUG).
Any clue?
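For anyone who wants to reproduce the comparison locally rather than only reading the assembly, here is a minimal timing sketch. It assumes the loop was meant to store into `out[i]`, and `time_demux1` is a hypothetical helper name, not something from the original code. The empty `__asm__` statement is a GNU-style compiler barrier (accepted by GCC, Clang and ICC) that keeps the repeated calls from being optimized away; `clock()` reports CPU time, so treat the numbers as rough.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Kernel from the post, assuming the store was meant to go to out[i]. */
void demux1(const int8_t *const __restrict__ in, const int h,
            int8_t *const __restrict__ out)
{
    for (int i = 0; i < h; ++i)
        out[i] = in[0 + 2 * i];
}

/* Hypothetical helper: runs demux1 `reps` times producing h output bytes
 * (h > 0) and returns the elapsed CPU time in seconds, or -1.0 if the
 * buffers could not be allocated. */
double time_demux1(const int h, const int reps)
{
    int8_t *in = malloc((size_t)h * 2);
    int8_t *out = malloc((size_t)h);
    if (!in || !out) {
        free(in);
        free(out);
        return -1.0;
    }

    /* Fill the input with a simple repeating pattern. */
    for (int i = 0; i < 2 * h; ++i)
        in[i] = (int8_t)(i & 0x7f);

    const clock_t start = clock();
    for (int r = 0; r < reps; ++r) {
        demux1(in, h, out);
        /* Compiler barrier: forces the stores to out to be kept. */
        __asm__ volatile("" : : "r"(out) : "memory");
    }
    const clock_t stop = clock();

    /* Sanity check (compiled out when NDEBUG is defined). */
    for (int i = 0; i < h; ++i)
        assert(out[i] == in[2 * i]);

    free(in);
    free(out);
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling something like `time_demux1(1 << 20, 1000)` from a small `main` and building the same file with each compiler and the flags above should give comparable figures; the buffer size and repetition count here are arbitrary choices.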
2 Replies
This is a duplicate topic.
Yes. How can I remove this post?
