Code:
Compile commands:
/O3 /Ot /arch:CORE-AVX2
#include <algorithm>
#include <cstdlib>   // std::rand
#include <ctime>
#include <iostream>

int main()
{
    // Generate data
    const unsigned arraySize = 32768;
    int data[arraySize];
    for (unsigned c = 0; c < arraySize; ++c)
        data[c] = std::rand() % 256;

    // !!! With this, the next loop runs faster.
    std::sort(data, data + arraySize);

    // Test
    clock_t start = clock();
    long long sum = 0;
    for (unsigned i = 0; i < 100000; ++i)
    {
        for (unsigned c = 0; c < arraySize; ++c)
        {   // Primary loop.
            if (data[c] >= 128)
                sum += data[c];
        }
    }
    double elapsedTime = static_cast<double>(clock() - start) / CLOCKS_PER_SEC;
    std::cout << elapsedTime << '\n';
    std::cout << "sum = " << sum << '\n';
}
Results (elapsed seconds):
icc: 0.02
icx: 0.22
Here is my post on Stack Overflow:
https://stackoverflow.com/questions/75023152/what-compiler-commands-can-be-used-to-make-gcc-and-icc-compile-programs-as-fast
I now know that icx lacks many of the optimizations that icc has.
Hi,
Thank you for posting in Intel Communities.
Could you please add the /Ox optimization flag to the command? It enables maximum optimizations. Please refer to the link below for more details.
When we run your code, we get better performance with the Intel oneAPI C++ Compiler (icx) than with the Intel C++ Classic Compiler (icl/icc).
Please refer to the screenshot below for the output.
Please let us know if you still face any issues.
Thanks and Regards,
Pendyala Sesha Srinivas
icx supports far fewer compiler options than icc. For example, it lacks
/Qinline-factor-
Another comparison:
#include <iostream>
#include <chrono>
using namespace std;

#define CHRONO_NOW chrono::high_resolution_clock::now()
#define CHRONO_DURATION(first,last) chrono::duration_cast<chrono::duration<double>>(last-first).count()

int fib(int n) {
    if (n < 2) return n;
    return fib(n-1) + fib(n-2);
}

int main() {
    auto t0 = CHRONO_NOW;
    cout << fib(45) << endl;
    cout << CHRONO_DURATION(t0, CHRONO_NOW) << endl;
    return 0;
}
Results (elapsed seconds):
icc: 0.32
icx: 2.35
Hi,
We were able to reproduce your issue. We are working on this issue internally.
We will get back to you soon.
Thanks and Regards,
Pendyala Sesha Srinivas
icc vs. icx comparisons on Compiler Explorer:
https://godbolt.org/z/jPchs3G3b
https://godbolt.org/z/svsE5dxa1
https://godbolt.org/z/zGced8ffn
https://godbolt.org/z/Yjvn9GGan
https://godbolt.org/z/4ja56WcT4
https://godbolt.org/z/WsKEeYnd1
https://godbolt.org/z/KqfKf6a4h
https://godbolt.org/z/hxd8PPjnr
https://godbolt.org/z/raq6hr9Ma
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <string>
#include <Windows.h>
#include <nmmintrin.h>

typedef struct _integrity_check
{
    struct section {
        std::uint8_t* name = {};
        void* address = {};
        std::uint32_t checksum = {};

        // Two sections compare equal when their checksums match.
        bool operator==(const section& other) const
        {
            return checksum == other.checksum;
        }
    }; section _cached;

    _integrity_check()
    {
        _cached = get_text_section(reinterpret_cast<std::uintptr_t>(GetModuleHandle(nullptr)));
    }

    // Note: each byte is zero-extended and fed through the 32-bit CRC step,
    // as in the original post; _mm_crc32_u8 is the byte-wise intrinsic.
    std::uint32_t crc32(void* data, std::size_t size)
    {
        std::uint32_t result = {};
        for (std::size_t index = {}; index < size; ++index)
            result = _mm_crc32_u32(result, reinterpret_cast<std::uint8_t*>(data)[index]);
        return result;
    }

    // Walks the PE section headers of the module and checksums .text.
    section get_text_section(std::uintptr_t module)
    {
        section text_section = {};
        PIMAGE_DOS_HEADER dosheader = reinterpret_cast<PIMAGE_DOS_HEADER>(module);
        PIMAGE_NT_HEADERS nt_headers = reinterpret_cast<PIMAGE_NT_HEADERS>(module + dosheader->e_lfanew);
        PIMAGE_SECTION_HEADER section = IMAGE_FIRST_SECTION(nt_headers);
        for (int i = 0; i < nt_headers->FileHeader.NumberOfSections; i++, section++)
        {
            std::string name(reinterpret_cast<char const*>(section->Name));
            if (name != ".text")
                continue;
            void* address = reinterpret_cast<void*>(module + section->VirtualAddress);
            text_section = { section->Name, address, crc32(address, section->Misc.VirtualSize) };
        }
        return text_section;
    }

    /// <summary>
    /// Checks .text integrity.
    /// </summary>
    /// <returns>Returns true if it has been changed.</returns>
    bool check_integrity()
    {
        section section2 = get_text_section(reinterpret_cast<std::uintptr_t>(GetModuleHandle(nullptr)));
        return (!(_cached == section2));
    }
};

int main() {
    _integrity_check check;
    for (;;) {
        std::cout << std::boolalpha << check.check_integrity() << std::endl;
    }
}
Hi,
Thank you for your patience. The issue you raised is targeted to be fixed in the oneAPI 2024.0 release, which will be available in the coming months.
The icpx compiler provides better performance than the icpc compiler.
If the issue persists with the new release, you can start a new discussion for the community to investigate.
Thanks and Regards,
Pendyala Sesha Srinivas