I'm using Intel C++ Version 9.1, Build 20061103Z. I'm writing driver code and cannot link against any libraries. This simple code generates a call to __intel_fast_memcpy:
void copyFloat(float const *src, float *dst, int n) {
    for (int i=0; i!=n; ++i)
        dst[i] = src[i];
}
Is there a way to avoid generating the call to __intel_fast_memcpy?
Well, there are a few possibilities. Unfortunately, I couldn't find a simple way to do exactly what you're asking. If your concern is the need for dynamic libraries, you can build statically ('-static' on Linux/MacOS). That will pull what you need into the executable, so you won't have any runtime library dependencies. If the resulting code is too large for your needs, it gets more complicated. You could try replacing your loop with a call to __builtin_memcpy(). That seems to work in the trivial case, at least.
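To illustrate the __builtin_memcpy() suggestion, here is a minimal sketch of the original copyFloat rewritten around the builtin. It assumes a compiler that provides the GCC-compatible builtin (icc on Linux does). The early return for non-positive n is an addition for illustration, not part of the original code. Whether the builtin is expanded inline or still ends up as a library call depends on the compiler version and options, so check the generated assembly.

```cpp
// Sketch only: __builtin_memcpy is a GCC-compatible builtin. It may be
// expanded inline, or it may still be lowered to a library call, depending
// on the compiler, its version, and the options in effect.
void copyFloat(float const *src, float *dst, int n) {
    if (n <= 0)
        return;  // nothing to copy; also keeps the byte count from wrapping
    __builtin_memcpy(dst, src, n * sizeof(float));
}
```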
Another possibility is to enable the vectorizer with the appropriate -x[KWNPB] switch (see 'icc -help' for more details on -x options). Very likely the vectorizer will vectorize any similar loop, preventing them from being converted to memcpy's.
Do any of those work for you?
Dale
- Why are you doing memory copying (and even worse float values) in the driver?
- Why do you use Intel Compiler at all if you don't want your code to work fast?
- Why don't you call RtlCopyMemory() instead of writing such a function?
What exactly are you trying to do?!?
Hopefully I will never buy something that needs your driver...
I just re-discovered this _intel_fast_memcpy thing today.
void cpy_int(int* src, int* dst, int count)
{
    int i = 0;
    do
    {
        dst[i] = src[i];
    } while(++i < count);
}
I call it like this:
int a[4];
cpy_int(&a[0], &a[2], 2);
The above is translated into a call to _intel_fast_memcpy, which I don't want to see.
As you can see, we only copy 2 integers. A function call (with the register pushes and pops around it) is killing the performance.
So, how do I disable _intel_fast_memcpy code generation?
Thanks!
On Linux use -ffreestanding
Jennifer
You can add
#pragma loop count(2)
to notify the compiler that you want the loop optimized for a short trip count. However, if the count is always 2, it would be faster and shorter simply to write the copies out.
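For the fixed two-element case, writing the copy out might look like the sketch below. The name cpy_int_pair is hypothetical and only for illustration. With no loop left, there should be nothing for the compiler to recognize as a memcpy pattern, though the generated assembly is still worth checking.

```cpp
// Hypothetical helper: copies exactly two ints with plain assignments,
// so no copy loop is present for the compiler to turn into a memcpy call.
inline void cpy_int_pair(int const *src, int *dst) {
    dst[0] = src[0];
    dst[1] = src[1];
}
```

The earlier call cpy_int(&a[0], &a[2], 2) would then become cpy_int_pair(&a[0], &a[2]).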
Thanks a lot, Jennifer.
On a side note, this switch is global, so it kills _intel_fast_memcpy everywhere.
It would be ideal if Intel came up with a pragma to turn off individual memcpy translations, because _intel_fast_memcpy is great in the general case anyway.
Thanks again.
Actually, something like that already exists:
"#pragma optimization_level K". _intel_fast_memcpy will not be generated at /O1.
Try the following:
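// The pragma applies to the function that follows it: cpy_int is compiled at level 1.
#pragma optimization_level 1
void cpy_int(int* src, int* dst, int count)
{
    int i = 0;
    do {
        dst[i] = src[i];
    } while(++i < count);
}
// No pragma here: cpy_int2 is compiled at the command-line level (/O2) for comparison.
void cpy_int2(int* src, int* dst, int count)
{
    int i = 0;
    do {
        dst[i] = src[i];
    } while(++i < count);
}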
compile cmd: icl /O2 /c t.cpp
I just tried #pragma optimization_level 1, but it didn't work for me. Note that I tested it in a fairly complicated project, not a small file like the t.cpp above.
