Support for Analyzers (Intel VTune™ Profiler, Intel Advisor, Intel Inspector)

What's going wrong with this code?


This is my first time using VTune, and I am trying to tune a fairly complex piece of C code.
All it does is basically "calculate x, add it to an unsigned char, and clip the result to 255", for a lot of pixels.
Because of the complex nature of the code, it's hardly possible to optimize it :-/

vTune tells me a lot of time is used for modifying the data itself:

READ/WRITE are basically pointer-access wrapper macros, and clip255 is a simple clipping function.

Any ideas why so many cycles are spent here?

Furthermore, is this really the assembly generated for the C code, or does VTune mix things up?
I can only read assembly a little, but clip255 should generate at least some kind of conditional operation like a cmov or a compare+jump, and I don't see anything like that in the code.
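
For illustration, here is a minimal sketch of why a clip to 255 does not necessarily show up as a cmov or a compare+jump. The function names are hypothetical and not taken from the code in question. Both versions return the same value for every int input; an optimizing compiler may turn the first form into something like the second, though what it actually emits depends on the compiler and its flags.

```c
/* Hypothetical clip helpers; the names are illustrative assumptions,
 * not the original clip255 from the profiled code. */

/* Straightforward form: compare, then select. */
static unsigned char clip255_select(int v)
{
    return (unsigned char)(v > 255 ? 255 : v);
}

/* Branch-free form: (v > 255) evaluates to 0 or 1, so -(v > 255)
 * is either 0 or all ones. OR-ing with all ones and truncating to
 * 8 bits gives 255; OR-ing with 0 leaves v unchanged. */
static unsigned char clip255_mask(int v)
{
    return (unsigned char)(v | -(v > 255));
}
```

If the generated code for clip255 looks like plain arithmetic, checking the disassembly of a small standalone function like these is one way to see which form the compiler picked.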

Thank you in advance, Clemens
1 Reply
Hi Clemens,

First, I wondered whether this was caused by compiler optimization options, but your assembly code seems to make sense.

Second, could it be caused by your function calls (WRITE and READ), which are implemented as macros? If so, the clockticks attributed to "addl -60(%ebp), %edx" would be incorrect.

Can this problem be reproduced on other function call statements?

I suggest you check whether the macros are the cause, or submit a new issue if you can provide a test case to us.
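
A test case for the macro question could look roughly like the sketch below. The original READ, WRITE and clip255 definitions were not posted, so the ones here are assumed stand-ins, and the function names add_clip_macro and add_clip_split are hypothetical. The idea is to write the same pixel update once through macros on a single source line and once as separate statements, then compare how the clockticks are distributed across lines in each variant.

```c
#include <stddef.h>

/* Assumed stand-ins for the pointer-access wrapper macros. */
#define READ(p, i)      ((p)[(i)])
#define WRITE(p, i, v)  ((p)[(i)] = (v))

/* Assumed clipping helper: limits the result to 255. */
static unsigned char clip255(int v)
{
    return (unsigned char)(v > 255 ? 255 : v);
}

/* Variant A: read, add, clip and write all expand on one source line. */
void add_clip_macro(unsigned char *px, const int *delta, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        WRITE(px, i, clip255(READ(px, i) + delta[i]));
}

/* Variant B: the same work split over separate source lines. */
void add_clip_split(unsigned char *px, const int *delta, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        int old = px[i];
        int sum = old + delta[i];
        px[i] = clip255(sum);
    }
}
```

If the hot spot moves between lines in variant B while the total time stays about the same, that would point to source-line attribution rather than to the instruction itself.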

Thanks, Peter