Talk to fellow users of Intel Analyzer tools (Intel VTune™ Profiler, Intel Advisor)

What's going wrong with this code?


This is the first time I have used VTune, and I am trying to tune a fairly complex piece of C code.
Basically, all it does is "calculate x, add it to an unsigned char, and clip the result to 255", for a lot of pixels.
Because of the complex nature of the code, it's hardly possible to optimize it :-/

VTune tells me that a lot of time is spent modifying the data itself:

READ/WRITE are basically pointer-access wrapper macros; clip255 is a simple clipping function.

Any ideas why so many cycles are spent here?

Furthermore, is this really the assembly generated for the C code, or does VTune mix things up?
I can only read a little assembly, but clip255 should generate at least some kind of conditional operation, such as a cmov or a compare+jump, and I don't see anything like that in the code.

Thank you in advance, Clemens
1 Reply
Hi Clemens,

First, I wonder whether this is caused by your compiler optimization options, although your assembly code does seem to make sense.
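As an illustration of why optimized assembly can lack both a cmov and a compare+jump: a clip to 255 can be written, or rewritten by an optimizing compiler, in branchless form. The sketch below is only an assumption about what such code could look like. The function names are hypothetical, and nothing here is taken from your source or from the actual compiler output.

```c
/* Hypothetical names; the real clip255 is not shown in this thread. */

/* Straightforward form: compilers often emit a compare plus a jump or cmov here. */
static unsigned clip255_branch(unsigned v)
{
    return v > 255u ? 255u : v;
}

/* Branchless form: (v > 255u) evaluates to 0 or 1, so the mask is either
   all zero bits or all one bits. OR-ing with an all-ones mask forces the
   low byte to 0xFF; OR-ing with zero leaves v unchanged. */
static unsigned char clip255_branchless(unsigned v)
{
    unsigned mask = 0u - (unsigned)(v > 255u);
    return (unsigned char)(v | mask);
}
```

If the compiler chose a sequence along these lines, the disassembly would contain only arithmetic and logic instructions around the clip, which may be what you are seeing.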

Second, could it be caused by your WRITE and READ calls, which are implemented as macros? If so, the Clockticks attributed to "addl -60(%ebp), %edx" may be incorrect.
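To make the macro point concrete, here is a minimal sketch of the kind of loop described in the question. READ, WRITE, clip255_sketch and add_clipped are hypothetical stand-ins, since the real definitions were not posted. Because the preprocessor expands macros textually, there is no call instruction, and the debug line information typically ties the whole load, add, clip and store sequence to the single source line where the macros are used.

```c
#include <stddef.h>

/* Hypothetical stand-ins: the thread does not show the real READ/WRITE macros. */
#define READ(p, i)     ((p)[(i)])
#define WRITE(p, i, v) ((p)[(i)] = (unsigned char)(v))

/* Upper clip only, as described in the question. */
static int clip255_sketch(int v)
{
    return v > 255 ? 255 : v;
}

/* x holds the per-pixel values to add; assumed non-negative in this sketch. */
static void add_clipped(unsigned char *pix, const int *x, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* After preprocessing, this is a plain load, add, clip and store,
           all carrying this one source line number. */
        WRITE(pix, i, clip255_sketch(READ(pix, i) + x[i]));
    }
}
```

With code shaped like this, samples for the memory accesses and the arithmetic would all be reported against the one statement, so a single hot instruction in the view does not necessarily identify which part of the line is expensive.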

Can this problem be reproduced on other function call statements?

I suggest you check whether the macros are the cause, or submit a new issue if you can provide a test case to us.

Thanks, Peter
