Intel® C++ Compiler

Unexpected behavior due to compiler optimization

Constantin_Christman
Hello,

I have a simple multithreaded C++ application consisting of three threads (I use the SDL library for threading).
Here is the code of one of the threads:

[bash]int token1;
bool stop_threads;

...

int load_frame(void*)
{
	int k=0;

	while (!stop_threads) {
		
		if (token1 != 0) continue; 

		blit_surface(movie_surface, ghost_frames, 0, 0);		

		if (++k >= NUM_GHOST_FRAMES) k = 0;		

		token1 = 1;
	}

    return 0;
}
...[/bash]


The loop runs until stop_threads becomes true. On each iteration it tests token1; if token1 is zero, it copies movie_surface into the surface buffer ghost_frames and then sets token1 to 1.

In debug mode my application runs as expected. However, in release mode with /Qipo the while loop is executed only once and the thread terminates immediately.

If I disable IPO, the application works as expected in the release build. However, because many reasonable optimizations are then no longer performed, the overall performance of the application clearly suffers.

When defining the token as

volatile int token1;

then the application works as expected in the release build with /Qipo enabled. However, I do not understand why the variable token1 has an impact on the execution of the while loop.

When I look at the report produced by /Qopt-report:3, I have no idea where to find out what the compiler did to the while loop in order to optimize it. It is a lot of text, so I will not post it here. Maybe someone can tell me where exactly to look in the report?

In addition, what is the right way to handle such a problem? Disabling IPO is not a solution, but maybe there is a way to tell the compiler (perhaps with a pragma) not to optimize a specific code section? Or is using volatile a better solution? Any other ideas?

Thanks in advance!
Constantin
5 Replies
jimdempseyatthecove
Honored Contributor III
Use volatile on stop_threads as well as token1, since these are apparently shared variables. Without volatile, compiler optimizations might copy the variables into registers for use in the control flow of the function. Once these variables have been copied into registers, the remaining code of the function will not see changes made to the shared variables by other threads.
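
To make the register-caching point concrete, here is a rough single-threaded sketch of what an optimizer is allowed to assume when the shared variables are plain (neither volatile nor atomic). It is not taken from the Intel optimization report, and whether Intel C++ 11.1 actually did this under /Qipo is not established in this thread. The names blit_count, run_as_written and run_as_optimizer_may_see_it are made up for illustration, and a bounded iteration count stands in for stop_threads so that the example terminates.

```cpp
#include <cassert>

// Hypothetical stand-ins for the variables in the original post.
static int token1 = 0;
static int blit_count = 0;  // counts how often the "blit" would have run

// The loop as written, with a bounded iteration count replacing stop_threads.
int run_as_written(int iterations) {
    token1 = 0;
    blit_count = 0;
    for (int i = 0; i < iterations; ++i) {
        if (token1 != 0) continue;  // nothing in THIS thread ever resets token1
        ++blit_count;               // stands in for blit_surface(...)
        token1 = 1;
    }
    return blit_count;
}

// What the loop reduces to if only this thread's view is considered:
// after the first pass token1 is 1 "forever", so the work happens once.
int run_as_optimizer_may_see_it(int iterations) {
    token1 = 0;
    blit_count = 0;
    if (iterations > 0) {
        ++blit_count;
        token1 = 1;
    }
    return blit_count;
}

// Both versions are indistinguishable when no other thread is considered.
void check_same_result() {
    assert(run_as_written(1000) == run_as_optimizer_may_see_it(1000));
}
```

Seen from a single thread, the two functions behave identically, so a compiler that does not know about the other threads may treat the work in the loop body as something that happens only once. Declaring the variables volatile (or atomic) tells it that the values can change behind its back.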

You might also consider using an atomic class for these shared variables.
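
A minimal sketch of the atomic approach, under some assumptions: the post does not say which atomic class is meant, so this uses C++11 std::atomic, which may not be available with Intel C++ 11.1. It also uses std::thread instead of SDL threads to stay self-contained. The counter frames_loaded and the driver run_handoff_demo are hypothetical and only stand in for the real blit and the consuming thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Shared state, named after the variables in the original post.
std::atomic<bool> stop_threads(false);
std::atomic<int>  token1(0);

// Hypothetical stand-in for the blit; written only while token1 == 0.
int frames_loaded = 0;

// Loader thread: does one unit of work whenever it owns the token.
void load_frame() {
    while (!stop_threads.load(std::memory_order_acquire)) {
        if (token1.load(std::memory_order_acquire) != 0) {
            std::this_thread::yield();  // token is with the consumer
            continue;
        }
        ++frames_loaded;                             // stands in for blit_surface(...)
        token1.store(1, std::memory_order_release);  // hand the frame over
    }
}

// Hypothetical driver playing the consumer; returns the number of frames loaded.
int run_handoff_demo(int frames) {
    assert(frames >= 1);  // the frame count is only deterministic for frames >= 1
    stop_threads.store(false);
    token1.store(0);
    frames_loaded = 0;

    std::thread loader(load_frame);
    for (int i = 0; i < frames; ++i) {
        while (token1.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();  // wait for the next frame
        }
        if (i + 1 < frames) {
            token1.store(0, std::memory_order_release);  // ask for another frame
        }
    }
    // Keep the token after the last frame and ask the loader to stop.
    stop_threads.store(true, std::memory_order_release);
    loader.join();
    return frames_loaded;  // safe to read: the loader has finished
}
```

Calling run_handoff_demo(5) from main should load exactly five frames. Because every access to stop_threads and token1 goes through an atomic load or store, the compiler cannot keep a stale copy in a register, and the release/acquire pairs also make the loader's writes visible to the consumer.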

Jim Dempsey
Constantin_Christman
Thanks for your reply, Jim.

I understand that it might be important to use volatile for stop_threads as well, but this does not explain the behavior of the application in release mode with compiler optimization. If an old value of stop_threads is kept in a register, then the threads might run longer, but in my program the whole loop seems to be removed by the optimization.

I am searching for the reason for this rigorous optimization and for a way to see in the optimization report what has actually happened.

Constantin
JenniferJ
Moderator

Did you see any warnings at compile time?
Can you share some more code showing how the threads are created? Maybe I can create a test case based on it. Or, if possible, please create a test that contains this code.

Also, what version of the Intel compiler are you using? Try /Qdiag-enable:thread (Windows) or -diag-enable thread (Linux) to see if it emits more diagnostics.

Jennifer

Constantin_Christman
Hi Jennifer,

No, there is no warning at compile time.

I use the Intel C++ Compiler 11.1.

When enabling /Qdiag-enable:thread I get a lot of warnings like the following:
[bash]warning #1710: reference to statically allocated variable "stop_threads"
 	while (!stop_threads) {
 	        ^
warning #1710: reference to statically allocated variable "token1"
		if (token1 != 0) continue; [/bash]

However, I was not able to create a small test case that reproduces the problem. In fact, now even my application doesn't show the effect any more: one of the minor program modifications I made yesterday now prevents the over-optimization described above.

Out of interest, can you tell me how to find out which optimizations were performed by the compiler? Is /Qopt-report:3 the right switch for this?
Maybe you can point me to a tutorial or documentation on how to read this report?

Thanks
Constantin


jimdempseyatthecove
Honored Contributor III
If stop_threads gets registerized, then your while (!stop_threads) loop may run forever (or never).

volatile will not always fix your coding problems; you may also require memory fences and/or memory barriers.

The atomic class is one way to aid in getting the correct behavior.
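
As a rough illustration of what a memory fence adds on top of a flag that is merely re-read on every iteration, here is a small sketch that publishes plain data through a flag. It assumes C++11 (std::atomic_thread_fence and std::thread), which is newer than the compiler discussed in this thread, and the names payload, ready and run_fence_demo are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain, non-atomic data (hypothetical)
std::atomic<bool> ready(false);  // flag that announces the payload

// Writer: fill in the data, then raise the flag.
void producer() {
    payload = 42;
    // Release fence: the write to payload may not be reordered
    // after the store to ready that follows.
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

// Reader: wait for the flag, then read the data.
int consumer() {
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    // Acquire fence: the read of payload may not be reordered
    // before the load of ready that saw true.
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload;
}

// Hypothetical driver: runs the producer in a second thread.
int run_fence_demo() {
    payload = 0;
    ready.store(false);
    std::thread writer(producer);
    int seen = consumer();
    writer.join();
    assert(ready.load());  // the flag must be set once the writer is done
    return seen;
}
```

Here run_fence_demo() returns 42. Without the two fences (or equivalent release/acquire orderings on the flag itself), the reader could see the flag set and still read a stale payload, even though the flag is reloaded on every iteration. That is the gap volatile alone does not close.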

Jim Dempsey