Intel® C++ Compiler
Support and discussions for creating C++ code that runs on platforms based on Intel® processors.

Is there a performance difference between these two loops?

HD86
Novice

The following two code samples do the same thing, but is one of them more efficient than the other?

void foo(char *p) 
{
	while (*p != 0)
	{ 
		*p = 0; 
		++p;
	}
}

void foo(char *p)
{
	char *pBeginning = p;
	while (*p != 0) { ++p; }
	char *pEnd = p;
	p = pBeginning;
	while (p < pEnd) 
	{
		*p = 0;
		++p;
	}
}

 

I am thinking that the second sample may be more efficient, because the length of the second loop is known before it starts and the CPU can parallelize it. In the first sample parallelization is never possible because the length of the loop cannot be known in advance. Is this thinking correct?

1 Solution
jimdempseyatthecove
Black Belt

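Right. The second version helps not only because the trip count of the store loop is known before it starts, which makes it straightforward to vectorize, but also because the store loop no longer has to read the destination array.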
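To make that concrete: the second sample is, in effect, a length scan followed by a write-only fill. A minimal sketch of the same behavior using the standard library is below. The name foo_two_pass is made up for this illustration, and whether it actually beats the single loop depends on the compiler, the library implementation, and the string length, so it should be measured rather than assumed.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical name, used only for this illustration.
// Same observable behavior as the second sample in the question.
void foo_two_pass(char *p)
{
    std::size_t len = std::strlen(p);  // pass 1: read until the terminator is found
    std::memset(p, 0, len);            // pass 2: write-only fill of a known length
}
```

Both calls are usually heavily optimized in the runtime library, which is one reason the two-pass form can be competitive even though it walks the string twice.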

Knowing the length of the array seems to be a requirement for vectorizing the first loop, but it is not necessarily one.

loop:
   read a vector's worth of data
   if any lane is zero, exit loop
   write a vector's worth of zeros
end loop
zero the remaining lanes up to, but not including, the first lane that holds 0

Jim Dempsey

AbhishekD_Intel
Moderator

Hi,


Thanks for accepting the solution. We are no longer monitoring this thread. Please post a new thread if you have any issues.



Warm Regards,

Abhishek

