Intel® C++ Compiler

icc emits slow __builtin_ctzl assembly

neleai
Instead of simply emitting a single bsfq, icc generates the following sequence for __builtin_ctzl:

bsf %r9d, %r10d #201.8
bsf %r8d, %r11d #201.8
addl $32, %r10d #201.8
testl %r8d, %r8d #201.8
cmove %r10d, %r11d #201.8
movslq %r11d, %r11
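For readers who want the listing in C terms, here is a rough sketch of what the sequence appears to compute: it scans each 32-bit half separately and picks the high-half result plus 32 when the low half is zero. The helper name ctz64_split is hypothetical, and the mapping of %r8d to the low half and %r9d to the high half is my reading of the listing, not something confirmed by the compiler output. Unlike the assembly, which runs both bsf instructions and selects with cmove, the C version branches, because __builtin_ctz is undefined for a zero argument.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: count trailing zeros of a nonzero 64-bit value
   using only 32-bit scans, mirroring the two-bsf sequence above. */
static int ctz64_split(uint64_t x)
{
    assert(x != 0);                     /* undefined for 0, as with __builtin_ctzl */
    uint32_t lo = (uint32_t)x;          /* low half  (assumed to be %r8d) */
    uint32_t hi = (uint32_t)(x >> 32);  /* high half (assumed to be %r9d) */

    if (lo != 0)
        return __builtin_ctz(lo);       /* lowest set bit is in the low half */
    return 32 + __builtin_ctz(hi);      /* otherwise it is in the high half */
}
```

On x86-64 a single 64-bit bsfq (or tzcnt) gives the same result for nonzero input, which is what gcc emits here.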

In the following benchmark on an i5, the gcc build is faster (200 ms) than the icc build (270 ms):

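#include <stdio.h>

int main(void) {
    long i, sum = 0;
    /* sum + i is always positive here, so __builtin_ctzl never sees 0 */
    for (i = 1; i < 100000000; i++) {
        sum += __builtin_ctzl(sum + i);
    }
    printf("%ld\n", sum);  /* sum is a long, so use %ld */
    return 0;
}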