kim__seongik

Comparison with theoretical peak performance of CPU

Hello!

 

I ran some tests on an Intel Xeon Silver 4216 @ 2.10 GHz.

I used only one core, to compare against the theoretical peak performance of the CPU.

I measured the performance with the following C source code.

 

 

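#include <stdio.h>
#include "timer.h"  /* timing helpers; this header is not shown in the post */

int main() {
  timer_init(1);
  long N = 1000000000;  /* number of loop iterations */

  /* Five independent accumulators. */
  float s1 = 1;
  float s2 = 1;
  float s3 = 1;
  float s4 = 1;
  float s5 = 1;

  timer_start(0);
  for (long i = 0; i < N; ++i) {
    /* One floating-point addition per accumulator per iteration. */
    s1 += 3;
    s2 += 4;
    s3 += 5;
    s4 += 6;
    s5 += 7;
  }
  timer_stop(0);

  printf("time = %f s\n", timer_read(0));
  /* Printing the sums keeps the results in use after the loop. */
  printf("s1 = %f\n", s1);
  printf("s2 = %f\n", s2);
  printf("s3 = %f\n", s3);
  printf("s4 = %f\n", s4);
  printf("s5 = %f\n", s5);

  timer_finalize();
  return 0;
}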

 

 

That code performs only floating-point additions.

With only one float variable in the loop, the measured performance is 0.7 GFLOPS.

However, with 10 float variables in the loop, the measured performance is over 6 GFLOPS.


So my questions are:

1. Why does this happen?

 

2. When I change the addition to multiplication, I get the same result. Why? I want to know the reason.

 

3. Changing float to double also gives the same results.
I want to know the reason.


Thank you for reading!

jimdempseyatthecove

Was this an Intel compiler?
If so, what version?
What optimization level did you use?
Did you check the assembly code to see if the code generated was what you thought it would be?

Note that with optimizations enabled, the compiler could pre-calculate the results for the source code provided, because every value in the loop is known at compile time.

Also, depending on several factors, the compiler could vectorize that loop (map s1 through sn into one or more vectors, then perform vector-wise summation).

All of these questions could be answered by looking at the assembly code.

Note, you do not need to master assembly code in order to figure out what is happening inside your loop.

Jim Dempsey
