I am writing a program that should be able to read and write bytes from and to memory. To be more precise, I am trying to reach a certain average on the Intel PCM READ/WRITE counters through a C++ program. To increase the counters independently, I wrote a simple function that takes a pointer and moves the byte value at the given location into a register (the value is then never used because I do not want to increase the WRITE counter at this point). The function is as follows:
The problem is that the CPU (Xeon E3-1230 v5) never seems to execute the move operation, because the PCM counter for READ stays at 0. The function is called repeatedly, a fixed number of times. The address comes from a large array (512 MB) that contains random data. On each iteration the pointer is advanced by 64 bytes (which should be the cache line size) plus a random amount in the range [0, 1024], so that the memory is not accessed in a regular pattern.
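A minimal sketch of the access pattern described above, for illustration only. This is not the original function: the names `read_byte` and `touch_buffer`, the seed parameter, and the wrap-around at the end of the buffer are all assumptions. The load goes through a `volatile` pointer so that the compiler has to emit it even though the result is discarded; that only rules out the compiler dropping the read, and whether a given load actually reaches DRAM still depends on the caches and prefetchers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: loads one byte through a volatile pointer, so the
// compiler must emit the load even if the caller ignores the result.
inline std::uint8_t read_byte(const std::uint8_t* p) {
    return *reinterpret_cast<const volatile std::uint8_t*>(p);
}

// Hypothetical helper: performs `iterations` one-byte reads, advancing the
// position by 64 bytes (assumed cache line size) plus a random offset in
// [0, 1024] after each read. Wrapping around at the end of the buffer is an
// assumption made to keep the sketch in bounds.
// Returns the number of reads issued.
std::size_t touch_buffer(const std::vector<std::uint8_t>& buf,
                         std::size_t iterations,
                         unsigned seed) {
    assert(!buf.empty());

    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> jitter(0, 1024);

    std::size_t pos = 0;
    std::size_t reads = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        (void)read_byte(buf.data() + pos);  // value is intentionally discarded
        ++reads;
        pos = (pos + 64 + jitter(rng)) % buf.size();
    }
    return reads;
}
```

A call such as `touch_buffer(buf, 1000000, 42)` on a 512 MB `std::vector<std::uint8_t>` would reproduce the setup from the question. Looking at the generated assembly is a quick way to confirm that the load instruction is actually present in the binary.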
Can the CPU decide that an instruction is unnecessary and retire it without actually executing it? If so, can this behavior be switched off somehow, or am I missing an important point?
I looked through the Intel Software Developer's Manual but could not find any feature that would explain the described behavior, or anything that would point me in the right direction.