
rep; nop / _asm pause

mark-whitener
On Xeon or Hyper-Threading Technology processors, what is the cycle time for rep; nop or _asm pause? The documentation hints that it can be anywhere from the cost of a nop to some definite value. So what is it?
Thanks in advance
Intel_Software_Netw1
We are forwarding your question to our engineering contacts and will let you know how they respond.
Regards,

Lexi S.

Intel Software Network Support

http://www.intel.com/software

Message Edited by intel.software.network.support on 12-02-2005 01:14 PM

Intel_Software_Netw1
Here is the response we received from our Application Engineers:
A NOP instruction can take between 0.4 and 0.5 clocks, and a PAUSE instruction can consume 38-40 clocks. Please refer to the whitepaper on how to measure the latency and throughput of various instructions. The REPE prefix comes in various flavors, and the latency and throughput of each of them varies. Please also see below for sample code that measures the average clocks per instruction.
#include <stdio.h>

/* Read the 64-bit time-stamp counter into x (MSVC inline assembly, 32-bit x86).
   CPUID serializes execution before RDTSC samples the counter. */
#define ReadTSC( x ) __asm cpuid \
                     __asm rdtsc \
                     __asm mov dword ptr x,eax \
                     __asm mov dword ptr x+4,edx

#define LOOP_COUNT 160000
#define REPEAT_25( x ) x x x x x x x x x x x x x x x x x x x x x x x x x
#define REPEAT_100(x) REPEAT_25(x) REPEAT_25(x) REPEAT_25(x) REPEAT_25(x)
#define REPEAT_1000(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x)
/* Instructions executed per measurement: LOOP_COUNT iterations of 1000 each. */
#define FACTOR ((double)LOOP_COUNT*1000.0)
#define CLOCKSPERINSTRUCTION(start,end) (((double)(end)-(double)(start))/(FACTOR))

int main(int argc, char **argv)
{
    __int64 start, end, total;

    /* Time LOOP_COUNT * 1000 NOP instructions. */
    ReadTSC(start);
    for (int i = 0; i < LOOP_COUNT; i++)
    {
        REPEAT_1000(__asm { nop };)
    }
    ReadTSC(end);
    total = end - start;
    printf("nop: clocks per instruction %4.2f\n", (double)total/(double)FACTOR);

    /* Time LOOP_COUNT * 1000 PAUSE instructions. */
    ReadTSC(start);
    for (int i = 0; i < LOOP_COUNT; i++)
    {
        REPEAT_1000(__asm { pause };)
    }
    ReadTSC(end);
    total = end - start;
    printf("pause: clocks per instruction %4.2f\n", (double)total/(double)FACTOR);
    return 0;
}
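
The sample above relies on Microsoft inline assembly for 32-bit x86. As a rough sketch only, and not part of the engineers' reply, the same idea can be expressed with compiler intrinsics, assuming GCC or Clang on an x86 machine. The helper names `ticks_per_pause` and `ticks_per_nop` and the `REPEAT_10` macro are hypothetical, chosen for illustration. Each helper divides the elapsed time-stamp-counter ticks by the number of instructions issued, which is the same calculation as total / FACTOR above. Note that on many newer processors the TSC ticks at a fixed rate that may differ from the core clock, so the result is in TSC ticks rather than exact core cycles.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc() and _mm_pause(); assumes GCC or Clang on x86 */

/* Hypothetical helper macro: repeat a statement 10 times per loop pass
   so that loop overhead is spread over several instructions. */
#define REPEAT_10(x) x x x x x x x x x x

/* Hypothetical helper: average TSC ticks per PAUSE instruction. */
static double ticks_per_pause(uint64_t iterations)
{
    uint64_t start = __rdtsc();
    for (uint64_t i = 0; i < iterations; i++) {
        REPEAT_10(_mm_pause();)
    }
    uint64_t end = __rdtsc();
    return (double)(end - start) / ((double)iterations * 10.0);
}

/* Hypothetical helper: average TSC ticks per NOP instruction.
   The volatile asm keeps the compiler from deleting the NOPs. */
static double ticks_per_nop(uint64_t iterations)
{
    uint64_t start = __rdtsc();
    for (uint64_t i = 0; i < iterations; i++) {
        REPEAT_10(__asm__ volatile("nop");)
    }
    uint64_t end = __rdtsc();
    return (double)(end - start) / ((double)iterations * 10.0);
}
```

Calling `ticks_per_pause(1000000)` and `ticks_per_nop(1000000)` from a `main` function and printing the two values gives figures comparable to the ones quoted above. Unlike the original, this sketch does not serialize with CPUID, so short runs will be noisier, and the measured values will differ from one processor generation to another.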

mark-whitener
Thank you. The documentation states that the results vary, as does the answer. I'll try this on our various 8x, 16x and 32x machines and see what the variance is.
Thanks again.