Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Fast inter-core message passing

arataj
Beginner
Hello, what is the fastest way to send 32-bit or even 1-bit data between two threads within a single multi-core CPU?

I tried a shared variable with a spin lock. It is about twenty times faster than a system semaphore, but a single data exchange still takes hundreds of cycles or more.

volatile int buffer;

...

while (buffer != EMPTY) {
    asm("nop");
    asm("nop");
    asm("nop");
    asm("nop");
}

The NOPs appear to make it faster; I guess they reduce some congestion at the cost of very slight delays.

Any help?

Best regards,
Artur
2 Replies
Dmitry_Vyukov
Valued Contributor I

That is the fastest way: a direct write in one thread and spinning on a load in the other. Yes, it costs hundreds of cycles, and there is nothing you can do about that if the data has to physically move between cores.

There are only two options to speed it up: (1) schedule both threads on the same core (no physical concurrency in that case), or (2) batch messages; you can physically transfer up to 64 bytes (one cache line) for the same cost.
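To illustrate option (2), here is a rough sketch of a one-cache-line mailbox that carries up to 15 32-bit words per exchange instead of one. The names (struct batch, send_batch, recv_batch, cpu_relax, CACHE_LINE, BATCH_WORDS) are made up for this example, and it assumes a 64-byte cache line, one sending thread and one receiving thread.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define cpu_relax() _mm_pause()   /* emits the PAUSE instruction */
#else
#define cpu_relax() ((void)0)     /* assumption: plain no-op off x86 */
#endif

#define CACHE_LINE  64            /* assumed cache line size in bytes */
#define BATCH_WORDS 15            /* 15 * 4 payload bytes + 4-byte counter = 64 */

/* Hypothetical mailbox: counter and payload share one cache line. */
struct batch {
    alignas(CACHE_LINE) _Atomic uint32_t count;  /* 0 = empty, else valid words */
    uint32_t words[BATCH_WORDS];
};
_Static_assert(sizeof(struct batch) == CACHE_LINE,
               "batch must occupy exactly one cache line");

/* Sender: wait for an empty mailbox, fill it, then publish the count. */
static void send_batch(struct batch *b, const uint32_t *src, uint32_t n) {
    assert(n >= 1 && n <= BATCH_WORDS);
    while (atomic_load_explicit(&b->count, memory_order_acquire) != 0)
        cpu_relax();
    memcpy(b->words, src, n * sizeof src[0]);
    /* Release store: the payload becomes visible before the count does. */
    atomic_store_explicit(&b->count, n, memory_order_release);
}

/* Receiver: wait for a batch, copy it out, then hand the mailbox back. */
static uint32_t recv_batch(struct batch *b, uint32_t *dst) {
    uint32_t n;
    while ((n = atomic_load_explicit(&b->count, memory_order_acquire)) == 0)
        cpu_relax();
    memcpy(dst, b->words, n * sizeof dst[0]);
    atomic_store_explicit(&b->count, 0, memory_order_release);
    return n;
}
```

In real use, send_batch would run on one core and recv_batch on another, with dst pointing at a buffer of at least BATCH_WORDS words. The line still has to move between cores once per exchange, so the saving is per word, not per transfer.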

By the way, you should use the PAUSE instruction in spin loops instead of NOP.
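Something along these lines, as a minimal sketch: the same single-slot exchange, but spinning with PAUSE via the _mm_pause() intrinsic and using C11 atomics instead of volatile. The helper names (send_value, receive_value, run_demo, cpu_relax) are invented for this example, and it assumes one producer thread, one consumer thread, and that 0 is never sent as a message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define cpu_relax() _mm_pause()   /* emits the PAUSE instruction */
#else
#define cpu_relax() ((void)0)     /* assumption: plain no-op off x86 */
#endif

#define EMPTY 0                   /* sentinel: the slot holds no message */
#define DEMO_MESSAGES 1000

static _Atomic int buffer = EMPTY;

/* Producer side: wait until the slot is free, then publish the value. */
static void send_value(int value) {
    assert(value != EMPTY);       /* EMPTY is reserved as the sentinel */
    while (atomic_load_explicit(&buffer, memory_order_acquire) != EMPTY)
        cpu_relax();
    atomic_store_explicit(&buffer, value, memory_order_release);
}

/* Consumer side: spin until a value arrives, then mark the slot empty. */
static int receive_value(void) {
    int value;
    while ((value = atomic_load_explicit(&buffer, memory_order_acquire)) == EMPTY)
        cpu_relax();
    atomic_store_explicit(&buffer, EMPTY, memory_order_release);
    return value;
}

/* Thread body for the demo: sends 1..DEMO_MESSAGES in order. */
static void *producer(void *arg) {
    (void)arg;
    for (int i = 1; i <= DEMO_MESSAGES; i++)
        send_value(i);
    return NULL;
}

/* Hypothetical demo: receives every message and returns their sum. */
static long run_demo(void) {
    pthread_t thread;
    long sum = 0;
    if (pthread_create(&thread, NULL, producer, NULL) != 0)
        return -1;
    for (int i = 0; i < DEMO_MESSAGES; i++)
        sum += receive_value();
    pthread_join(thread, NULL);
    return sum;
}
```

Build with -pthread if your toolchain needs it. This does not remove the cost of moving the cache line between cores; PAUSE only makes the waiting side spin more politely.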

jimdempseyatthecove
Honored Contributor III

NOPs tend to drag down the system; PAUSE is better (I use PAUSE).
Under some special circumstances, MONITOR/MWAIT may be better.

Jim Dempsey
