Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.
Announcements
The Intel sign-in experience has changed to support enhanced security controls. If you sign in, click here for more information.

Diffs of xchg and cmpxchg to impl. lock

judeyang
Beginner
686 Views
I've read the artical Implementing Scalable Atomic Locks for Multi-Core Intel EM64T and IA32 Architectures at http://www3.intel.com/cd/ids/developer/asmo-na/eng/dc/threading/333935.htm
.
It says that the cmpxchg is just as expensive as the xchg instruction. Some test data were shown, but no reason was given.

Can anybody explain why those 2 instructions have the same cost to implement lock? Does this cost equivalency only exist for some certain scenarios?

Thanks a lot!
0 Kudos
1 Reply
robert-reed
Valued Contributor II
686 Views

Why would they not have similar costs? Both the xchg instruction, which is always locked, and cmpxchg which operates within a read-modify-write cycle in order to achieve atomicity when locked would be dominated by that read-modify-write cycle, The xchg always substitutes its argument with the destination while the cmpxchg sneaks a compare into the memory cycle. As the Software DevelopersManual says about the latter instruction,

This [cmpxchg]instruction can be used with a LOCK prefix to allow the instruction to be executed atomically. To simplify the interface to the processors bus, the destination operand receives a write cycle without regard to the result of the comparison. The destination operand is written back if the comparison fails; otherwise, the source operand is written into the destination. (The processor never produces a locked read without also producing a locked write.)

The memory cycle dominates the time required to use these as lock instructions, so at least I would expect them to take about the same amount of time.

--Robert.

Reply