Very Simple Concurrency/Coherence Question

Ellis_H_

Hi all,

In reading the memory ordering section of Intel's Combined Software Developer's manual located here:

https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf

Volume 3, Chapter 8, Section 8.2.3.5 (Page 2,115 in that PDF) states:

[Intra-Processor Forwarding is Allowed]: The memory-ordering model allows concurrent stores by two processors to be seen in different orders by those two processors; specifically, each processor may perceive its own store occurring before that of the other.

This has always made sense to me in the context of separate memory locations (as their example shows). However, what if Processor 0 and Processor 1 both issued stores to the same location but with differing values? For example:

[logical processor 0]: mov [_x], 1

[logical processor 1]: mov [_x], 2

Read literally, the above statement would allow for the possibility that [logical processor 0] sees 2 in _x while [logical processor 1] sees 1 in _x. Obviously cache coherency is designed to prevent that, and I'm sure at the low level this can be explained in terms of MESI, but is there a section in the manual(s) that covers this case and specifically guarantees that both logical processors will converge on a single coherent value (after store forwarding, etc. has happened)?
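To make the scenario concrete, here is a rough C++ sketch of the same experiment. It is only an illustration under the C++ memory model, not an official statement about the hardware: `x` stands in for `_x`, and `Result`, `run_round`, and `done` are hypothetical names made up for this example. Relaxed atomic stores typically compile to plain MOV instructions on x86, with no LOCK prefix; the `done` counter only exists so that each thread reads `x` after both stores have completed.

```cpp
#include <atomic>
#include <cassert>     // for callers that want to assert on the result
#include <functional>  // std::ref
#include <thread>

// Hypothetical shared location standing in for _x.
std::atomic<int> x{0};

// What each logical processor observed once both stores had completed.
struct Result {
    int seen_by_0;
    int seen_by_1;
};

// One round: two threads store different values to x without any lock,
// wait until both stores are done, then read x back.
Result run_round() {
    x.store(0, std::memory_order_relaxed);
    std::atomic<int> done{0};  // counts completed stores
    int r0 = 0;
    int r1 = 0;

    auto worker = [&](int value, int& out) {
        // Plain store, no read-modify-write involved.
        x.store(value, std::memory_order_relaxed);
        // Announce that this thread's store is complete.
        done.fetch_add(1, std::memory_order_acq_rel);
        // Spin until the other thread's store is complete as well.
        while (done.load(std::memory_order_acquire) != 2) {
        }
        // Both stores happen-before this load, so it reads whichever
        // store came last in the modification order of x.
        out = x.load(std::memory_order_relaxed);
    };

    std::thread t0(worker, 1, std::ref(r0));
    std::thread t1(worker, 2, std::ref(r1));
    t0.join();
    t1.join();
    return {r0, r1};
}
```

Calling `run_round()` repeatedly, the winning value may differ from run to run (1 or 2), but within one round `seen_by_0` and `seen_by_1` should always match, since every atomic object has a single modification order. What I am looking for is the place in the manual that states the equivalent guarantee for the hardware.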

The manual is so detailed and helpful that I am sure I am missing something.  References would be insanely appreciated as my OCD would certainly be calmed with an official statement that a LOCK prefix isn't needed to ensure total ordering or some other such odd thing in this case.

Thanks in advance to everyone.

-Ellis

 

QIAOMIN_Q_

Storing to the same location without a lock causes a race condition. This behavior is nondeterministic by nature, so to make your code behave as you intend, synchronization is needed.
