Memory Barrier for reader

rmuthukrishnan · ‎04-22-2009

Hi All:

I checked the forum, but could not understand the answer to this problem:

I have a writer thread running on one CPU that writes a message into a FIFO (mailbox) and increments the write pointer.
The reader thread, running on another CPU, compares its own read pointer with the writer thread's write pointer and, if they are unequal, reads the message from the FIFO.

Now, it can happen that the reader thread sees the new write pointer but the old message instead of the new one, because it got the cache update for the write pointer and not for the message data. To avoid this, I understand we can use lfence.

My questions:
1. Is lfence the correct thing to use?
2. How does it ensure that the load of the message comes from memory? From the description it seems that lfence only ensures that loads preceding the lfence are globally visible.

Thank you,
Raman

Dmitry_Vyukov · ‎04-22-2009

Quoting - rmuthukrishnan

My questions:
1. Is lfence the correct thing to use?
2. How does it ensure that the load of the message comes from memory? From the description it seems that lfence only ensures that loads preceding the lfence are globally visible.

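1. No, LFENCE is not required. The x86 memory model is strong enough that no memory fence is required in your case: for ordinary loads and stores to normal write-back memory, loads are not reordered with other loads and stores are not reordered with other stores.
Also note that memory ordering is always a game of two, so the producer also has to execute an "abstract" memory fence. But once again, the x86 memory model is strong enough that the producer's fence is also a no-op.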

2. Memory fences have nothing to do with "the load comes from memory". Memory fences are solely about the relative ordering of memory accesses. What you are talking about is handled by the cache-coherency mechanism, i.e. the load may (and will) go to the processor cache (not to main memory), but the processor will still see the correct (current) value.
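To make the "game of two" in point 1 concrete, here is a minimal sketch of the mailbox written with C11 atomics instead of explicit fence instructions. It is only an illustration under stated assumptions, not the code from the article: the names mailbox_t, mailbox_init, mailbox_send, mailbox_receive and MAILBOX_SLOTS are hypothetical, the messages are plain ints, and there is exactly one writer thread and one reader thread. The release store in the writer and the acquire load in the reader are the two halves of the "abstract" fence; on x86, compilers typically emit plain MOV instructions for both.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAILBOX_SLOTS 16u          /* hypothetical capacity, a power of two */

typedef struct {
    int slots[MAILBOX_SLOTS];      /* message payloads */
    atomic_size_t write_idx;       /* advanced only by the writer */
    atomic_size_t read_idx;        /* advanced only by the reader */
} mailbox_t;

static void mailbox_init(mailbox_t *mb)
{
    atomic_init(&mb->write_idx, 0);
    atomic_init(&mb->read_idx, 0);
}

/* Writer side: store the message first, then publish the new write index. */
static bool mailbox_send(mailbox_t *mb, int msg)
{
    size_t w = atomic_load_explicit(&mb->write_idx, memory_order_relaxed);
    size_t r = atomic_load_explicit(&mb->read_idx, memory_order_acquire);
    if (w - r == MAILBOX_SLOTS)
        return false;              /* mailbox is full */
    mb->slots[w % MAILBOX_SLOTS] = msg;
    /* Release: the message store above cannot move below this store. */
    atomic_store_explicit(&mb->write_idx, w + 1, memory_order_release);
    return true;
}

/* Reader side: load the write index first, then read the message. */
static bool mailbox_receive(mailbox_t *mb, int *out)
{
    size_t r = atomic_load_explicit(&mb->read_idx, memory_order_relaxed);
    /* Acquire: the message load below cannot move above this load. */
    size_t w = atomic_load_explicit(&mb->write_idx, memory_order_acquire);
    if (r == w)
        return false;              /* mailbox is empty */
    *out = mb->slots[r % MAILBOX_SLOTS];
    /* Release: finish reading the slot before handing it back to the writer. */
    atomic_store_explicit(&mb->read_idx, r + 1, memory_order_release);
    return true;
}
```

The writer thread would call mailbox_send() and the reader thread would poll mailbox_receive(). The same source stays correct on architectures with a weaker memory model, where the compiler inserts whatever hardware ordering is needed.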

You may check out the following article:
http://software.intel.com/en-us/articles/single-producer-single-consumer-queue/

rmuthukrishnan · ‎04-22-2009

Quoting - Dmitriy Vyukov

You may check out the following article:
http://software.intel.com/en-us/articles/single-producer-single-consumer-queue/

Thank you very much Dmitriy. One question: I see __memory_barrier() being used as a compiler fence in the example. So I understand we need a compiler fence so that the message data read does not happen before the write pointer read. Is that right? Do you know the equivalent for the GCC compiler on Linux?

Dmitry_Vyukov · ‎04-22-2009

Quoting - rmuthukrishnan

Thank you very much Dmitriy. One question: I see __memory_barrier() being used as a compiler fence in the example. So I understand we need a compiler fence so that the message data read does not happen before the write pointer read. Is that right? Do you know the equivalent for the GCC compiler on Linux?

Yes, a compiler fence is required to ensure that the memory accesses are in the correct order in the compiled machine code.
For gcc you may use the following construct as a compiler fence:

#define compiler_fence() __asm__ __volatile__ ("" : : : "memory")
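
As a rough sketch of where the fence would go in the writer and the reader (an illustration only, not the code from the article): the struct fifo, fifo_write, fifo_read and FIFO_SLOTS names below are hypothetical, the messages are plain ints, and the code assumes x86, a GCC-compatible compiler, one writer thread and one reader thread. It relies on the x86 hardware ordering discussed above, so it is not portable to weaker architectures, and in strict C11 terms the unsynchronized accesses count as a data race.

```c
#include <stdbool.h>

#define compiler_fence() __asm__ __volatile__ ("" : : : "memory")

#define FIFO_SLOTS 8u                 /* hypothetical capacity, a power of two */

struct fifo {
    int data[FIFO_SLOTS];             /* message payloads */
    volatile unsigned write_ptr;      /* stored by the writer only */
    volatile unsigned read_ptr;       /* stored by the reader only */
};

/* Writer: store the message, then publish it by advancing write_ptr. */
static bool fifo_write(struct fifo *f, int msg)
{
    unsigned w = f->write_ptr;
    if (w - f->read_ptr == FIFO_SLOTS)
        return false;                 /* FIFO is full */
    f->data[w % FIFO_SLOTS] = msg;
    compiler_fence();                 /* keep the data store before the pointer store */
    f->write_ptr = w + 1;
    return true;
}

/* Reader: check write_ptr, then read the message. */
static bool fifo_read(struct fifo *f, int *out)
{
    unsigned r = f->read_ptr;
    if (r == f->write_ptr)
        return false;                 /* FIFO is empty */
    compiler_fence();                 /* keep the data load after the pointer load */
    *out = f->data[r % FIFO_SLOTS];
    compiler_fence();                 /* finish reading before freeing the slot */
    f->read_ptr = r + 1;
    return true;
}
```

The fence emits no machine instruction; it only stops the compiler from moving memory accesses across it.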