- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello all,
As you will notice, even inside NASA and other institutions
in France (Ariane...) are making errors in there programs...
Read this:
"Although unexpected computer resets and human errors
continue to frustrate NASA/JPL scientists, both Pathfinder
and Sojourner... Current priorities involve fixing
the software bug that is causing Pathfinder to reset its
computer"
"The trouble experienced by the Mars lander "Mars Pathfinder"
is a classic example of problems caused by priority inversion
in realtime systems"
And don't forget about the french Ariane 5 rocket explosion:
http://www.youtube.com/watch?v=z-r9cYp3tTE
So, i have just updated and corrected a problem with
lockfree_mpmc, if you take a look at lockfree_mpmc.pas
inside my lockfree Parallel Queue zip file:
http://pages.videotron.com/aminer/parallelqueue/parallelqueue.htm
You will notice that i have modified the algorithm in the push()
side, and i have used a test like this:
if getlength >= fsize
then
begin
result:=false;
exit;
end;
Now i have tested my new algorithm and it is working very well.
Correctness:
To not make the correctness verification longer, i will concentrate on
the more important parts. If you take a look at the lock-free ParallelQueue
source code , inside lockfree_mpmc.pas you will read this in push side:
function TLockfree_MPMC.push(tm : tNodeQueue):boolean;//stdcall;
var lasttail,newtemp:longword;
begin
if getlength >= fsize
then
begin
result:=false;
exit;
end;
result:=true;
//newTemp:=windows.interlockedincrement(temp);
newTemp:=LockedIncLong(temp);
lastTail:=newTemp-1;
setObject(lastTail,tm);
repeat
if CAS(tail[0],lasttail,newtemp)
then
begin
exit;
end;
sleep(0);
until false;
end;
So, let's say the size of the bounded queue is 1000 , and imagine
that the threads are executing the "if getlength >= fsize " all at the
same time, and imagine that the getlength returns 999, so, the
"if getlength >= fsize" will returns false , and since we have
fSize:=(1 shl aPower) - margin , with a margin of 1000
(margin must be >= to the number of cores) , there will be no
problem(overflow..)...
Now in the pop side, you will read this:
function TLockfree_MPMC.pop(var obj:tNodeQueue):boolean;
var lastHead : longword;
begin
repeat
lastHead:=head[0];
if tail[0]<>head[0]
then
begin
obj:=getObject(lastHead);
if CAS(head[0],lasthead,lasthead+1)
then
begin
result:=true;
exit;
end;
end
else
begin
result:=false;
exit;
end;
until false;
end;
So, as you have noticed, there is a test like this:
if CAS(head[0],lasthead,lasthead+1) , and this test
avoids something like the ABA problem, cause if head
have changed , the CAS() will fail.
I have updated my Threadpool engine with the new lockfree_mpmc,
please download my other modules that use lockfree_mpmc.
Sincerely
Amine Moulay Ramdane.
Link Copied
- « Previous
- Next »
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
When we execute something like a lock instruction in assembler
does the OScan deschedulethe thread even if it isexecuting
inside the lockassembler instruction inside the CAS ?
Nope. Os can interrupt either before LOCK or after LOCK. Modern architectures does not feature interrupts on micro-op level.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Dmitry wrote:
>For example, in C/C++ plain loads and stores provide neither atomicity nor memory ordering
I will ask the compiler designer...
When the variables are aligned in 32 bit , are the loads and stored not atomic ?
>memory ordering
Is the CAS (in assembler) that i am usingnot sufficient?
is it necessaryto use the interlocked functions provided by the Windows OS ?
Amine.
> When the variables are aligned in 32 bit , are the loads and stored not atomic ?
It depends on generated machine code.
> Is the CAS (in assembler) that i am usingnot sufficient?
CAS will be sufficient if you use it to do all loads and stores.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Must i use a fence inside my CAS in assembler to force the ordering ?
Most likely yes
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
No, you are not. You are not coding in assembly, so all that does not apply to you. You are coding in Pascal and should consult with Pascal guarantees.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
What is important for us is MEMORY VISIBILITY across threads..
There is nothing you can do with visibility on cache-coherent hardware - it's just as it is. You can only control ordering (that is mutual order of memory accesses as they are seen by different threads).
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Dmitriy wrote:
>Nope. Os can interrupt either before LOCK or after LOCK. Modern architectures does not feature interrupts >on micro-op level
I mean if you execute something like
1- lock cmpxchg [ecx],edx
or
2- LOCK XADD [ECX], EAX
canthe OSstill deschedule the threadsif it executing
1 or 2 ?
Amine Moulay Ramdane.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Dmitriy wrote:
>Nope. Os can interrupt either before LOCK or after LOCK. Modern architectures does not feature interrupts >on micro-op level
I mean if you execute something like
1- lock cmpxchg [ecx],edx
2- LOCK XADD [ECX], EAX
canthe OSstill deschedule the threadsif it executing
1 or 2 ?
OS can't de-schedule a thread in the middle of ANY instruction (only before or after, but not in between).
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Dmitriy wrote:
>OS can't de-schedule a thread in the middle of ANY instruction (only before or after, but not in between).
Thank you Dmitriy.
Amine Moulay Ramdane.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello Dmitriy,
I was thinking more about this, and here it is:
LOCKed instruction includes previous and subsequent full memory fences.
But LOCK may not synchronize non-temporal stores and WC-memory:
Read this:
--
WC is a weakly ordered memory type. System memory locations are not
cached and COHERENCY is NOT ENFORCED by the processor's bus coherency protocol.
Speculative reads are allowed. Writes may be delayed and
combined in the write combining buffer to reduce memory accesses..
What does this really mean? Writes to WC memory are not cached in the typical sense of the
word cached. They are delayed in an internal buffer that is separate from the internal L1 and
L2 caches. The buffer is not snooped and thus does not provide data coherency. The write
buffering is done to allow software a small window of time to supply more modified data to the
buffer while remaining as non-intrusive to software as possible.
----
And about non-Temporal stores read for example this:
---
Non-Temporal SSE instructions (MOVNTI, MOVNTQ, etc.), don't follow
the normal CACHE-COHERENCY rules. Therefore non-temporal stores must
be followed by an SFENCE instruction in order for their results to be seen by other processors in a timely fashion.
---
And
---
Opcode SFENCE
Instruction: SFENCE
Description:
Guranteed that every store instruction that precedes in program
order the store fence instruction is globally visible before any
store instruction follows the fence is globally visible.
---
So, Dmitiy, what do you think if i add in lockfree_mpmc.pas
source code an sfence before the LOCK , like this, i think
it will work:
asm
mov ecx,Target
mov edx,NewValue
mov eax,Comperand
sfence
lock cmpxchg [ecx],edx
JNZ @@2
MOV AL,01
JMP @@Exit
@@2:
XOR AL,AL
@@Exit:
end;
and
function LockedIncLong(var Target: longword): Integer;
asm
{$IFDEF CPU32}
// --> EAX Target
// <-- EAX Result
MOV ECX, EAX
MOV EAX, 1
sfence
LOCK XADD [ECX], EAX
INC EAX
{$ENDIF CPU32}
{$IFDEF CPU64}
// --> RCX Target
// <-- EAX Result
MOV EAX, 1
LOCK XADD [RCX], EAX
INC EAX
{$ENDIF CPU64}
end;
Amine Moulay Ramdane.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I was talking about the following thing. For example, in size() you issue 2 separate memory loads for tail and head - they are unordered, most likely compiler is allowed to emit them in reverse order in machine code.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Dmitriy wrote
>think that it's the responsibility of a guy who issues non-temporal stores/uses WC memory to issue trailing >SFENCE.
Anyway, I have added an sfence before the lock, and it doesn't change
anything to the performance (i have tested it)
>I was talking about the following thing. For example, in size()
>you issue 2 separate memory loads for tail and head - they
>are unordered, most likely compiler is allowed to emit them in reverse order in machine code
In wich method , getlength?
Butin the Intel x86 memory model, Loads are not reordered with other loads.
What do you mean ?
Amine Moulay Ramdane.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Amine Moulay Ramdane.
> In wich method , getlength?
yes.
> Butin the Intel x86 memory model, Loads are not reordered with other loads. What do you mean ?
Once again, you are not coding in x86 assembly so these rules has nothing to do with your situation. You are coding in Pascal, does Pascal guarantees that loads are not reordered with other loads? I think that not.
Consider, Pascal compiler reverses the loads in machine code, then x86 processor COMPLETELY HAS NO MEANS to restore original program order, it does even not know what it was.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
>> In wich method , getlength?
>yes.
Ok
> Butin the Intel x86 memory model, Loads are not reordered with other loads. What do you mean ?
>Once again, you are not coding in x86 assembly so these rules has nothing to do with your situation. You >are coding in Pascal, does Pascal guarantees that loads are not reordered with other loads? I think that not.
>Consider, Pascal compiler reverses the loads in machine code, then x86 processor COMPLETELY HAS NO >MEANS to restore original program order, it does even not know what it was.
Are we not getting too schizophrenic here ?
But imagine that the too loads are reordered, wthat will be the problem in getlength() ?
i don't thing that there will be a problem...
Do you see anyproblem ifthe compiler reorders the loads ?
Amine Moulay Ramdane.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
No. Optimizing compilers do that on a regular basis.
> Do you see anyproblem ifthe compiler reorders the loads?
There are several such loads and stores, not only that two. It's basically impossible to analyze for a human.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
>> Are we not getting too schizophrenic here ?
>No. Optimizing compilers do that on a regular basis.
So i will take a look at the assembler source code of Freepascal
and delphi and see what's going on ..
But even if i take a look at the current assembler source code of
freepascal and delphi, does thatwarranty that it will be ok
with future Delphi and Freepascal compilers ?
Amine Moulay Ramdane.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
freepascal and delphi, does thatwarranty that it will be ok
with future Delphi and Freepascal compilers ?
Consider, you want to cross a highway. You look at one direction, in another - no cars. You safely cross the highway. Now, is it safe to cross the highway everyday w/o looking in both directions?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Dmitriy wrote:
>Consider, you want to cross a highway. You look at one direction,
>in another - no cars. You safely cross the highway. Now, is it safe
>to cross the highway everyday w/o looking in both directions?
I understand.
And don't *FORGET*to look at both directions, before crossing a highway !
But even if you say it Dmitriy, is it still safe at 100%? i mean is it
possible to be sure at 100% that we will not *FORGET*
to look at both directions ? :)
Amine Moulay Ramdane.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I wrote:
>I understand.
>And don't *FORGET*to look at both directions, before crossing a highway !
>But even if you say it Dmitriy, is it still safe at 100%? i mean is it
>possible to be sure at 100% that we will not *FORGET*
>to look at both directions ? :)
>Amine Moulay Ramdane.
But anyway,I am here and WEare here to REMEMBERourselves,
and one another,tonot*FORGET* to look at both directionsina highway,
rigth ?
Take care Dmitriy, you are such a nice guy...
Amine Moulay Ramdane.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
And don't *FORGET*to look at both directions, before crossing a highway !
But even if you say it Dmitriy, is it still safe at 100%? i mean is it
possible to be sure at 100% that we will not *FORGET*
to look at both directions ? :)
Well, taking into account that there are bugs in compilers, OS and even in hardware, it's impossible to be 100% safe. However at least one can make his best to ensure that there are no problems on *his* side. The good way to do this, for example, in C/C++ world is to stick to guarantees provided by ISO C/C++ and POSIX standards; then at least possible problems are on their side (for example, implementers of C/C++ language compiler).

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page
- « Previous
- Next »