[bash]Hi all,
My program works fine in a single-core environment, but it misbehaves in the following environment:
gcc version 4.1.0 (SUSE Linux)
8 * Intel Xeon CPU E5520 @ 2.27GHz
Occasionally, the second of two consecutive TickGet() calls returns a lower value than the first:
(Just before TickGet() returns, we record the values of 'uiIndex', 'm_uiRollingTick[0]' and 'm_uiRollingTick[1]'.)
(gdb) call TICK_DebugShow()
uiIdx=1, uiTick[0]=35849, uiTick[1]=35848. -------- last time: get uiTick[1]=35848
uiIdx=0, uiTick[0]=35849, uiTick[1]=35848. -------- former time: get uiTick[0]=35849
uiIdx=0, uiTick[0]=35848, uiTick[1]=35847.
I heard that Intel CPUs are conservative about memory ordering. So what causes this problem, and how?
CODE:
TICK runs a thread that updates the rolling tick (via TickRolling) at regular intervals.
It provides an interface, TickGet(), for other threads to get the current tick count.
We use a read buffer, m_uiRollingTick[1], to avoid using a lock.
unsigned int m_uiRollingTickHigh[2];
unsigned int m_uiRollingTick[2];
volatile unsigned int m_uiTickIndex;

int TickGet(unsigned int *puiHigh, unsigned int *puiLow)
{
    unsigned int uiIndex;

    uiIndex = m_uiTickIndex;
    *puiHigh = m_uiRollingTickHigh[uiIndex];
    *puiLow = m_uiRollingTick[uiIndex];
    return 0;
}

void TickRolling(unsigned int uiMillSec)
{
    unsigned int uiRollingTickAndLost;
    unsigned int uiLostTicks = uiMillSec/1000;

    m_uiRollingTickHigh[1] = m_uiRollingTickHigh[0];
    m_uiRollingTick[1] = m_uiRollingTick[0];
    m_uiTickIndex = 1;

    uiRollingTickAndLost = m_uiRollingTick[0] + uiLostTicks;
    if(m_uiRollingTick[0] > uiRollingTickAndLost)
    {
        m_uiRollingTickHigh[0]++;
    }
    m_uiRollingTick[0] += uiLostTicks;
    m_uiTickIndex = 0;
}
Thanks & Regards
Hyphone
[/bash]
For example, TickGet() can read the high part from one index and then the low part from another index. Or TickRolling() can set m_uiTickIndex to 1 and only then update the values at index 1. The operations as a whole are not atomic.
It's not only the CPU that reorders accesses; the compiler can do it as well.
The code does not work on a single-core CPU either; you were just lucky.
The easiest thing to do is use atomic 64-bit loads and stores. Then you do not need all that code - just atomically store new value, and atomically read current value.
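A minimal sketch of that suggestion, assuming a C11 compiler with `<stdatomic.h>` (the gcc 4.1.0 from the original post predates it, so there you would need compiler builtins or platform-specific code instead). The names `g_tick`, `tick_rolling` and `tick_get` are made up for illustration, and a single writer thread is assumed:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical replacement for the high/low arrays and the index:
   one 64-bit counter, high word in bits 63..32, low word in bits 31..0. */
static _Atomic uint64_t g_tick;

/* Called by the single updater thread. */
void tick_rolling(unsigned int millis)
{
    uint64_t lost = millis / 1000;
    /* Only one writer exists, so a plain load + store pair is enough here. */
    uint64_t cur = atomic_load(&g_tick);
    atomic_store(&g_tick, cur + lost);
}

/* Called by any reader thread: one atomic load, so no torn high/low pair. */
int tick_get(unsigned int *high, unsigned int *low)
{
    uint64_t t = atomic_load(&g_tick);
    *high = (unsigned int)(t >> 32);
    *low = (unsigned int)(t & 0xFFFFFFFFu);
    return 0;
}
```

The carry from the low word into the high word now happens inside one 64-bit addition, so the separate overflow check disappears. On a 32-bit target the compiler may need extra instructions or library support to make the 64-bit access atomic.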
And do read "So what is a memory model? And how to cook it?"
Dmitriy, Thank you for your reply.
Yes, the code may not be so nice, but it works on a single core.
I checked there is no compiler out-of-order optimizing and I declared m_uiTickIndex as 'volatile'.
Then in my opinion, the Intel CPU will promise the executing order as the program order.
So TickGet() reads m_uiTickIndex first, and there is no partial read.
And TickRolling() updates values at index 1 first, then sets m_uiTickIndex to 1.
PS: I ran the program as a DEBUG build, with no optimization.
(gdb) disass TickRolling
Dump of assembler code for function TickRolling:
0x08048633 ... 0x080486ad (30 lines; only the addresses survive in this copy, the instruction text is missing)
End of assembler dump.
And thanks for your recommendation, I will read the article carefully.
It only appears to work most of the time.
> I checked there is no compiler out-of-order optimizing
So you intended to never compile it under release configuration, right?
> and I declared m_uiTickIndex as 'volatile'.
Volatile does not work that way. At the very least, you need to declare ALL participating variables as volatile.
> Then in my opinion, the Intel CPU will promise the executing order as the program order.
Not quite. For example, Dekker's algorithm won't work on IA-32/Intel64 without explicit memory fences.
> And TickRolling() updates values at index 1 first, then sets m_uiTickIndex to 1.
For example, it can break in the following way.
Current time is 1,100 (high, low).
A reader reads it as 1,100.
On the next try, the reader reads the high part, 1. Then the time changes to 2,50. Then the reader reads the low part, 50. So the result is 1,50. The time indeed goes back.
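To make that schedule concrete, here is a small single-threaded replay of it. It is only a sketch: the struct and function names are invented for illustration, and the "writer" is simply inlined at the point where the real updater thread would happen to run.

```c
/* Hypothetical 64-bit time split into two 32-bit halves, as in the thread. */
struct tick_pair {
    unsigned int high;
    unsigned int low;
};

/* Replays: reader reads high, writer changes both halves, reader reads low. */
struct tick_pair torn_read(void)
{
    struct tick_pair shared = {1, 100};  /* current time is (1, 100) */
    struct tick_pair seen;

    seen.high = shared.high;  /* reader: high part = 1 */

    shared.high = 2;          /* writer: time becomes (2, 50) */
    shared.low = 50;

    seen.low = shared.low;    /* reader: low part = 50 */
    return seen;              /* (1, 50), which is earlier than (1, 100) */
}
```

Nothing in this schedule needs the CPU or the compiler to reorder anything; the reader was merely interrupted between its two loads.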
First, I think this code is wrong too.
And I see that if current time is 1,0xffffffff (high, low), one reader reads the high part 1 with index 0.
Then, time is changed to 2, 0. Then the reader reads the low part 0.
So the result is 1,0 and time goes back.
But as I ran the DEBUG version, I got a (0,35849) followed by (0,35848).
I just don't know how this problem arises, that is, what the program's execution flow looks like from a global view.
BTW, if I use spin_lock in TickGet and TickRolling, the problem is gone.
The do-while loop below also does the job.
Is that a lock-free protection as mentioned in your article?
int VOS_TickGet(unsigned int *puiHigh, unsigned int *puiLow)
{
    unsigned int uiIndex;

    do {
        uiIndex = m_uiTickIndex;
        *puiHigh = m_uiRollingTickHigh[uiIndex];
        *puiLow = m_uiRollingTick[uiIndex];
    } while (uiIndex != m_uiTickIndex);
    return 0;
}
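One caveat with comparing m_uiTickIndex before and after: as far as I can tell, a complete TickRolling() call (index 0, then 1, then 0 again) can fit between the two checks, and the loop would not notice it. A common alternative for a single writer with many readers is a sequence counter that only ever increases. The sketch below assumes C11 `<stdatomic.h>` with the default sequentially consistent operations; all names are hypothetical and it is not taken from the original program.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_uint g_seq;   /* even: data stable, odd: update in progress */
static atomic_uint g_high;
static atomic_uint g_low;

/* Must be called from a single updater thread only. */
void seq_tick_rolling(unsigned int millis)
{
    unsigned int lost = millis / 1000;
    unsigned int seq = atomic_load(&g_seq);
    unsigned int low = atomic_load(&g_low);

    assert((seq & 1u) == 0);          /* no other writer is mid-update */
    atomic_store(&g_seq, seq + 1);    /* mark update in progress */
    if (low + lost < low)             /* low word wrapped around */
        atomic_fetch_add(&g_high, 1);
    atomic_store(&g_low, low + lost);
    atomic_store(&g_seq, seq + 2);    /* publish the new value */
}

/* Readers retry until they see the same even sequence number twice. */
int seq_tick_get(unsigned int *high, unsigned int *low)
{
    unsigned int before, after;

    do {
        before = atomic_load(&g_seq);
        *high = atomic_load(&g_high);
        *low = atomic_load(&g_low);
        after = atomic_load(&g_seq);
    } while ((before & 1u) || before != after);
    return 0;
}
```

Because the counter never returns to an earlier value (short of a full 32-bit wraparound), an update that slips in between the two loads always changes it, so the reader retries. Readers never block the writer, but a reader can spin while an update is in progress.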
> But as I ran the DEBUG version, I got a (0,35849) followed by (0,35848).
If you are interested in how exactly this is possible under a sequentially consistent memory model, follow me.
Below is a sequence of modifications of the variables during update operation:
[cpp]time  index  tick[0]  tick[1]
0     0      10       5
1     0      10       10
2     1      10       10
3     1      15       10
4     0      15       10[/cpp]
First read operation starts at time=0.
A thread reads index=0 (time=0)
Then time advances to time=3.
Then the thread reads tick[0]=15 (time=3).
Second read operation starts at time=3.
The thread reads index=1 (time=3).
Then the thread reads tick[1]=10 (time=3).
So, indeed, in two consecutive reads under a sequentially consistent memory model, a thread observes time=15 and then time=10. Welcome to concurrent programming!
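The schedule above can be replayed deterministically in a single thread. This is only an illustrative sketch: `writer_step` and `replay` are invented names, the low word alone is modelled, and the update adds 5 ticks as in the table.

```c
static unsigned int idx;
static unsigned int tick[2];

/* One store of the update per call, matching the rows time=1..4 of the table. */
static void writer_step(int step)
{
    switch (step) {
    case 1: tick[1] = tick[0]; break;  /* time=1: copy to the read buffer */
    case 2: idx = 1;           break;  /* time=2: point readers at tick[1] */
    case 3: tick[0] += 5;      break;  /* time=3: update tick[0] */
    case 4: idx = 0;           break;  /* time=4: point readers back at tick[0] */
    }
}

/* Replays the interleaving and reports what two consecutive reads observe. */
void replay(unsigned int *first, unsigned int *second)
{
    unsigned int i;

    idx = 0; tick[0] = 10; tick[1] = 5;  /* state at time=0 */

    i = idx;                 /* first read: index=0, taken at time=0 */
    writer_step(1);
    writer_step(2);
    writer_step(3);          /* writer has advanced to time=3 */
    *first = tick[i];        /* first read: tick[0] is already 15 */

    i = idx;                 /* second read: index=1 at time=3 */
    *second = tick[i];       /* second read: tick[1] is still 10 */
}
```

Every load and store here happens in program order, which is the point: the backwards step comes from the reader holding a stale index across the writer's update, not from hardware reordering.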