- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I've written a simple reader-writer lock for my own needs and am just after another pair of eyes to identify any issues I may have missed.
The domain this will be used in involves the vast majority being reads with the workload between each read-request being quite significant (ie., there will be little thread-contention)
A few things to note that may or may not be obvious:
#1: A basic lock needs to be aquired for any request (read/write)
#2: Writers are given priority, the "Writers" variable is about pending-writers, not current writers.
For C-style ('this' omitted for clarity)
Cheers,
Matt.
I've written a simple reader-writer lock for my own needs and am just after another pair of eyes to identify any issues I may have missed.
The domain this will be used in involves the vast majority being reads with the workload between each read-request being quite significant (ie., there will be little thread-contention)
A few things to note that may or may not be obvious:
#1: A basic lock needs to be aquired for any request (read/write)
#2: Writers are given priority, the "Writers" variable is about pending-writers, not current writers.
[cpp]struct RWLock { // Attributes Byte Writers; Byte Lock; Byte Readers; Byte Unused; // Methods Void FASTCALL AquireWrite(); Void FASTCALL AquireRead(); Void FASTCALL ReleaseWrite(); Void FASTCALL ReleaseRead(); }; NAKED Void FASTCALL RWLock::AquireWrite() { __asm { // Signal readers to wait. add byte ptr [ecx], 1 // Aquire lock, no readers. mov edx, 1 ALIGN 16 SPIN: xor eax, eax pause lock cmpxchg word ptr [ecx+1], dx jnz short SPIN // Decrement writer-pending count. sub byte ptr [ecx], 1 // Aquired. ret } } NAKED Void FASTCALL RWLock::AquireRead() { __asm { // Aquire lock. mov edx, 1 ALIGN 16 SPIN: xor eax, eax pause lock cmpxchg word ptr [ecx], dx jnz short SPIN // Increment reader count. lock add byte ptr [ecx+2], 1 // Free lock. mov byte ptr [ecx+1], 0 // Aquired. ret } } FORCEINLINE Void FASTCALL RWLock::ReleaseRead() { // Decrement reader count. this->Readers--; } FORCEINLINE Void FASTCALL RWLock::ReleaseWrite() { // Free lock. this->Lock = 0; }[/cpp]
For C-style ('this' omitted for clarity)
[cpp]/* * AquireWrite() */ // Signal readers to wait. Writers++; // Aquire lock when no writers or readers are present. while(Lock != 0 && Readers != 0); Lock=1; // Decrement writer-pending count. Writers--; /* * AquireRead() */ // Aquire lock when no writers are either present or pending. while(Writers != 0 && Lock != 0); Lock=1; // Increment reader count. Readers++; // Free lock Lock=0; [/cpp]
Cheers,
Matt.
Link Copied
6 Replies
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting - matt.j
[cpp]NAKED Void FASTCALL RWLock::AquireRead()
{
__asm
{
// Aquire lock.
mov edx, 1
ALIGN 16
SPIN:
xor eax, eax
pause
lock cmpxchg word ptr [ecx], dx
[/cpp]
Doesn't it have to be lock cmpxchg word ptr [ecx + 1], dx?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You must use the same address and the same operand size in all atomic RMW operations.
Intel 64 and IA-32 Architectures Software Developer's Manual Volume
3A: System Programming Guide, Part 1
7.1.2.2 Software Controlled Bus Locking
Software should access semaphores (shared memory used for signalling
between multiple processors) using identical addresses *and operand lengths*.
For example, if one processor accesses a semaphore using a word access, other
processors should not access the semaphore using a byte access.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You better align your mutex variable on 4-byte boundary. If it happens to cross cache line boundary it will run very slowly on Intel QPI based system, system-wide quiescence is *very* unpleasant thing.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting - matt.j
[cpp]NAKED Void FASTCALL RWLock::AquireWrite()
{
__asm
{
// Signal readers to wait.
add byte ptr [ecx], 1
// Aquire lock, no readers.
mov edx, 1
ALIGN 16
SPIN:
xor eax, eax
pause
lock cmpxchg word ptr [ecx+1], dx
jnz short SPIN
// Decrement writer-pending count.
sub byte ptr [ecx], 1
// Aquired.
ret
}
}
[/cpp]
Here is a race on [ecx], you use non-atomic operations 'add byte ptr [ecx]' and 'sub byte ptr [ecx]'. Two concurrent writer can damage [ecx], so that it will never become zero again.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting - matt.j
[cpp]
FORCEINLINE Void FASTCALL RWLock::ReleaseRead()
{
// Decrement reader count.
this->Readers--;
}
[/cpp]
ReleaseReed() is not atomic wrt 'Readers', you need to use 'lock sub' in ReleaseRead(). Otherwise concurrent readers will corrupt 'Readers'.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting - matt.j
Hello,
I've written a simple reader-writer lock for my own needs and am just after another pair of eyes to identify any issues I may have missed.
[...]
I've written a simple reader-writer lock for my own needs and am just after another pair of eyes to identify any issues I may have missed.
[...]
Read this whole thread:
Also, here is the best FIFO rw-spinlock out there:
http://groups.google.com/group/comp.programming.threads/browse_frm/thread/be3871ad661efa73/ac0c61ee539625de
I modified it to block, however you can easily change it back to a spinlock if you simply remove the eventcount logic and add a backoff/yield in the wait loops.
I modified it to block, however you can easily change it back to a spinlock if you simply remove the eventcount logic and add a backoff/yield in the wait loops.
Any thoughts?

Reply
Topic Options
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page