Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Pthread Read/Write Locks

ClayB
New Contributor I
874 Views
Recently, I've seen the specifications for the pthread_rwlock_* set of functions. I always knew read/write locks could be constructed by hand, and Butenhof's Programming with POSIX Threads (1996) mentions the POSIX.1j API. While many different systems support this functionality (e.g., Compaq Tru64, FreeBSD, Solaris, Win32 Pthreads, Linux), the documentation I've seen only mentions conformance of the API to version 2 of the Single UNIX Specification (SUSv2). But all of this documentation seems to have been written in 1998 or so. Has there been official adoption of these functions into the POSIX standard?

Has anyone had occasion to use the read/write locks API? Was the experience good or bad? What about performance? Looking for more info on this topic, I came across a performance study that reported R/W locks took about twice as long as a simple mutex. Would the advantage of having multiple readers in the protected region of code offset this extra time needed to use R/W locks?

-- clay
3 Replies
Intel_C_Intel
Employee
> Would the advantage of having multiple readers in
> the protected region of code offset this extra time
> needed to use R/W locks?

It depends on what you need to protect...

If you need to sync access to a hash list where iteration by other threads is common but pushes and pops are not, a read/write lock protecting each bucket in the hash list would perform better than a normal lock would.

If the collection was getting heavy pushes and pops, you would use a normal lock.
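To make the per-bucket idea concrete, here is a minimal sketch in C using the pthread_rwlock_* API. The names (table, bucket, table_get, table_put), the bucket count, and the fixed key size are all made up for illustration; this is one possible layout, not a tuned implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64   /* hypothetical size, chosen only for the example */

typedef struct node { struct node *next; int value; char key[32]; } node;
typedef struct { pthread_rwlock_t lock; node *head; } bucket;
typedef struct { bucket b[NBUCKETS]; } table;

static unsigned hash(const char *s)
{
    unsigned h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

int table_init(table *t)
{
    for (int i = 0; i < NBUCKETS; i++) {
        t->b[i].head = NULL;
        if (pthread_rwlock_init(&t->b[i].lock, NULL) != 0) return -1;
    }
    return 0;
}

/* Lookup takes only a read lock, so many threads can scan a bucket at once. */
int table_get(table *t, const char *key, int *out)
{
    assert(key != NULL);
    bucket *b = &t->b[hash(key)];
    int found = 0;
    pthread_rwlock_rdlock(&b->lock);
    for (node *n = b->head; n; n = n->next)
        if (strcmp(n->key, key) == 0) { *out = n->value; found = 1; break; }
    pthread_rwlock_unlock(&b->lock);
    return found;
}

/* Insert or update takes the write lock, but only for the one bucket touched. */
int table_put(table *t, const char *key, int value)
{
    assert(key != NULL);
    bucket *b = &t->b[hash(key)];
    int rc = 0;
    pthread_rwlock_wrlock(&b->lock);
    node *n = b->head;
    while (n && strcmp(n->key, key) != 0) n = n->next;
    if (n) {
        n->value = value;                      /* key already present */
    } else if ((n = malloc(sizeof *n)) != NULL) {
        strncpy(n->key, key, sizeof n->key - 1);
        n->key[sizeof n->key - 1] = '\0';      /* keys longer than 31 bytes are truncated */
        n->value = value;
        n->next = b->head;
        b->head = n;
    } else {
        rc = -1;                               /* out of memory */
    }
    pthread_rwlock_unlock(&b->lock);
    return rc;
}
```

If writes dominate, the same structure with a plain pthread_mutex_t per bucket would likely be the simpler choice.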


> Looking for more info on this topic, I
> came across a performance study that reported R/W
> locks took about twice as long over simple mutex.

It is probably due to how the read/write lock was constructed. A lot of them force read access through a normal lock, to sync internal state. This is not good.

There is no need for concurrent read access to block on internal read/write lock state, unless it has to wait for a write.

They should run the performance test again, using a lock-free read/write mutex. Concurrent reads will never block on the mutex, thanks to a simple lock-free algorithm that is compatible with x86, PowerPC, x86-64, PowerPC64, etc.
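As a rough illustration of a read path that never goes through an internal mutex, here is a sketch using C11 atomics. It is not the specific algorithm referred to above; the names (spin_rwlock, WRITER_BIT, spin_rd_lock, and so on) are invented, waiting threads simply yield and retry, and writers can starve under a steady stream of readers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

#define WRITER_BIT 0x80000000u   /* top bit: a writer holds the lock; low bits: reader count */

typedef struct { atomic_uint state; } spin_rwlock;

void spin_init(spin_rwlock *l) { atomic_init(&l->state, 0u); }

/* Reader fast path: one atomic add, no mutex. */
void spin_rd_lock(spin_rwlock *l)
{
    for (;;) {
        unsigned prev = atomic_fetch_add(&l->state, 1u);
        if (!(prev & WRITER_BIT))
            return;                          /* no writer: we are in */
        atomic_fetch_sub(&l->state, 1u);     /* writer active: undo and wait */
        while (atomic_load(&l->state) & WRITER_BIT)
            sched_yield();
    }
}

void spin_rd_unlock(spin_rwlock *l)
{
    unsigned prev = atomic_fetch_sub(&l->state, 1u);
    assert((prev & ~WRITER_BIT) > 0);        /* must have held a read lock */
}

/* A writer gets in only when there are no readers and no other writer. */
int spin_wr_trylock(spin_rwlock *l)
{
    unsigned expected = 0u;
    return atomic_compare_exchange_strong(&l->state, &expected, WRITER_BIT);
}

void spin_wr_lock(spin_rwlock *l)
{
    while (!spin_wr_trylock(l))
        sched_yield();
}

void spin_wr_unlock(spin_rwlock *l)
{
    /* Clear only the writer bit; a reader may be mid-retry with a transient count. */
    atomic_fetch_and(&l->state, ~WRITER_BIT);
}
```

With no writer present, a reader pays for a single atomic read-modify-write on entry and another on exit. That cache line is still shared by all readers, so this is not free on a multiprocessor.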
jseigh2
Beginner
You have to check the individual platforms to see which level of POSIX each one supports. Also, some of the bits and pieces of POSIX are optional for implementations.
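One way to do that check from code is sketched below. It relies on the _POSIX_READER_WRITER_LOCKS option macro from <unistd.h> and the matching sysconf() name _SC_READER_WRITER_LOCKS, both defined by POSIX.1-2001 and later; the function name rwlocks_supported is invented for the example, and older or non-conforming systems may define neither symbol.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/* Returns 1 if the platform reports support for pthread_rwlock_*, else 0. */
int rwlocks_supported(void)
{
#if defined(_POSIX_READER_WRITER_LOCKS) && _POSIX_READER_WRITER_LOCKS > 0
    return 1;                                   /* always supported on this platform */
#elif defined(_SC_READER_WRITER_LOCKS)
    return sysconf(_SC_READER_WRITER_LOCKS) > 0; /* ask at run time */
#else
    return 0;                                   /* no way to tell: assume unsupported */
#endif
}
```

sysconf(_SC_VERSION) reports the POSIX revision the system claims to conform to, which can be checked the same way.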

The amount of benefit you'd get from rwlocks depends on your application's characteristics. If you have lots of read only access of more than brief duration, you'd get some benefit. If you have frequent write access, less benefit.

It also depends on the rwlock implementation: whether it's FIFO, writer preference, or reader preference.

There's nothing inherent in a rwlock that requires it to be less efficient than a mutex. You're just seeing a less-than-optimal implementation (there is apparently no shortage of people willing to write one). Some of the (reader) lock-free rwlocks aren't too bad, with anywhere from very little to zero overhead for read access. One of them, RCU, is what Linux used for a major scalability improvement.
ClayB
New Contributor I
I had not thought of the implementation as being a real issue. (I assumed implementors had done a decent job.) Certainly there are good and bad (or lazy) ways to do this, but I was thinking more of the requirements of the operation being a detriment. That is, when a read lock request is entered, the operation would not merely check whether the lock was currently held, but would also need to determine what kind of lock was being held. If a reader, then the requesting thread would be allowed entry; if a writer, the requesting thread would be blocked.
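The bookkeeping described above can be sketched as a hand-built lock on top of a mutex and a condition variable. All names here (naive_rwlock and its functions) are made up, and this is a deliberately simple reader-preference version in which every operation, including a read lock, passes through the internal mutex.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

typedef struct {
    pthread_mutex_t mtx;   /* protects the two counters below */
    pthread_cond_t  cv;    /* signalled whenever the lock may have become free */
    int readers;           /* threads currently holding the lock for reading */
    int writer;            /* 1 while a thread holds the lock for writing */
} naive_rwlock;

int naive_init(naive_rwlock *l)
{
    l->readers = 0;
    l->writer = 0;
    if (pthread_mutex_init(&l->mtx, NULL) != 0) return -1;
    if (pthread_cond_init(&l->cv, NULL) != 0) {
        pthread_mutex_destroy(&l->mtx);
        return -1;
    }
    return 0;
}

/* A reader only has to ask "is a writer inside?"; other readers do not matter. */
void naive_rdlock(naive_rwlock *l)
{
    pthread_mutex_lock(&l->mtx);
    while (l->writer)
        pthread_cond_wait(&l->cv, &l->mtx);
    l->readers++;
    pthread_mutex_unlock(&l->mtx);
}

/* A writer has to wait for both kinds of holder to leave. */
void naive_wrlock(naive_rwlock *l)
{
    pthread_mutex_lock(&l->mtx);
    while (l->writer || l->readers > 0)
        pthread_cond_wait(&l->cv, &l->mtx);
    l->writer = 1;
    pthread_mutex_unlock(&l->mtx);
}

/* One unlock for both modes: the state says which kind of hold is being released. */
void naive_unlock(naive_rwlock *l)
{
    pthread_mutex_lock(&l->mtx);
    assert(l->writer || l->readers > 0);
    if (l->writer)
        l->writer = 0;
    else
        l->readers--;
    if (l->readers == 0)
        pthread_cond_broadcast(&l->cv);
    pthread_mutex_unlock(&l->mtx);
}
```

Each lock or unlock here costs a full mutex acquire and release plus the extra checks, so it is easy to see how a construction like this could come out around twice as slow as a bare mutex when uncontended.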

There just seemed to be more bookkeeping and checking that needs to be done in any implementation, so I could well imagine a doubling of the time needed to execute the operations. I'm thinking that this would apply even to a lock-free implementation. I'll have to look into the RCU implementation and maybe run my own set of tests to determine what extra overhead might be involved.

-- clay
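A starting point for such a test might look like the sketch below. It times only the uncontended, single-threaded cost of a lock/unlock pair for a pthread mutex and for a pthread rwlock held in read mode; the function names are invented, and a realistic comparison would also need several threads and a mix of readers and writers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

static double now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Average nanoseconds per uncontended mutex lock/unlock pair. */
double mutex_ns_per_op(long iters)
{
    assert(iters > 0);
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    double t0 = now_ns();
    for (long i = 0; i < iters; i++) {
        pthread_mutex_lock(&m);
        pthread_mutex_unlock(&m);
    }
    double t1 = now_ns();
    pthread_mutex_destroy(&m);
    return (t1 - t0) / (double)iters;
}

/* Average nanoseconds per uncontended read-lock/unlock pair. */
double rdlock_ns_per_op(long iters)
{
    assert(iters > 0);
    pthread_rwlock_t rw = PTHREAD_RWLOCK_INITIALIZER;
    double t0 = now_ns();
    for (long i = 0; i < iters; i++) {
        pthread_rwlock_rdlock(&rw);
        pthread_rwlock_unlock(&rw);
    }
    double t1 = now_ns();
    pthread_rwlock_destroy(&rw);
    return (t1 - t0) / (double)iters;
}
```

Calling both with the same iteration count and printing the ratio gives a first number to compare against the study; the result will vary with the platform and the threads library.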