topic SMP locking in Intel® Moderncode for Parallel Architectures

SMP locking

rkmanikanta — Fri, 13 Aug 2010 14:22:45 GMT

Hi,

I would like to know more about implementing an SMP lock.

Does it have any HW dependency?

What happens if two processors try to execute the same instruction in the SMP case? How do we achieve locking in this case when only one OS is present in memory?

Thanks,

Mani

Aubrey_W_ — Tue, 17 Aug 2010 22:21:30 GMT

Hello Mani,

Here are some articles that might be helpful:
http://software.intel.com/en-us/articles/effective-implementation-of-locks-using-spin-locks/
http://www.embeddedintel.com/special_features.php?article=240

Let me know if you were looking for something else.

==
Aubrey W.
Intel Software Network Support

ClayB — Tue, 17 Aug 2010 22:34:46 GMT

Mani -

A simple lock can be implemented with the atomic Compare-and-Swap (CAS) instruction. Here the lock variable holds '1' when the lock is free and '0' when it is held. To lock, a thread compares the lock value to '1' and swaps a '0' into the lock if the comparison succeeds. Other threads attempting the same operation after a successful locking will find the lock value == 0 and must spin until the value changes back to 1. To unlock, a thread simply stores a '1' back into the lock variable (or uses CAS to compare with '0' and swap in a '1' on success).

Since the operation is atomic, only one thread at a time can execute the instruction to completion on a given lock variable.

There is such an instruction on the Itanium processor (cmpxchg). IA-32 and Intel 64 processors have one as well: CMPXCHG, which is made atomic across processors with the LOCK prefix. There is also a Windows intrinsic that can be called to execute a CAS atomically (InterlockedCompareExchange).
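The scheme described above can be sketched in portable C++11 with std::atomic instead of a platform intrinsic. This is only an illustration: the names CasLock and cas_lock_demo are made up for this example, and yielding while spinning is an assumed policy rather than something the description requires.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Convention from the description above: 1 = free, 0 = held.
class CasLock {
    std::atomic<int> state_{1};
public:
    void lock() {
        int expected = 1;
        // On failure compare_exchange_weak overwrites `expected` with the
        // value it found, so it has to be reset before the next attempt.
        while (!state_.compare_exchange_weak(expected, 0,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed)) {
            expected = 1;
            std::this_thread::yield();  // assumed back-off policy
        }
    }
    void unlock() { state_.store(1, std::memory_order_release); }
};

// Hypothetical demo: several threads increment a plain int under the lock
// and the final total is returned.
int cas_lock_demo(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    CasLock lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // protected by the lock
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

If the lock works, cas_lock_demo(4, 10000) returns exactly 40000; without the lock the plain increments could be lost.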

Hope something here helps.

--clay

rkmanikanta — Wed, 18 Aug 2010 10:04:58 GMT

Hi Aubrey,

Thanks for the links.

Rgds,

Mani

rkmanikanta — Wed, 18 Aug 2010 10:06:17 GMT

Hi Clay,

I have a question:

For SMP locking, is HW support necessary? If not, how can SW do it?

Will a normal semaphore do the job for me?

Thanks,

Mani

Dmitry_Vyukov — Wed, 18 Aug 2010 10:21:11 GMT

Quoting rkmanikanta

For SMP locking, is HW support necessary? If not, how can SW do it?
Will a normal semaphore do the job for me?

It's impossible to implement locking without hardware support in some form. But every SMP-capable hardware platform has to have such support.

jimdempseyatthecove — Wed, 18 Aug 2010 13:14:50 GMT

Mani,

In addition to CAS you can also use an atomic Swap or an atomic Add/Increment.

The following is a non-fair lock

volatile long YourLock = 0;
...
// lock
while(InterlockedExchange(&YourLock, 1) != 0)
    _mm_pause();
...
// unlock
YourLock = 0;

---------------------------------
The following is a fair lock

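struct FairLock
{
    volatile long A; // next ticket to hand out
    volatile long B; // ticket currently being served
    FairLock() { A = B = 0; }
    ~FairLock() { Lock(); } // blocks until every earlier ticket holder has unlocked
    void Lock()
    {
        // InterlockedExchangeAdd returns the value A had before the add
        long myTurn = InterlockedExchangeAdd(&A, 1);
        while(B != myTurn)
            _mm_pause();
    }
    void Unlock()
    {
        ++B; // only the current holder writes B
    }
};
// RAII guard: locks in the constructor, unlocks in the destructor
struct FairLockLock
{
    FairLock* aFairLock;
    FairLockLock(FairLock* _aFairLock)
    {
        aFairLock = _aFairLock;
        aFairLock->Lock();
    }
    ~FairLockLock() { aFairLock->Unlock(); }
};

...
FairLock YourLock;
...
{
    FairLockLock lock(&YourLock);
    // ... code runs holding lock on YourLock
} // dtor unlocks lock on YourLock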

(above is untested code)

There are many ways to perform locks.
The above methods are suitable for briefly held locks (using _mm_pause()).
For longer-held locks you may wish to consider using Yield() or other less CPU-hogging methods.
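For readers not on Windows, the fair (ticket) lock can be sketched with C++11 std::atomic. This is a rough, lightly exercised illustration: TicketLock and ticket_lock_demo are hypothetical names, and the spin limit of 64 before yielding is an arbitrary assumption chosen to show the "spin briefly, then yield" idea.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>   // std::lock_guard
#include <thread>
#include <vector>

class TicketLock {
    std::atomic<unsigned> next_{0};     // next ticket to hand out (A above)
    std::atomic<unsigned> serving_{0};  // ticket currently served (B above)
public:
    void lock() {
        const unsigned my_turn = next_.fetch_add(1, std::memory_order_relaxed);
        unsigned spins = 0;
        while (serving_.load(std::memory_order_acquire) != my_turn) {
            // Assumed policy: spin a little, then give up the time slice.
            if (++spins > 64) std::this_thread::yield();
        }
    }
    void unlock() {
        // Only the holder advances serving_, which admits the next ticket.
        serving_.fetch_add(1, std::memory_order_release);
    }
};

// Hypothetical demo: std::lock_guard plays the role of FairLockLock.
int ticket_lock_demo(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    TicketLock lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<TicketLock> guard(lock);
                ++counter;  // runs holding the lock
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Because TicketLock exposes lock() and unlock(), the standard std::lock_guard can be used as the scoped guard, so no separate guard class is needed.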

Jim Dempsey

Chris_M__Thomasson — Wed, 01 Sep 2010 01:05:47 GMT

FWIW, you can do a simple mutex with atomic swap and a binary semaphore. Here is some pseudo-code, memory barriers omitted for clarity:

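[bash]struct mutex
{
    // 0 = unlocked, 1 = locked, 2 = locked and there may be waiters
    atomic_word m_state; // = 0
    binary_semaphore m_waitset;

    void lock()
    {
        // fast path: the swap returns 0 if the mutex was free
        if (ATOMIC_SWAP(&m_state, 1))
        {
            // slow path: mark the mutex as contended, sleep until it was free
            while (ATOMIC_SWAP(&m_state, 2))
            {
                m_waitset.wait();
            }
        }
    }

    void unlock()
    {
        // wake a waiter only if the mutex was marked as contended
        if (ATOMIC_SWAP(&m_state, 0) == 2)
        {
            m_waitset.post();
        }
    }
};[/bash]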
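A compilable C++11 take on the pseudo-code above might look like the following. Treat it as a sketch under assumptions: BinarySemaphore is a stand-in built from a condition variable (an OS semaphore or C++20's std::binary_semaphore would normally fill that role), the atomics use the default sequentially consistent ordering to supply the barriers the pseudo-code omits, and the names SwapMutex and swap_mutex_demo are invented for this example.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in binary semaphore (assumption, see the note above).
class BinarySemaphore {
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
public:
    void wait() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return signaled_; });
        signaled_ = false;  // consume the single token
    }
    void post() {
        { std::lock_guard<std::mutex> lk(m_); signaled_ = true; }
        cv_.notify_one();
    }
};

class SwapMutex {
    std::atomic<int> state_{0};  // 0 = free, 1 = held, 2 = held, maybe waiters
    BinarySemaphore waitset_;
public:
    void lock() {
        if (state_.exchange(1) != 0) {         // fast path failed
            while (state_.exchange(2) != 0) {  // mark contended
                waitset_.wait();               // sleep until an unlock posts
            }
        }
    }
    void unlock() {
        if (state_.exchange(0) == 2) {  // someone may be waiting
            waitset_.post();
        }
    }
};

// Hypothetical demo: threads increment a plain int under the mutex.
int swap_mutex_demo(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SwapMutex mtx;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                mtx.lock();
                ++counter;
                mtx.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Unlike the spin locks earlier in the thread, a contended thread here blocks in wait() instead of burning CPU, which is the point of pairing the swap with a semaphore.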

You can also create a very nifty bakery-style read/write spinlock using atomic fetch-and-add. Joe Seigh created this extremely neat algorithm; memory barriers omitted for clarity:

[bash]struct rwspinlock { 
    enum constant 
    { 
        READ_ACCESS  = 0x10000, 
        WRITE_ACCESS = 1 
    }; 


    atomic_word m_next;     // = 0 
    atomic_word m_current;  // = 0 


    bool prv_check_read(atomic_word ticket) 
    { 
        return ticket == (ATOMIC_LOAD(&m_current) % READ_ACCESS);
    } 


    bool prv_check_write(atomic_word ticket) 
    { 
        return (ticket == ATOMIC_LOAD(&m_current)); 
    } 


    void rdlock() 
    { 
        atomic_word ticket = ATOMIC_FAA(&m_next, READ_ACCESS) % READ_ACCESS; 
        while (! prv_check_read(ticket)) cpu_yield();
    } 


    void rdunlock() 
    { 
        ATOMIC_FAA(&m_current, READ_ACCESS); 
    } 


    void wrlock() 
    { 
        atomic_word ticket = ATOMIC_FAA(&m_next, WRITE_ACCESS); 
        while (! prv_check_write(ticket)) cpu_yield();
    } 


    void wrunlock() 
    { 
        ATOMIC_FAA(&m_current, WRITE_ACCESS); 
    } 
}; [/bash]
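The same read/write spinlock can be written against C++11 std::atomic, roughly as below. This is a sketch and not Joe Seigh's original code: the 32-bit word size, the use of std::this_thread::yield() for cpu_yield(), the default sequentially consistent ordering in place of the omitted barriers, and the names RwSpinLock and rw_lock_demo are all assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

class RwSpinLock {
    enum : std::uint32_t { READ_ACCESS = 0x10000, WRITE_ACCESS = 1 };
    std::atomic<std::uint32_t> next_{0};     // tickets handed out
    std::atomic<std::uint32_t> current_{0};  // tickets released
public:
    void rdlock() {
        // The low 16 bits of the ticket count the writers queued ahead.
        const std::uint32_t ticket = next_.fetch_add(READ_ACCESS) % READ_ACCESS;
        while (ticket != current_.load() % READ_ACCESS) std::this_thread::yield();
    }
    void rdunlock() { current_.fetch_add(READ_ACCESS); }
    void wrlock() {
        // A writer waits for every earlier reader and writer to release.
        const std::uint32_t ticket = next_.fetch_add(WRITE_ACCESS);
        while (ticket != current_.load()) std::this_thread::yield();
    }
    void wrunlock() { current_.fetch_add(WRITE_ACCESS); }
};

// Hypothetical demo: writers keep a == b, readers verify that under a read
// lock. Returns the final value of a, or -1 if a reader ever saw a != b.
long rw_lock_demo(int writers, int readers, int iterations) {
    assert(writers > 0 && readers >= 0 && iterations >= 0);
    RwSpinLock lock;
    long a = 0, b = 0;
    std::atomic<bool> torn{false};
    std::vector<std::thread> pool;
    for (int w = 0; w < writers; ++w)
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.wrlock();
                ++a; ++b;
                lock.wrunlock();
            }
        });
    for (int r = 0; r < readers; ++r)
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.rdlock();
                if (a != b) torn.store(true);
                lock.rdunlock();
            }
        });
    for (auto& t : pool) t.join();
    return torn.load() ? -1 : a;
}
```

With two writers doing 2000 iterations each, rw_lock_demo(2, 2, 2000) should return 4000; a return of -1 would mean a reader overlapped a writer.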