SMP locking

rkmanikanta · ‎08-13-2010

Hi,

I would like to know more about implementing an SMP lock.

Does it have any HW dependency?

What happens if two processors try to execute the same instruction in the SMP case? How do we achieve locking in this case when only one OS is present in memory?

Thanks,

Mani

Aubrey_W_ · ‎08-17-2010

Hello Mani,

Here are some articles that might be helpful:
http://software.intel.com/en-us/articles/effective-implementation-of-locks-using-spin-locks/
http://www.embeddedintel.com/special_features.php?article=240

Let me know if you were looking for something else.

==
Aubrey W.
Intel Software Network Support

ClayB · ‎08-17-2010

Mani -

A simple lock can be implemented with the atomic Compare-and-Swap (CAS) instruction. To lock, a thread compares the lock value to '1' and swaps '0' into the lock if the comparison succeeds. Other threads attempting the same operation after a successful lock will find the lock value == 0 and must spin until the value changes back to 1. To unlock, a thread simply stores a '1' back into the lock variable (or uses CAS to compare with '0' and swap in a '1' on success).

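Since the operation is atomic, the hardware serializes simultaneous attempts on the same lock variable: only one thread at a time can execute the instruction to completion, so exactly one of them sees the '1' and acquires the lock.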
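As a rough illustration of the scheme described above, here is a minimal sketch in portable C++11 that uses std::atomic instead of a Windows intrinsic. The names CasLock and count_with_cas_lock are made up for this example, and the yield in the spin loop is just one possible way to wait. The 1 = free, 0 = held convention follows the post.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical name. 1 means free and 0 means held, as in the post above.
class CasLock {
    std::atomic<int> state_{1};
public:
    void lock() {
        int expected = 1;
        // Atomically: if state_ == 1, store 0 and succeed; otherwise fail.
        while (!state_.compare_exchange_weak(expected, 0,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed)) {
            expected = 1;               // a failed CAS overwrites 'expected'
            std::this_thread::yield();  // stand-in for a pause/backoff hint
        }
    }
    void unlock() {
        // Unlocking a lock that is not held would be a bug in the caller.
        assert(state_.load(std::memory_order_relaxed) == 0);
        // Only the holder gets here, so a release store is enough.
        state_.store(1, std::memory_order_release);
    }
};

// Illustrative helper: several threads increment a plain counter under the lock.
inline long count_with_cas_lock(int threads, int iterations) {
    CasLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // protected by the lock
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

If the lock works, count_with_cas_lock(4, 10000) returns 40000, because no increment is lost.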

There is such an instruction on the Itanium processor (cmpxchg), and IA-32 and Intel 64 processors provide one as well (CMPXCHG, used with the LOCK prefix). There is a Windows intrinsic that can be called to execute a CAS atomically (InterlockedCompareExchange).

Hope something here helps.

--clay

rkmanikanta · ‎08-18-2010

Hi Aubrey,

Thanks for the links.

Rgds,

Mani

rkmanikanta · ‎08-18-2010

Hi Clay,

I have a question:

For SMP locking, is HW support necessary? If not, how can SW do it?

Will a normal semaphore do the job for me?

Thanks,

Mani

Dmitry_Vyukov · ‎08-18-2010

Quoting rkmanikanta

For SMP locking, is HW support necessary? If not, how can SW do it?
Will a normal semaphore do the job for me?

It's impossible to implement locking without hardware support in some form, but every SMP-capable platform has to provide such support.

jimdempseyatthecove · ‎08-18-2010

Mani,

In addition to CAS, you can also use an atomic Swap or an atomic Add/Increment.

The following is a non-fair lock

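#include <windows.h>   // InterlockedExchange
#include <emmintrin.h> // _mm_pause

volatile long YourLock = 0; // 0 = free, 1 = held
...
// lock: atomically swap in 1; a return value of 0 means we acquired it
while(InterlockedExchange(&YourLock, 1) != 0)
    _mm_pause();
...
// unlock (a plain store; this relies on x86 ordering and MSVC volatile semantics)
YourLock = 0;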

---------------------------------
The following is a fair lock

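struct FairLock
{
    volatile long A; // next ticket to hand out
    volatile long B; // ticket currently being served
    FairLock() { A = B = 0; }
    ~FairLock() { Lock(); } // wait for any current holders before destruction
    void Lock()
    {
        // take a ticket; InterlockedExchangeAdd returns the value before the add
        long myTurn = InterlockedExchangeAdd(&A, 1);
        while(B != myTurn)
            _mm_pause();
    }
    void Unlock()
    {
        ++B; // only the lock holder writes B, so a plain increment is enough
    }
};
// scoped guard: locks in the ctor, unlocks in the dtor
struct FairLockLock
{
    FairLock* aFairLock;
    FairLockLock(FairLock* _aFairLock)
    {
        aFairLock = _aFairLock;
        aFairLock->Lock();
    }
    ~FairLockLock() { aFairLock->Unlock(); }
};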

...
FairLock YourLock;
...
{
FairLockLock lock(&YourLock);
... code runs holding lock on YourLock
} // dtor unlocks lock on YourLock

(above is untested code)
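For readers who are not on Windows, the same ticket idea can be sketched with std::atomic. This is only an illustration under my own assumptions: TicketLock and count_with_ticket_lock are hypothetical names, std::lock_guard plays the role of the FairLockLock guard, and the spin loop yields instead of calling _mm_pause().

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical portable take on the fair (ticket) lock above.
class TicketLock {
    std::atomic<unsigned> next_{0};     // next ticket to hand out ("A" above)
    std::atomic<unsigned> serving_{0};  // ticket now being served ("B" above)
public:
    void lock() {
        // fetch_add returns the value before the add: that is our ticket.
        const unsigned my_turn = next_.fetch_add(1, std::memory_order_relaxed);
        while (serving_.load(std::memory_order_acquire) != my_turn)
            std::this_thread::yield();
    }
    void unlock() {
        // Only the holder writes serving_, so load + store is sufficient.
        const unsigned cur = serving_.load(std::memory_order_relaxed);
        serving_.store(cur + 1, std::memory_order_release);
    }
};

// Illustrative helper: checks mutual exclusion while counting.
inline long count_with_ticket_lock(int threads, int iterations) {
    TicketLock lock;
    long counter = 0;
    bool inside = false;  // true while some thread is in the critical section
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<TicketLock> guard(lock);  // unlocks at scope exit
                assert(!inside);  // nobody else may be in here
                inside = true;
                ++counter;
                inside = false;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Because TicketLock exposes lock() and unlock(), the standard guard classes work with it directly, so no hand-written guard struct is needed.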

There are many ways to implement locks.
The above methods are suitable for briefly held locks (they busy-wait using _mm_pause()).
For longer-held locks you may wish to consider yielding the thread (for example SwitchToThread() on Windows) or other less CPU-hogging methods.
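One way to combine the two approaches is to spin for a bounded number of iterations and then give up the time slice. The sketch below assumes std::this_thread::yield() as the yielding call. The names SpinThenYieldLock, kSpinLimit and count_with_spin_yield_lock are invented for the example, and the limit of 64 is an arbitrary value, not a tuned one.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

constexpr int kSpinLimit = 64;  // arbitrary; a real value needs measurement

// Hypothetical lock: busy-wait briefly, then yield to the scheduler.
class SpinThenYieldLock {
    std::atomic<bool> held_{false};
public:
    void lock() {
        int spins = 0;
        // exchange returns the previous value: false means we got the lock.
        while (held_.exchange(true, std::memory_order_acquire)) {
            // Wait on plain loads so waiters do not keep writing the line.
            while (held_.load(std::memory_order_relaxed)) {
                if (++spins >= kSpinLimit) {
                    std::this_thread::yield();  // let another thread run
                    spins = 0;
                }
                // On x86 a pause hint such as _mm_pause() could go here.
            }
        }
    }
    void unlock() { held_.store(false, std::memory_order_release); }
};

// Illustrative helper: checks mutual exclusion while counting.
inline long count_with_spin_yield_lock(int threads, int iterations) {
    SpinThenYieldLock lock;
    long counter = 0;
    bool inside = false;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                assert(!inside);  // nobody else may be in here
                inside = true;
                ++counter;
                inside = false;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Yielding helps most when there are more runnable threads than cores, since a spinning waiter may otherwise be occupying the core the lock holder needs.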

Jim Dempsey

Chris_M__Thomasson · ‎08-31-2010

FWIW, you can do a simple mutex with atomic swap and a binary semaphore. Here is some pseudo-code, memory barriers omitted for clarity:

[bash]struct mutex
{
    atomic_word m_state; // = 0
    binary_semaphore m_waitset;


    void lock()
    {
        if (ATOMIC_SWAP(&m_state, 1))
        {
            while (ATOMIC_SWAP(&m_state, 2))
            {
                m_waitset.wait();
            }
        }
    }


    void unlock()
    {
        if (ATOMIC_SWAP(&m_state, 0) == 2)
        {
            m_waitset.post();
        }
    }
};[/bash]
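To make the pseudo-code concrete, here is one possible C++11 rendering. Several things in it are assumptions on my part: BinarySemaphore is a small stand-in built from std::mutex and std::condition_variable where a real implementation would use an OS semaphore, SwapMutex and count_with_swap_mutex are hypothetical names, and the atomic operations use the default sequentially consistent ordering instead of hand-placed barriers.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for an OS binary semaphore (assumption, see the lead-in).
class BinarySemaphore {
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
public:
    void wait() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return signaled_; });
        signaled_ = false;  // consume the signal
    }
    void post() {
        {
            std::lock_guard<std::mutex> lk(m_);
            signaled_ = true;  // repeated posts collapse into one
        }
        cv_.notify_one();
    }
};

// States: 0 = free, 1 = held, 2 = held and there may be waiters.
class SwapMutex {
    std::atomic<int> state_{0};
    BinarySemaphore waitset_;
public:
    void lock() {
        if (state_.exchange(1) != 0) {         // fast path failed
            while (state_.exchange(2) != 0) {  // mark contended and retry
                waitset_.wait();               // sleep until an unlock posts
            }
        }
    }
    void unlock() {
        if (state_.exchange(0) == 2) {  // someone may be sleeping
            waitset_.post();
        }
    }
};

// Illustrative helper: checks mutual exclusion while counting.
inline long count_with_swap_mutex(int threads, int iterations) {
    SwapMutex mtx;
    long counter = 0;
    bool inside = false;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                mtx.lock();
                assert(!inside);  // nobody else may be in here
                inside = true;
                ++counter;
                inside = false;
                mtx.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

A thread that acquires the mutex through the slow path leaves the state at 2, so its unlock always posts. That is what keeps a second sleeper from being stranded, at the cost of an occasional spurious wakeup.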

You can also create a very nifty bakery-style read/write spinlock using atomic fetch-and-add. Joe Seigh created this extremely neat algorithm; memory barriers omitted for clarity:

[bash]struct rwspinlock { 
    enum constant 
    { 
        READ_ACCESS  = 0x10000, 
        WRITE_ACCESS = 1 
    }; 


    atomic_word m_next;     // = 0 
    atomic_word m_current;  // = 0 


    bool prv_check_read(atomic_word ticket) 
    { 
        return (ticket == (ATOMIC_LOAD(&m_current) % READ_ACCESS));
    } 


    bool prv_check_write(atomic_word ticket) 
    { 
        return (ticket == ATOMIC_LOAD(&m_current)); 
    } 


    void rdlock() 
    { 
        atomic_word ticket = ATOMIC_FAA(&m_next, READ_ACCESS) % READ_ACCESS; 
        while (! prv_check_read(ticket)) cpu_yield();
    } 


    void rdunlock() 
    { 
        ATOMIC_FAA(&m_current, READ_ACCESS); 
    } 


    void wrlock() 
    { 
        atomic_word ticket = ATOMIC_FAA(&m_next, WRITE_ACCESS); 
        while (! prv_check_write(ticket)) cpu_yield();
    } 


    void wrunlock() 
    { 
        ATOMIC_FAA(&m_current, WRITE_ACCESS); 
    } 
}; [/bash]
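The read/write spinlock above translates fairly directly to std::atomic. The following is a sketch under assumptions of my own: 32-bit unsigned counters, default sequentially consistent operations in place of explicit barriers, std::this_thread::yield() for cpu_yield(), and the invented names RwTicketSpinlock and rw_lock_demo.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

constexpr std::uint32_t kReadAccess  = 0x10000;  // readers count in the high half
constexpr std::uint32_t kWriteAccess = 1;        // writers count in the low half

class RwTicketSpinlock {
    std::atomic<std::uint32_t> next_{0};     // tickets handed out
    std::atomic<std::uint32_t> current_{0};  // tickets released
public:
    void rdlock() {
        // A reader only has to wait for the writers that arrived before it.
        const std::uint32_t ticket = next_.fetch_add(kReadAccess) % kReadAccess;
        while (current_.load() % kReadAccess != ticket)
            std::this_thread::yield();
    }
    void rdunlock() { current_.fetch_add(kReadAccess); }
    void wrlock() {
        // A writer waits for every earlier reader and writer to release.
        const std::uint32_t ticket = next_.fetch_add(kWriteAccess);
        while (current_.load() != ticket)
            std::this_thread::yield();
    }
    void wrunlock() { current_.fetch_add(kWriteAccess); }
};

// Illustrative helper: writers keep a == b; readers verify that invariant.
inline long rw_lock_demo(int writers, int readers, int iterations) {
    RwTicketSpinlock lock;
    long a = 0, b = 0;
    std::atomic<bool> consistent{true};
    std::vector<std::thread> pool;
    for (int w = 0; w < writers; ++w) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.wrlock();
                ++a;
                ++b;
                lock.wrunlock();
            }
        });
    }
    for (int r = 0; r < readers; ++r) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.rdlock();
                if (a != b) consistent.store(false);  // torn update observed
                lock.rdunlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    assert(consistent.load());
    return a;
}
```

With this layout the low 16 bits count writers, so the scheme assumes fewer than 65536 writers are ever outstanding at the same time.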