Topics on semaphore

A couple of days ago, I had a phone interview with Cisco. During the interview, I had been asking several questions about multi-threads programming, which embarrassed me a lot. After review the knowledge of Operating System, I realize several basic concepts about semaphore to mark.

1. What is semaphore?

In Wikipedia, semaphore is defined as a variable or data structure that provides simple but useful abstraction for controlling access by multiple processes to a common resource in a parallel programming environment.

We cannot talk about semaphore without mutex? So,

2. What is mutex?

Mutex is a way of mutual exclusion referring to the problems of ensuring that no two processes or threads can be in their critical section at same time. Here, a critical section refers to a period of time when process accesses a shared resource, such as shared memory.

In a word, mutex is a key that only one thread can hold and unlock the door, while other threads are waiting for this threads leaving and releasing the key.

Now let we see an example, ATM problem.

Husband, Bob, and wife, Ann, withdraw $100 each from their joint account by different ATM at same time. The thread for the first ATM deducts $100 from the couple’s account, but the thread is switched out after executing this line:

newBalance = userBalance - amount;  // amount is $100 in this scenario

userBalance = newBalance;

Processor control then switches to the thread for Ann’s ATM, which also deducting $100. When that thread deducts $100, the account balance is still $500 because the variable, userBalance, has not yet been updated. Ann’s thread executes until completing this function and updates the value of userBalance to $400. Then control switches back to Bob’s transaction.  Bob’s thread has the value $400 in newBalance. Therefore, it simply assigns this value to userBalance and returns. Thus, Bob and Ann have deducted $200 total from their account, but their balance still indicates $400, or a net $100 withdraw. This is a great feature for Bob and Ann, but a big problem for the bank.

So, here is solution by mutex:


// only thread with mutex can process
synchronized bool deposit() {}

synchronized bool withdraw() {}

OK. Since we know what mutex is, it is very natural to ask a  question that

3.  What is difference between semaphore and mutex?

Mutex is a binary semaphore, but semaphore is not mutex. Mutex is always being used as mutual exclusion, e.g. content protection, while semaphore can do more than mutex. Semaphore is a more general concept that limit numbers of threads accessing shared resources concurrently.

Let me use a metaphor from StackOverflow. You can use water to wash your dish, while I can use water to wash my dish. However, the total water is limited. Semaphore is the number of people can use water to wash dishes at same time.

4. What can we do with semaphore?

First, let me introduce syntax of semaphore. To make is simple, I use pseudo-code for semaphore in this article.

The syntax for creating  and initializing it is

fred = semaphore(1)

Semaphore operations are

fred.signal();
fred.wait();

The value of semaphore increase by 1 when signal() is called, while decreased by 1 when wait() is called. The threads will be blocked when value of semaphore is negative and wait() is called. After value of semaphore become positive, the threads will be unblocked.

Let us see the use of semaphore.

(1). Mutual exclusion, which is mentioned above.

(2). Signal. Semaphore is a way of communication between processes/threads.

Assume we have a semaphore named sem with initial value 0, and that Thread A and Thread B can access it.

//Thead A
statement a1;
sem.signal();
//Thead B
sem.wait();
statement b1;

The statement represent arbitrary program statement. To make the example concrete, we assume a1 reads a line from file and b1 displays that line on screen. The semaphore in this program guarantees that Thread A completes a1 before Thread B begins b1.
Here is how it works, if  Thread B comes first, it calls wait() that decrease value of sem to -1. Thus, Thread B is blocked until Thread A comes and completes with calling signal(), which increase value of sem. Thread B is unblock and resumes to process statement b1.

Similarly, if Thread A comes first, it will increase value of sem by 1. When Thread B comes, it will process immediately.  Either way will guarantee statement a1 precedes statement b1.

Signaling is a basic use of semaphore, more uses will be introduced later.

There is a classic multi-threading problem: Producer and Consumer can be solved by semaphore.

Q: In one common pattern, some threads are producers and some are consumers. Producers create items of some kind and add them to a data structure; consumers remove the items and process them. Event-driven programs are a good example. An “event” is something that happens that requires the program to respond: the user presses a key or moves the mouse, a block of data arrives from the disk, a packet arrives from the network, a pending operation completes. Whenever an event occurs, a producer thread creates an event object and adds it to the event buffer. Concurrently, consumer threads take events out of the buffer and process them. In this case, the consumers are called “event handlers.”

A: There are several synchronization constraints that we need to enforce to make system work correctly:

  • Accessing to buffer/content is mutually exclusive among multiple threads.
  • Consumer’s thread is blocked when buffer is empty.
  • Producer’s thread is blocked when buffer is full.

Ok, here is my code for producer/consumer problem.

// initialization
mutex = semaphore(1);
notEmpty = semaphore(0);
notFull = semaphore(buffer.size());

// code for Thread producer
notFull.wait();
mutex.wait();
producer statement;
mutex.signal();
notEmpty.signal();

//code for Thread consumer
notEmpty.wait();
mutex.wait();
consumer statement;
mutex.signal();
notFull.signal();
Design a site like this with WordPress.com
Get started