Thread-safety is a computer programming concept applicable to multi-threaded programs. A piece of code is thread-safe if it is reentrant or protected from multiple simultaneous execution by some form of mutual exclusion.
Thread-safety is a key challenge in multi-threaded programming. Once the concern of operating-system programmers alone, it has become a commonplace issue for the everyday programmer to tackle. In a multi-threaded program, several threads execute simultaneously in a shared address space: every thread has access to virtually all the memory of every other thread. As a result, the flow of control and the sequence of accesses to data often bear little relation to what one would reasonably expect from reading the text of the program. This violates the principle of least astonishment. Thread-safety is a property aimed for so as to minimize such surprising behaviour, by re-establishing some of the correspondence between the actual flow of control and the text of the program.
The requirement for thread-safety highlights the inherent tension in multi-threaded programming: the need for multiple threads to access the same shared data, and the need for a shared piece of data to be accessed by only one thread at any given time.
It is not easy to determine whether a piece of code is thread-safe. However, several indicators suggest the need for careful examination to see whether it is unsafe:
- Accessing global variables or the heap.
- Allocating or freeing resources that have global limits (files, sub-processes, etc.).
- Accessing data indirectly, through handles or pointers.
A subroutine that uses only variables from the stack, depends only on the arguments passed in, and calls only other subroutines with similar properties is reentrant, and thus thread-safe.
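The contrast can be sketched as follows (a minimal illustration in Python; the function names are invented for this example). The first function uses only its arguments and local variables and is therefore reentrant; the second keeps its running total in module-level state, so interleaved calls from several threads can interfere with one another.

```python
import threading

def scaled_sum(values, factor):
    """Reentrant: touches only its arguments and local variables."""
    total = 0              # per-call state lives on this call's stack
    for v in values:
        total += v * factor
    return total

_running_total = 0         # shared, module-level state

def unsafe_accumulate(values, factor):
    """Not reentrant: concurrent calls race on _running_total."""
    global _running_total
    for v in values:
        _running_total += v * factor
    return _running_total

# scaled_sum can safely be called from many threads at once.
results = []
threads = [
    threading.Thread(target=lambda: results.append(scaled_sum([1, 2, 3], 2)))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each concurrent call of `scaled_sum` returns the same correct result, because no call can observe or disturb another call's state.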
As seen in the definition, there are a few ways to achieve thread-safety:
- Reentrancy: writing code in such a way that it avoids sharing data across threads, so that each thread operates only on its own state.
- Mutual exclusion: access to shared data is serialized using mechanisms that ensure only one thread is accessing the shared data at any time. If a piece of code accesses multiple shared pieces of data, mutual-exclusion mechanisms must be used with great care: the problems include race conditions, deadlocks, livelocks, starvation, and the various other ills enumerated in many operating systems textbooks.
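Mutual exclusion in its simplest form can be sketched as follows (Python's `threading.Lock` is used here; the counter and thread counts are arbitrary choices for the illustration). Without the lock, two threads could read the same old value of `counter` and each write back `old + 1`, losing an update.

```python
import threading

counter = 0                       # shared data
counter_lock = threading.Lock()   # guards every access to `counter`

def increment(n):
    global counter
    for _ in range(n):
        with counter_lock:        # at most one thread in here at a time
            counter += 1          # read-modify-write is now atomic

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter is now exactly 4 * 10_000; without the lock, lost updates
# could leave it smaller.
```

Note that the lock protects the data only if every code path that touches `counter` acquires the same lock; one unguarded access is enough to reintroduce the race.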
A commonly used idiom combines both approaches:
- Make changes to a private copy of the data and, finally, atomically update the shared data from the private copy. Thus most of the code is close to reentrant, and the amount of time spent serialized is small.
The concept of exception safety is closely related, since it again deals with (synchronous) flows of control not directly correlated to the text of a program.