Non-blocking synchronization

Non-blocking synchronization

In computer science, non-blocking synchronization ensures that threads competing for a shared resource do not have their execution indefinitely postponed by mutual exclusion. Literature up to the turn of the century used "non-blocking" synonymously with "lock-free": an algorithm with guaranteed system-wide progress. However, since 2003,M. Herlihy, V. Luchangco and M. Moir. [http://www.cs.brown.edu/people/mph/HerlihyLM03/main.pdf "Obstruction-Free Synchronization: Double-Ended Queues as an Example."] 23rd International Conference on Distributed Computing Systems, 2003, p.522.] the term has been weakened to only prevent progress-blocking interactions with a preemptive scheduler.

In modern usage, therefore, an algorithm is "non-blocking" if the suspension of one or more threads will not stop the potential progress of the remaining threads. They are designed to avoid requiring a critical section. Often, these algorithms allow multiple processes to make progress on a problem without ever blocking each other. For some operations, these algorithms provide an alternative to locking mechanisms.

Motivation

The traditional approach to multi-threaded programming is to use locks to synchronize access to shared resources. Synchronization primitives such as mutexes, semaphores, and critical sections are all mechanisms by which a programmer can ensure that certain sections of code do not execute concurrently if doing so would corrupt shared memory structures. If one thread attempts to acquire a lock that is already held by another thread, the thread will block until the lock is free.

Blocking a thread, though, is undesirable for many reasons. An obvious reason is that while the thread is blocked, it cannot accomplish anything. If the blocked thread is performing a high-priority or real-time task, it is highly undesirable to halt its progress. Other problems are less obvious. Certain interactions between locks can lead to error conditions such as deadlock, livelock, and priority inversion. Using locks also involves a trade-off between coarse-grained locking, which can significantly reduce opportunities for parallelism, and fine-grained locking, which requires more careful design, increases overhead and is more prone to bugs.

Non-blocking algorithms are also safe for use in interrupt handlers: even though the preempted thread cannot be resumed, progress is still possible without it. In contrast, global data structures protected by mutual exclusion cannot safely be accessed in a handler, as the preempted thread may be the one holding the lock.

Non-blocking synchronization has the potential to prevent priority inversion, as no thread is forced to wait for a suspended thread to complete. However, as livelock is still possible in the modern definition, threads have to wait when they encounter contention; hence, priority inversion is still possible depending upon the contention management system used. Lock-free algorithms, below, avoid priority inversion.

Implementation

Non-blocking algorithms use atomic read-modify-write primitives that the hardware must provide, the most notable of which is compare and swap (CAS). Ultimately, all synchronizing algorithms must use these; however, critical sections are almost always implemented using standard interfaces over these primitives. Until recently, all non-blocking algorithms had to be written "natively" with the underlying primitives to achieve acceptable performance. However, the emerging field of software transactional memory promises standard abstractions for writing efficient non-blocking code.

Much research has also been done in providing basic data structures such as stacks, queues, sets, and hash tables. These allow programs to easily exchange data between threads asynchronously.

Wait-freedom

Wait-freedom is the strongest non-blocking guarantee of progress, combining guaranteed system-wide throughput with starvation-freedom. An algorithm is wait-free if every operation has a bound on the number of steps it will take before completing.

It was shown in the 1980sMaurice P. Herlihy. [http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=62593 "Impossibility and universality results for wait-free synchronization"] Proceedings of the seventh annual ACM Symposium on Principles of distributed computing, 1988, pp. 276 - 290.] that all algorithms can be implemented wait-free, and many transformations from serial code, called "universal constructions", have been demonstrated. However, the resulting performance does not in general match even naive blocking designs. It has also been shownF. Fich, D. Hendler, N. Shavit. [http://www.cs.tau.ac.il/~afek/Handler-conditionals.pdf "On the inherent weakness of conditional synchronization primitives."] 23rd Annual ACM Symposium on Principles of Distributed Computing, 2004, pp. 80-87.] that the widely-available atomic "conditional" primitives, compare-and-swap and LL/SC, cannot provide starvation-free implementations of many common data structures without memory costs growing linearly in the number of threads. Wait-free algorithms are therefore rare, both in research and in practice.

Lock-freedom

Lock-freedom allows individual threads to starve but guarantees system-wide throughput. An algorithm is lock-free if every step taken achieves global progress (for some sensible definition of progress). All wait-free algorithms are lock-free.

In general, a lock-free algorithm can run in four phases: completing one's own operation, assisting an obstructing operation, aborting an obstructing operation, and waiting. Completing one's own operation is complicated by the possibility of concurrent assistance and abortion, but is invariably the fastest path to completion.

The decision about when to assist, abort or wait when an obstruction is met is the responsibility of a "contention manager". This may be very simple (assist higher priority operations, abort lower priority ones), or may be more optimized to achieve better throughput, or lower the latency of prioritized operations.

Correct concurrent assistance is typically the most complex part of a lock-free algorithm, and often very costly to execute: not only does the assisting thread slow down, but thanks to the mechanics of shared memory, the thread being assisted will be slowed, too, if it is still running.

Obstruction-freedom

Obstruction-freedom is possibly the weakest natural non-blocking progress guarantee. An algorithm is obstruction-free if at any point, a single thread executed in isolation (i.e. with all obstructing threads suspended) for a bounded number of steps will complete its operation. All lock-free algorithms are obstruction-free.

Obstruction-freedom demands only that any partially-completed operation can be aborted and the changes made rolled back. Dropping concurrent assistance can often result in much simpler algorithms that are easier to validate. Preventing the system from continually live-locking is the task of a contention manager.

Obstruction-freedom is also called optimistic concurrency control.

Some obstruction-free algorithms use a pair of "consistency markers" in the data structure.Processes reading the data structure first read one consistency marker, then read the relevant data into an internal buffer, then read the other marker, and then compare the markers.When comparing the two markers, occasionally a process will notice they are different, indicating "inconsistent data".That happens when the read was interrupted by some other process updating the data structure.In that case the process discards the data in the internal buffer and tries again.When comparing the two markers, typically both markers will be identical, indicating that the data is consistent.

Recent researchW. Scherer and M. Scott. [http://www.cs.rochester.edu/u/scott/papers/2005_PODC_CM.pdf "Advanced Contention Management for Dynamic Software Transactional Memory."] 24th annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, 2005, pp. 240-248.] has yielded a promising practical contention manager, whimsically named "Polka", combining exponential backoff with "priority accumulation". As an operation progresses, it gains "priority"; when an operation is obstructed by another with higher priority, it will back off, with backoff intervals increasing exponentially. Each backoff increases the operation's priority; only when its priority is greater than that of its obstructor will it abort it. Aborted operations retain their former priority, giving their next attempt a greater chance of success.

Polka achieves good throughput in benchmarks because it minimizes both wasted effort, by prioritizing long transactions, and memory interconnect contention, using exponential backoff. This can inform other parallel algorithms, such as lock-free ones, to achieve greater throughput in the common case.

ee also

* Concurrency control
* Linearizability

References

External links

*Article " [http://www.research.ibm.com/people/m/michael/podc-1996.pdf Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms] " by Maged M. Michael and Michael L. Scott
*Discussion " [http://groups.google.com/groups?group=comp.programming.threads&threadm=c2s1qn%24mrj%247%40newsserv.zdv.uni-tuebingen.de Communication between Threads, without blocking] "


Wikimedia Foundation. 2010.

Игры ⚽ Поможем решить контрольную работу

Look at other dictionaries:

  • Non-blocking — may refer to: Non blocking I/O Non blocking synchronization This disambiguation page lists articles associated with the same title. If an internal link led you here, you may wish to change the link to point directly to the int …   Wikipedia

  • Non-blocking algorithm — In computer science, a non blocking algorithm ensures that threads competing for a shared resource do not have their execution indefinitely postponed by mutual exclusion. A non blocking algorithm is lock free if there is guaranteed system wide… …   Wikipedia

  • Synchronization (computer science) — In computer science, synchronization refers to one of two distinct but related concepts: synchronization of processes, and synchronization of data. Process synchronization refers to the idea that multiple processes are to join up or handshake at… …   Wikipedia

  • Blocking (scheduling) — In a multitasking computer system, individual tasks, or threads of execution, must share the resources of the system. These resources might be:* the CPU * network * memory * diskor any other shared resource.When one task is using a resource, it… …   Wikipedia

  • Non-Uniform Memory Access — (NUMA) is a computer memory design used in Multiprocessing, where the memory access time depends on the memory location relative to a processor. Under NUMA, a processor can access its own local memory faster than non local memory, that is, memory …   Wikipedia

  • Lock-free and wait-free algorithms — In contrast to algorithms that protect access to shared data with locks, lock free and wait free algorithms are specially designed to allow multiple threads to read and write shared data concurrently without corrupting it. Lock free refers to the …   Wikipedia

  • Read-copy-update — (RCU) is an operating system kernel technology for improving performance on computers with more than one CPU.More technically it is a synchronization mechanism which can sometimes be used as an alternative to a readers writer lock. It allows… …   Wikipedia

  • Lock (computer science) — In computer science, a lock is a synchronization mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution. Locks are one way of enforcing concurrency control policies. Contents 1 Types 2… …   Wikipedia

  • Type Stable Memory Management — refers to the concept of maintaining type between memory allocation and deallocation. This idea is extremely useful in constructing non blocking synchronization algorithms. The below image describes an algorithm detailing the use of type stable… …   Wikipedia

  • Embedded system — Picture of the internals of an ADSL modem/router. A modern example of an embedded system. Labelled parts include a microprocessor (4), RAM (6), and flash memory (7). An embedded system is a computer system designed to do one or a few dedicated… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”