Snapshot isolation

Snapshot isolation

In databases, snapshot isolation is a isolation mode into Microsoft SQL Server 2005. It is similar to Multiversion concurrency control (MVCC). It guarantees that all reads made in a transaction will see a consistent snapshot of the database, and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot.

Multiversion concurrency control has been adopted by several major database management systems, such as SQL Anywhere, InterBase, Firebird, Oracle, PostgreSQL and Microsoft SQL Server. The main reason for its adoption is that it allows better performance than serializability, yet still avoids the kind of concurrency anomalies that cannot easily be worked around. SI has also been usedCitation | contribution=A Critique of ANSI SQL Isolation Levels
title=Proceedings of the 1995 ACM SIGMOD international Conference on Management of Data
first1=Hal | first2=Phil | first3=Jim | first4=Jim | first5=Elizabeth | first6=Patrick
last1=Berenson | last2=Bernstein | last3=Gray | last4=Melton | last5=O'Neil | last6=O'Neil
accessdate=2006-04-03 | doi=10.1145/223784.223785
year=1995 | pages=1-10
] to critique the ANSI SQL-92 standard's definition of isolation levels, as it exhibits none of the "anomalies" that the SQL standard prohibited, yet is not serializable (the anomaly-free isolation level defined by ANSI).

Snapshot isolation is similar to "serializable" mode by Oracle and PostgreSQL. There are arguments both for and against this decision; what is clear is that users must be aware of the distinction to avoid possible anomalous behaviour in their database logic.

Definition

A transaction executing under snapshot isolation performs all reads and updates as if operating on a personal "snapshot" of the database, taken at the start of the transaction. Thus, any update performed during the transaction will be seen by subsequent reads within that same transaction, but no concurrent updates will be observed. When the transaction is completed, it will commit only if its updates do not conflict with any other transaction that has committed since the snapshot was taken. Such a write-write conflict will cause the transaction to abort.

If built on multiversion concurrency control (MCC), snapshot isolation allows transactions to proceed without worrying about concurrent operations, and more importantly without needing to re-verify all read operations when the transaction finally commits. The only information that must be stored during the transaction is a list of updates made, which can be scanned for conflicts fairly easily before being committed.

As a concrete example, imagine a bank storing two balances, X and Y, for two accounts held by a single person, Phil. The bank will allow X or Y to run a deficit, provided that the total held in both is never negative (i.e., X + Y ≥ 0 must hold). Suppose both X and Y start at $100. Now imagine Phil initiates two transactions concurrently, T1 withdrawing $200 from X, and T2 withdrawing $200 from Y.

If the database guaranteed serializable transactions, the simplest way of coding T1 is to deduct $200 from X, and then verify that X + Y ≥ 0 still holds, aborting if not. T2 similarly deducts $200 from Y and then verifies X + Y ≥ 0. Since the transactions must serialize, either T1 happens first, leaving X = -$100, Y = $100, and preventing T2 from succeeding (since X + (Y - $200) is now -$200), or T2 happens first and similarly prevents T1 from committing.

Under snapshot isolation, however, both T1 and T2 can operate on private snapshots of the database: each deducts $200 from an account, and then verifies that the new total is zero, using the other account value that held when the snapshot was taken. Since neither "update" conflicts, both commit successfully, leaving X = Y = -$100, and X + Y = -$200! This non-serializable anomaly is known as "write skew".

Making Snapshot Isolation Serializable

Fekete "et al." Citation | title=Making Snapshot Isolation Serializable | journal=ACM Transactions on Database Systems
first1=Alan | first2=Dimitrios | first3=Elizabeth | first4=Patrick | first5=Dennis
last1=Fekete | last2=Liarokapis | last3=O'Neil | last4=O'Neil | last5=Shasha
doi=10.1145/1071610.1071615 | accessdate=2006-04-03
volume=30 | issue=2 | pages=492-528 | issn=0362-5915 | year=2005
] have shown that potential inconsistency problems arising from write skew anomalies can be fixed by adding (otherwise unnecessary) updates to the transactions.

* Materialize the conflict: Add a special conflict table, which both transactions update in order to create a direct write-write conflict.
* Promotion: Have one transaction "update" a read-only location (replacing a value with the same value) in order to create a direct write-write conflict (or use an equivalent promotion, e.g. Oracle's SELECT FOR UPDATE).

In the example above, we can materialize the conflict by adding a new table which makes the hidden constraint explicit, mapping each person to their "total balance". Phil would start off with a total balance of $200, and each transaction would attempt to subtract $200 from this, creating a write-write conflict that would prevent the two from succeeding concurrently. This approach violates the normal form.

Alternatively, we can promote one of the transaction's reads to a write. For instance, T2 could set X = X, creating an artificial write-write conflict with T1 and, again, preventing the two from succeeding concurrently. This solution may not always be possible.

In general, therefore, snapshot isolation puts some of the problem of maintaining non-trivial constraints onto the user, who may not appreciate either the potential pitfalls or the possible solutions. The upside to this transfer is better performance.

History

Snapshot isolation arose from work on multiversion concurrency control (MCC) databases, where multiple versions of the database are maintained concurrently to allow readers to execute without colliding with writers. Such a system allows a natural definition and implementation of such an isolation level. Borland's MCC database, InterBase 4, provided SI as far back as 1994.

Unfortunately, the ANSI SQL-92 standard was written with a lock-based database in mind, and hence is rather vague when applied to MCC systems. Berenson "et al." wrote a paper in 1995 critiquing the SQL standard, and cited snapshot isolation as an example of an isolation level that did not exhibit the standard anomalies described in the ANSI SQL-92 standard, yet still had anomalous behaviour when compared with serializable transactions.

* WRITE SKEW anomalies: Two transactions read a value each, then change the other's value, but neither sees the result of the other's update (e.g. T1 and T2 in the example above).

ANSI's "REPEATABLE READ" isolation level allows phantom reads, but prevents write skew. In contrast, snapshot isolation allows write skew, but prevents phantom reads. Serializable transactions allow neither.

References


Wikimedia Foundation. 2010.

Игры ⚽ Нужно решить контрольную?

Look at other dictionaries:

  • Isolation (database systems) — In database systems, isolation is a property that defines how/when the changes made by one operation become visible to other concurrent operations. Isolation is one of the ACID (Atomicity, Consistency, Isolation, Durability) properties.Isolation… …   Wikipedia

  • Snapshot (computer storage) — In computer systems, a snapshot is the state of a system at a particular point in time. The term was coined as an analogy to that in photography. It can refer to an actual copy of the state of a system or to a capability provided by certain… …   Wikipedia

  • Serializability — In concurrency control of databases,[1][2] transaction processing (transaction management), and various transactional applications (e.g., transactional memory[3] and software transactional memory), both centralized and distributed, a transaction… …   Wikipedia

  • Commitment ordering — In concurrency control of databases, transaction processing (transaction management), and related applications, Commitment ordering (or Commit ordering; CO; (Raz 1990, 1992, 1994, 2009)) is a class of interoperable Serializability techniques …   Wikipedia

  • Multiversion concurrency control — (abbreviated MCC or MVCC), in the database field of computer science, is a concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement… …   Wikipedia

  • NoSQL — This article is about the class of database management systems. For the specific relational database management software, see NoSQL (RDBMS). Structured storage redirects here. It is not to be confused with COM Structured Storage. In computing,… …   Wikipedia

  • InterBase — Infobox Software name = InterBase caption = InterBase s performance monitor allows database administrators to view and control server use. developer = CodeGear latest release version = 2009 latest release date = 08 September, 2008 operating… …   Wikipedia

  • ACID — For other uses, see Acid (disambiguation). In computer science, ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee database transactions are processed reliably. In the context of databases, a single logical …   Wikipedia

  • PostgreSQL — Developer(s) PostgreSQL Global Development Group Stable release 9.1.1[1] / 9.0.5 …   Wikipedia

  • Concurrency control — In information technology and computer science, especially in the fields of computer programming (see also concurrent programming, parallel programming), operating systems (see also parallel computing), multiprocessors, and databases, concurrency …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”