
SQL Server Transaction Isolation Levels

Permitting concurrent transactions in database management systems is essential for performance but introduces the possibility of reading inconsistent data. To balance concurrency problems and performance, SQL Server has four transaction isolation levels.

Serializability

It is very rare for developers working with SQL Server to use databases that are only ever accessed by a single user who never needs to run more than one operation at a time. The majority of SQL Server databases will have multiple users and those users may execute several transactions simultaneously. This presents the possibility of concurrency problems. What should happen if two or more transactions access the same data at the same time?

The answer to concurrency problems is provided by locking. If one ongoing transaction makes changes to a set of data, this data may be locked. Whilst locked, other transactions may be prevented from seeing the updated information. They may be blocked, meaning that they stop executing until the locks are released. Locking in this manner means that transactions can safely read and update data without impacting each other. However, the introduction of blocking can lower the potential concurrency level, slow the overall performance of the database and introduce the possibility of deadlocks, where several transactions block each other and, unless terminated, will remain blocked indefinitely.
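
As a simple illustration of blocking, imagine two connections working with a hypothetical "People" table (the table and column names are assumed purely for this sketch):

-- Connection 1: update a row inside a transaction; an exclusive lock is taken.
BEGIN TRANSACTION;
UPDATE People SET Surname = 'Brown' WHERE PersonId = 1;
-- The transaction is deliberately left open, so the lock is still held.

-- Connection 2: under the default isolation level this read is blocked until
-- connection 1 commits or rolls back and releases its lock.
SELECT Surname FROM People WHERE PersonId = 1;

-- Connection 1: committing releases the lock and lets connection 2 continue.
COMMIT TRANSACTION;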

The ideal for ensuring consistency of information is to achieve serializability. This means that if a set of transactions is executed concurrently, the final outcome is exactly the same as if they were run sequentially without overlap. An overly simplistic method of achieving serializability would be to only permit one transaction to run at any one time. However, this amounts to queuing the transactions and would ensure very poor performance and concurrency support. Another option would be to lock the data that a transaction uses, allowing transactions that work with different rows or tables to execute in parallel without the risk of impacting serializability. Again, this is too simplistic for a highly concurrent database management system (DBMS), such as SQL Server.

Concurrency Problems

SQL Server's model for dealing with concurrency issues allows you to create a balance between serializability, performance and concurrency. In some cases it is essential that no concurrency problems can be introduced to a transaction. In other situations some problems may be acceptable in order to achieve the speed of execution and the number of active processes that you need. You specify which data inconsistencies you are willing to accept by setting a transaction's isolation level. You may be surprised to know that SQL Server's default isolation level allows many concurrency problems to occur.
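
A transaction's isolation level is chosen per connection using the SET TRANSACTION ISOLATION LEVEL statement before the work begins. As a minimal sketch, requesting the strictest level looks like this:

-- Request the strictest isolation level for the current connection.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
-- Queries and updates here run with serializable guarantees.
COMMIT TRANSACTION;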

We'll see how the transaction isolation levels work a little later on. First, let's consider the main concurrency problems that you can encounter.

Dirty Reads

A dirty read happens when one transaction is permitted to see information that has been updated or inserted by a second transaction that has not yet been committed. In many cases this would not cause a problem. However, if the second transaction is rolled back after the first reads the data, the first transaction can act upon information that does not exist. Consider the following scenario, which shows a series of steps executed by two transactions; steps lower in the list occur at a later time than those above.

Transaction 2: Inserts a new row for "Tim" into the "People" table.
Transaction 1: Reads all rows from the "People" table, including the row for "Tim".
Transaction 2: Rolls back the transaction, deleting "Tim" from the table.
Transaction 1: Inserts references to all read rows into a table with a foreign key relationship with "People". The insert for "Tim" fails because the row does not exist.
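
The same sequence can be sketched with two connections, the reading transaction running at the READ UNCOMMITTED level (the "People" table and its "Forename" column are assumed for illustration):

-- Connection 2 (transaction 2): insert a row but do not commit yet.
BEGIN TRANSACTION;
INSERT INTO People (Forename) VALUES ('Tim');

-- Connection 1 (transaction 1): READ UNCOMMITTED permits dirty reads, so the
-- uncommitted "Tim" row is returned here.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT Forename FROM People;

-- Connection 2 (transaction 2): rolling back removes "Tim". Any later statement
-- in transaction 1 that references the "Tim" row will now fail.
ROLLBACK TRANSACTION;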

Non-Repeatable Reads

If a database were truly serializable, a transaction that executed the same query twice, without updating the data in the meantime, would expect to see the same results for each of the queries. When running concurrent transactions, you can encounter a problem called a non-repeatable read. This is where the DBMS allows one transaction to run a query and a second transaction to modify the data that the first retrieved. When the first transaction executes the same query for a second time, the results are different.

Transaction 1: Reads the data from the "People" table, including the "Lisa Green" row.
Transaction 2: Updates the row for "Lisa Green", changing her name to her married name, "Lisa Brown".
Transaction 1: Reads the data from the "People" table, including the "Lisa Brown" row. This is different data to that which was seen in the first query.
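
This sequence is possible under the default READ COMMITTED level because shared locks are released as soon as each read completes. A sketch, again assuming the "People" table:

-- Connection 1 (transaction 1): the first read returns "Lisa Green".
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;
SELECT Forename, Surname FROM People;

-- Connection 2 (transaction 2): updates the row and commits between the reads.
UPDATE People SET Surname = 'Brown' WHERE Forename = 'Lisa' AND Surname = 'Green';

-- Connection 1 (transaction 1): the second read returns "Lisa Brown", so the first
-- read was not repeatable. REPEATABLE READ or SERIALIZABLE would have held the
-- shared locks and blocked the update instead.
SELECT Forename, Surname FROM People;
COMMIT TRANSACTION;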

Phantom Reads

Phantom reads are subtly different from non-repeatable reads. A non-repeatable read occurs when one or more of the rows from the first query have been updated. A phantom read occurs when a query executed twice returns a different set of rows. This can happen because a second transaction inserts a new row that matches the WHERE clause of the query executed by the first transaction. The distinction is important because of the way in which the concurrency problem can be counteracted. To prevent a non-repeatable read, the DBMS must lock all of the rows that were read. To thwart phantom reads, the DBMS must also prevent the insertion of new rows that would satisfy the first transaction's query. Depending upon the strategy employed, it is possible to encounter phantom reads whilst non-repeatable reads are prevented.

Transaction 1: Reads the data from the "People" table.
Transaction 2: Inserts a new row for "Tim" into the "People" table.
Transaction 1: Reads the data from the "People" table, including the "Tim" row. This is different data to that which was seen in the first query.
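
A similar sketch shows why locking only the rows already read is not enough: at the REPEATABLE READ level the existing rows cannot change, but a newly inserted row still appears as a phantom, whereas SERIALIZABLE would block the insert with a range lock. The "People" table is again assumed:

-- Connection 1 (transaction 1): the first read locks the rows it returns.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT Forename FROM People;

-- Connection 2 (transaction 2): inserts a new row that the query would match.
INSERT INTO People (Forename) VALUES ('Tim');

-- Connection 1 (transaction 1): the second read now includes the phantom "Tim"
-- row, even though no previously read row was modified.
SELECT Forename FROM People;
COMMIT TRANSACTION;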