Multi-User Systems and Concurrency

by Richard Carr, published at http://www.blackwasp.co.uk/OptimisticPessimistic.aspx

In a single-user application information can be read, created, updated or deleted without the risk of interfering with other people's work. When more than one user has the ability to access shared information, concurrency control becomes essential.

Concurrency Issues

When developing multi-user systems it is important to consider the desired outcome when two or more users or services simultaneously try to access the same data. The most serious problems occur when two users try to modify an item at the same the same time or when one user is editing information whilst others read the original data.

As an example, consider a database that contains all of a company's customer information. If two users simultaneously open the same customer record, make changes to the information and then save their modifications, several data integrity issues are possible. We can see these by looking at the timeline of the two concurrent transactions:

User A	User B	Problem	Stored Customer Data
Opens the customer record.			"Janet Smith" "Silver Level"
Changes the customer's surname to recognise their marriage. The new surname is "White".
	Opens the customer record.	User B sees the name "Janet Smith", not "Janet White".
	Changes the customer level to "Gold".	User A is unaware of the change to Gold status. At this point, both users have the customer information on screen but both sets of visible details are incorrect.
Saves the customer.		User B is unaware that the stored data in the customer database has been changed.	"Janet White" "Silver Level"
	Saves the Customer.	User B's action has caused the revised surname to be overwritten. Both users are unaware of the lost information.	"Janet Smith" "Gold Level"

Concurrency Control

The problems described above are some of those that must be prevented to maintain the integrity of a system's data in a multi-user system. These issues are generated because it is necessary to make temporary, local copies of information from the official data store; in this case the customer database. These local copies are essential to permit data to be modified but cause more up-to-date information to exist outside of the official data than within.

A concurrency control system provides safe access to shared information, removing the risk of collisions that cause data integrity problems or data loss such as that described above. This is usually achieved by either preventative locking, where access to data being edited by another user is limited or removed, or by the detection of collisions, where problematic changes are disallowed or require correction.

Locking Strategies

In this article we will discuss two types of concurrency control system. These are optimistic locking and pessimistic locking. This will be the first article in a series of three. The two following articles will provide working examples of the locking strategies using SQL Server tables and stored procedures.

Optimistic Locking

Optimistic locking is so named because it optimistically assumes that there will be few collisions when data is modified. This strategy does not prevent users from accessing information that is in the process of being updated by other users or processes. However, it does prevent destructive changes from being made without the user's knowledge by detecting changes made by another user that have occurred during the time of editing. The process for modifying data with an optimistic concurrency control system is as follows:

The user reads the original information from the data store.
The user makes changes to their local copy. The original information is remembered.
The user attempts to save their changes. Before the change is committed, the existing official data is compared with the originally loaded information. If the two match, the save is permitted. If the data has already been modified, the change is rejected.

In step 2, it is common to remember a signature for the originally retrieved information, rather than all of the data. This is often a version number that is updated with every revision. If the version number changes between the times of loading and saving, this indicates that a collision has occurred.

When a collision is detected, there are several possible courses of action. The simplest is to reject the new updates entirely and require the user to repeat the process. More complex solutions allow the new data to be saved but audit the collision and the data that existed before and after the change. The best solutions can merge the new data with the existing information, either automatically or with the user's assistance.

Optimistic locking is most suited to systems where collisions are unlikely because editing of data is limited, or where information is partitioned in such a way that each user usually edits a distinct set of items. It is the only viable solution that we will consider for disconnected systems, which rely on occasional synchronisation.

Advantages of Optimistic Locking

The key advantages of optimistic locking systems are:

The risk of inadvertently overwriting other users' changes is removed.
Optimistic locking can be used on rarely connected systems, as no locking information needs to be held centrally.
There is little central overhead to the locking system. This leads to only a small performance impact whilst allowing higher concurrency and scalability.

Disadvantages of Optimistic Locking

The key disadvantages of optimistic locking systems are:

The possibility exists that users will be disappointed when they attempt to save their changes. In the worst situations, a user may be informed that the official data changed whilst they were editing a record and that their changes were cancelled. This increases the user's workload and decreases satisfaction with the software.
As the duration required to edit an item increases, the possibility that a collision will occur also increases. This can occur when individual items are complex and take a long time to update, or for disconnected systems where local changes are committed rarely.
No details of data being edited by one user are available to the other users. This prevents users from delaying changes until the information they require becomes available for editing and increases the chance of rejections.

Next: Page 2

11 March 2010