Blog

The problem of concurrent access to data

April 6, 2013

Imagine that your application manages a store, and that you have a model that represents the number of items in stock for each product:

product_stock.rb

class ProductStock < ActiveRecord::Base
  belongs_to :product
  validates :product, :presence => true
  validates :stock_size,
    :presence => true,
    :numericality => { :only_integer => true, :greater_than_or_equal_to => 0 }
end

When a clerk sells a product item, your application will typically call a method like this on the corresponding ProductStock model:

def decrement(number_of_items = 1)
  self.stock_size = stock_size - number_of_items
  save
end

Now what if two clerks sell the same product at the same time? This situation might seem very unlikely but, if there are a lot of clerks and your product is a real bestseller, it can happen sooner than you think (imagine an Apple Store on the release date of a new iPhone). Unless you limit the server to handling only one request at a time (probably not a very practical decision), the two requests will be handled by concurrent threads. One possible sequence of events is the following:

Thread 1 loads the model from the database, with a stock_size value of, say, 10.
Thread 2 loads the model from the database, also with a stock_size value of 10.
Thread 1 decreases the stock_size value by one. On its own copy of the model, this value is now 9.
Thread 2 decreases the stock_size value by one. Again, on its own copy of the model, this value is now 9.
Thread 1 saves its modified version of the model.
Thread 2 saves its modified version of the model.
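
The sequence above can be replayed deterministically without a database. The following is only a sketch: the DB hash, the StockCopy class and the simulate_lost_update method are hypothetical stand-ins for the table, the loaded model and the two requests, not part of the application.

```ruby
# Hypothetical in-memory stand-in for the product_stocks row (no database involved).
DB = { :stock_size => 10 }

# Each instance is a private copy of the row, like a model loaded by one thread.
class StockCopy
  attr_accessor :stock_size

  def initialize
    @stock_size = DB[:stock_size]    # "load" the current value
  end

  def decrement(number_of_items = 1)
    self.stock_size = stock_size - number_of_items
  end

  def save
    DB[:stock_size] = stock_size     # write back whatever this copy holds
  end
end

# Replays the six steps in order and returns the value left in the "database".
def simulate_lost_update
  DB[:stock_size] = 10
  copy1 = StockCopy.new   # thread 1 loads 10
  copy2 = StockCopy.new   # thread 2 loads 10
  copy1.decrement         # thread 1 now holds 9
  copy2.decrement         # thread 2 now holds 9 as well
  copy1.save
  copy2.save              # overwrites thread 1's write with the same value
  DB[:stock_size]
end

simulate_lost_update  # returns 9, although two items were sold
```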

What you end up with is an incorrect inventory: your application now says there are 9 items in stock, while obviously there are only 8 on the shelves. The problem is that the second thread never knew that the value was changed by the first. This is a typical example of what is known as a race condition.

Simply calling reload at the beginning of your method won’t make the problem disappear: thread 1 could still save the new value right after thread 2 has reloaded its copy of the model (you can easily simulate this by calling sleep(30.seconds) just after the call to reload and playing with two parallel Rails consoles). What you really need is a way to prevent outdated data from being written to the database, which means implementing a locking strategy. Fortunately for you, Rails makes it really easy to use the two best-known strategies: pessimistic locking and optimistic locking.

Pessimistic locking

The idea of pessimistic locking is to prevent more than one process from accessing a record in the database at the same time: when a process wants to load an object in order to modify it, it puts a lock on the corresponding record[1], forcing any other process to wait for this lock to be released before it can load the record. The purpose is thus to bring atomicity to a series of operations.

In ActiveRecord, when you are inside a transaction, you can load models with the :lock => true option or call lock! on an already loaded model to put a lock on the corresponding record (if you are not inside a transaction, the lock is released as soon as it is acquired). You can also start a transaction and acquire the lock in one go by calling with_lock with a block:

def decrement(number_of_items = 1)
  with_lock do
    self.stock_size = stock_size - number_of_items
    save
  end
end

Note that placing a lock on a model will automatically force it to be reloaded.
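
To see why serializing the read and the write fixes the race, here is a database-free analogy, offered as a sketch only. It uses Ruby's Mutex in place of a row lock; the LockedStock class, its with_lock method and sell_concurrently are hypothetical names for this illustration and are not ActiveRecord's implementation.

```ruby
# Hypothetical in-memory analogue of a locked row: a Mutex plays the role of the row lock.
class LockedStock
  attr_reader :stock_size

  def initialize(stock_size)
    @stock_size = stock_size
    @lock = Mutex.new
  end

  # Only one thread at a time may run the block; the others wait for the lock.
  def with_lock
    @lock.synchronize { yield }
  end

  def decrement(number_of_items = 1)
    with_lock do
      current = @stock_size    # the read happens after the lock is acquired
      sleep(0.001)             # widen the window in which a race could occur
      @stock_size = current - number_of_items
    end
  end
end

# Lets `clerks` threads sell one item each and returns the remaining stock.
def sell_concurrently(initial_stock, clerks)
  stock = LockedStock.new(initial_stock)
  threads = clerks.times.map { Thread.new { stock.decrement } }
  threads.each(&:join)
  stock.stock_size
end

sell_concurrently(10, 2)  # returns 8: no sale is lost
```

Because the value is read only once the lock is held, the second thread always sees the first thread's write, which mirrors the reload that happens when a record is locked.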

This strategy is not without its problems, however. A process can acquire a lock and keep it for as long as it wants, or even never release it, forcing all the other processes that need access to the locked record to wait indefinitely (this is called starvation). Another potential problem is deadlock: process A locks record 1, then tries to lock record 2, but record 2 has already been locked by process B, which now needs to lock record 1 to complete. Neither process can complete, each waiting for the record the other has locked.

Optimistic locking

In optimistic locking, a version number is assigned to each row. When a model is updated, its version number is checked against the one in the database. If they are the same, the changes are committed and the version number of the row is incremented (within the same atomic operation); if not, that means another process has updated the row since you loaded the model, and the update fails. In this case, you need to reload the model and try again.
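
The version check can be illustrated with a small in-memory sketch. Everything here is hypothetical and only mimics the idea: VersionedRow stands in for a database row, StaleRowError for the error raised on a conflict, and the Mutex for the atomicity the database gives to a single UPDATE statement.

```ruby
# Hypothetical error for this sketch; it is not ActiveRecord's error class.
class StaleRowError < StandardError; end

# Hypothetical in-memory stand-in for one row with a version column.
class VersionedRow
  attr_reader :stock_size, :lock_version

  def initialize(stock_size)
    @stock_size = stock_size
    @lock_version = 0
    @guard = Mutex.new
  end

  # Returns a snapshot of the row, like loading a model.
  def load
    @guard.synchronize do
      snapshot = { :stock_size => @stock_size, :lock_version => @lock_version }
      snapshot
    end
  end

  # Writes only if the caller's version still matches the stored one.
  # The comparison and the increment happen as one atomic step.
  def update(new_stock_size, expected_version)
    @guard.synchronize do
      unless expected_version == @lock_version
        raise StaleRowError, "row changed since it was loaded"
      end
      @stock_size = new_stock_size
      @lock_version += 1
    end
    true
  end
end

row = VersionedRow.new(10)
first = row.load
second = row.load
row.update(first[:stock_size] - 1, first[:lock_version])   # succeeds, version becomes 1
begin
  row.update(second[:stock_size] - 1, second[:lock_version])
rescue StaleRowError
  fresh = row.load                                          # reload and try again
  row.update(fresh[:stock_size] - 1, fresh[:lock_version])
end
```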

To enable optimistic locking in Rails, you only need to add a “lock_version” column on your table:

class AddLockVersionToInventory < ActiveRecord::Migration
  def change
    add_column :product_stocks, :lock_version, :integer, :null => false, :default => 0
  end
end

If your record is outdated, ActiveRecord will raise an ActiveRecord::StaleObjectError; it is then your responsibility to deal with the conflict. Note also that for optimistic locking to work across all web requests, you should add lock_version as a hidden field to your form, and to the list of attr_accessible attributes (or to the filtered set of attributes for strong_parameters).
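
One common way to deal with the conflict is to reload and retry a bounded number of times. The helper below is a sketch under that assumption: retry_on_conflict is a hypothetical name, not a Rails method, and the error class is passed in so that the snippet runs without ActiveRecord.

```ruby
# Hypothetical helper: runs the block and retries it when `error_class` is raised,
# giving up (and re-raising the error) after `max_attempts` tries.
def retry_on_conflict(error_class, max_attempts = 3)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue error_class
    retry if attempts < max_attempts
    raise
  end
end

# In a Rails application the call might look like this (product_stock is assumed
# to be a loaded ProductStock with a lock_version column):
#
#   retry_on_conflict(ActiveRecord::StaleObjectError) do
#     product_stock.reload
#     product_stock.decrement
#   end
```

Retrying silently suits a counter like stock_size; for form edits made by a person, showing the conflict to the user is often the better choice.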

The main drawback of optimistic locking is that it can cause a lot of updates to fail if the same record is often accessed concurrently (which could well be the case in our example), and repeated failures can be quite tedious from the end user's point of view.

Notes

[1] Note that some RDBMS put locks on the whole table, not on the records.
