Locking, Firebird, and the Lock Table

Ann. W. Harrison, aharrison@ibphoenix.com

One of the best-known facts about Firebird is that it uses multi-version concurrency control instead of record locking. For databases that manage concurrency though record locks, understanding mechanisms of record locking is critical to designing a high performance, high concurrency application. In general, record locking is irrelevant to Firebird. However, Firebird uses locks to maintain internal consistency and for interprocess coordination. Understanding how Firebird does (and does not) use locks helps with system tuning and anticipating how Firebird will behave under load.

I’m going to start by describing locking abstractly, then Firebird locking more specifically, then the controls you have over the Firebird lock table and how they can affect your database performance. So, if you just want to learn about tuning, skip to the end of the article.

Concurrency control

Concurrency control in database systems has three basic functions: preventing concurrent transactions from overwriting each others’ changes, preventing readers from seeing uncommitted changes, and giving a consistent view of the database to running transactions. Those three functions can be implemented in various ways. The simplest is to serialize transactions – allowing each transaction exclusive access to the database until it finishes. That solution is neither interesting nor desirable

A common solution is to allow each transaction to lock the data it uses, keeping other transactions from reading data it changes and from changing data it reads. Modern databases generally lock records. Firebird provides concurrency control without record locks by keeping multiple versions of records, each marked with the identifier of the transaction that created it. Concurrent transactions don’t overwrite each other’s changes, because the system will not allow a transaction to change a record if the most recent version was created by a concurrent transaction. Readers never see uncommitted data because the system will not return record versions created by concurrent transactions. Readers see a consistent version of the database because the system returns only the data committed when they start – allowing other transactions to create newer versions. Readers don’t block writers.

Locks in database systems

A lock sounds like a very solid object, but in database systems, a lock anything but solid. “Locking” is a shorthand description of a protocol for reserving resources. Databases use locks to maintain consistency while allowing concurrent independent transactions to update distinct parts of the database. Each transaction reserves the resources – tables, records, data pages – that it needs to do its work. Typically, the reservation is made in memory to control a resource on disk, so the cost of reserving the resource is not significant compared with the cost of reading or changing the resource.

Locking example

“Resource” is a very abstract term. Lets start by talking about locking tables. Firebird does lock tables, but normally it locks them only to prevent catastrophes like having one transaction drop a table that another transaction is using.

Before a Firebird transaction can access a table, it must get a lock on the table. The lock prevents other transactions from dropping the table while it is in use. When a transaction gets a lock on a table, Firebird makes an entry in its table of locks, indicating the identity of the transaction that has the lock, the identity of the table being locked, and a function to call if another transaction wants an incompatible lock on the table. The normal table lock is shared – other transactions can lock the same table in the same way.

Before a transaction can delete a table, it must get an exclusive lock on the table. An exclusive lock is incompatible with other locks. If any transaction has a lock on the table, the request for an exclusive lock is denied, the drop table statement fails. Returning an immediate error is one way of dealing with conflicting lock requests. The other is to put the conflicting request on a list of unfulfilled requests and let it wait for the resource to become available

In its simplest form, that is how locking works. All transactions follow the formal protocol of requesting a lock on a resource before using it. Firebird maintains a list of locked resources, a list of requests for locks on resources – satisfied or waiting – and a list of the owners of lock requests. When a transaction requests a lock that is incompatible with existing locks on a resource, Firebird either denies the new request, or puts it on a list to wait until the resource is available. Internal lock requests specify whether they wait or receive an immediate error on a case-by-case basis. When a transaction starts, it specifies whether it will wait for locks that it acquires on tables, etc.

Lock modes

For concurrency and read committed transactions, Firebird locks tables for shared read or shared write. Either mode says, “I’m using this table, but you are free to use it too.” Consistency mode transactions follow different rules. They lock tables for protected read or protected write. Those modes say “I’m using the table and no one else is allowed to change it until I’m done.” Protected read is compatible with shared read and other protected read transactions. Protected write is only compatible with share read.

The important concept about lock modes is that locks are more subtle than mutexes – locks allow resource sharing, as well as protecting resources from incompatible use.

Two-phase locking vs. transient locking

The table locks that we’ve been describing follow a protocol known as two-phase locking, which is typical of locks taken by transactions in database systems. Databases that use record locking for consistency control always use two-phase record locks. In two-phase locking, a transaction acquires locks as it proceeds and holds the locks until it ends. Once it releases any lock, it can no longer acquire another. The two phases are lock acquisition and lock release. They cannot overlap.

When a Firebird transaction reads a table, it holds a lock on that table until it ends. When a concurrency transaction has acquired a shared write lock to update a table, no consistency mode transaction will be able to get a protected lock on that table until the transaction with the shared write lock ends and releases its locks. Table locking in Firebird is two-phase locking.

Locks can also be transient, taken and released as necessary during the running of a transaction. Firebird uses transient locking extensively to manage physical access to the database.

Firebird page locks

One major difference between Firebird and most other databases is Firebird’s Classic mode. In Classic mode, many separate processes share write access to a single database file. Most databases have a single server process like SuperServer that has exclusive access to the database and coordinates physical access to the file within itself. Firebird coordinates physical access to the database through locks on database pages.

In general database theory, a transaction is a set of steps that transform the database from on consistent state to another. During that transformation, the resources held by the transaction must be protected from incompatible changes by other transactions. Two-phase locks are that protection.

In Firebird, internally, each time a transaction changes a page, it changes that page – and the physical structure of the database as a whole – from one consistent state to another. Before a transaction reads or writes a database page, it locks the page. When it finishes reading or writing, it can release the lock without compromising the physical consistency of the database file. Firebird page level locking is transient. Transactions acquire and release page locks throughout their existence. However, to prevent deadlocks, transactions must be able to release all the page locks it holds before acquiring a lock on a new page.

The Firebird lock table

When all access to a database is done in a single process – as is the case with most database systems – locks are held in the server’s memory and the lock table is largely invisible. The server process extends or remaps the lock information as required. Firebird, however, manages its locks in a shared memory section. In SuperServer, only the server uses that share memory area. In Classic, every database connection maps the shared memory and every connection can read and change the contents of the memory.

The lock table is a separate piece of share memory. In SuperServer, the lock table is mapped into the server process. In Classic, each process maps the lock table. All databases on a server computer share the same lock table, except those running with the embedded server.

The Firebird lock manager

We often talk about the Firebird Lock Manager as if it were a separate process, but it isn’t. The lock management code is part of the engine, just like the optimizer, parser, and expression evaluator. There is a formal interface to the lock management code, which is similar to the formal interface to the distributed lock manager that was part of VAX/VMS and one of the interfaces to the Distributed Lock Manager from IBM.

The lock manager is code in the engine. In Classic, each process has its own lock manager. When a Classic process requests or releases a lock, its lock management code acquires a mutex on the shared memory section and changes the state of the lock table to reflect its request.

Conflicting lock requests

When a request is made for a lock on a resource that is already locked in an incompatible mode, one of two things happens. Either the requesting transaction gets an immediate error, or the request is put on a list of waiting requests and the transactions that hold conflicting locks on the resource are notified of the conflicting request. Part of every lock request is the address of a routine to call when the lock interferes with another request for a lock on the same object. Depending on the resource, the routine may cause the lock to be released or require the new request to wait.

Transient locks like the locks on database pages are released immediately. When a transaction requests a page lock and that page is already locked in an incompatible mode, the transaction or transactions that hold the lock are notified and must complete what they are doing and release their locks immediately. Two-phase locks like table locks are held until the transaction that owns the lock completes. When the conflicting lock is released, and the new lock is granted, then transaction that had been waiting can proceed.

Locks as interprocess communication

Lock management requires a high speed, completely reliable communication mechanism between transactions, including transactions in different processes. The actual mechanism varies from platform to platform, but for the database to work the mechanism must be fast and reliable. A fast, reliable interprocess communication mechanism can be – and is – useful for a number of purposes outside the area that’s normally considered database locking.

For example, Firebird uses the lock table to notify running transactions of the existence of a new index on a table. That’s important, since as soon as an index becomes active, every transaction must help maintain it – making new entries when it stores or modifies data, removing entries when it modifies or deletes data.

When a transaction first references a table, it gets a lock on the existence of indexes for the table. When another transaction wants to create a new index on that table, it must get an exclusive lock on the existence of indexes for the table. Its request conflicts with existing locks, and the owners of those locks are notified of the conflict. When those transactions are in a state where they can accept a new index, they release their locks, and immediately request new shared locks on the existence of indexes for the table. The transaction that wants to create the index gets its exclusive lock, creates the index, and commits, releasing its exclusive lock on the existence of indexing. As other transactions get their new locks, they check the index definitions for the table, find the new index definition, and begin maintaining the index.

Firebird locking summary

Although Firebird does not lock records, it uses locks extensively to isolate the effects of concurrent transactions. Locking and the lock table are more visible in Firebird than in other databases because the lock table is a central communication channel between the separate processes that access the database in Classic mode. In addition to controlling access to database objects like tables and data pages, the Firebird lock manager allows different transactions and processes to notify each other of changes to the state of the database, new indexes, etc.

Lock table specifics

The Firebird lock table is an in-memory data area that contains of four primary types of blocks. The lock header block describes the lock table as a whole and contains pointers to lists of other blocks and free blocks. Owner blocks describe the owners of lock requests – generally lock owners are transactions, connections, or the SuperServer. Request blocks describe the relationship between an owner and a lockable resource – whether the request is granted or pending, the mode of the request, etc. Lock blocks describe the resources being locked.

To request a lock, the owner finds the lock block, follows the linked list of requests for that lock, and adds its request at the end. If other owners must be notified of a conflicting request, they are located through the request blocks already in the list. Each owner block also has a list of its own requests. The performance critical part of locking is finding lock blocks. For that purpose, the lock table includes a hash table for access to lock blocks based on the name of the resource being locked.

A quick refresher on hashing

A hash table is an array with linked lists of duplicates and collisions hanging from it. The names of lockable objects are transformed by a function called the hash function into the offset of one of the elements of the array. When two names transform to the same offset, the result is a collision. When two locks have the same name, they are duplicates and always collide.

In the Firebird lock table, the array of the hash table contains the address of a hash block. Hash blocks contain the original name, a collision pointer, a duplicate pointer, and the address of the lock block that corresponds to the name. The collision pointer contains the address of a hash block whose name hashed to the same value. The duplicate pointer contains the address of a hash block that has exactly the same name.

A hash table is fast when there are relatively few collisions. With no collisions, finding a lock block involved hashing the name, indexing into the array, and reading the pointer from the first hash block. Each collision adds another pointer to follow and name to check. The ratio of the size of the array to the number of locks determines the number of collisions. Unfortunately, the width of the array cannot be adjusted dynamically because the size of the array is part of the hash function. Changing the width changes the result of the function and invalidates all existing entries in the hash table.

Adjusting the lock table to improve performance

The size of the hash table is set in the Firebird configuration file. You must shut down all activity on all databases that share the hash table – normally all databases on the machine – before changes take effect. The Classic architecture uses the lock table more heavily than SuperServer. If you choose the Classic architecture, you should check the load on the hash table periodically and increase the number of hash slots if the load is high. The symptom of an overloaded hash table is sluggish performance under load.

The tool for checking the lock table is fb_lock_print, which is a command line utility in the bin directory of the Firebird installation tree. The full lock print describes the entire state of the lock table and is of limited interest. When your system is under load and behaving badly, invoke the utility with no options or switches, directing the output to a file. Open the file with an editor. You’ll see output that starts something like this:

LOCK_HEADER BLOCK 
    Version: 114, Active owner:      0, Length: 262144, Used:  85740 
    Semmask: 0x0, Flags: 0x0001 
    Enqs:  18512, Converts:    490, Rejects:      0, Blocks:      0 
    Deadlock scans:      0, Deadlocks:      0, Scan interval:  10 
    Acquires:  21048, Acquire blocks:      0, Spin count:   0 
    Mutex wait: 10.3% 
    Hash slots:  101, Hash lengths (min/avg/max):    3/   15/   30 

The seventh and eighth lines suggest that the hash table is too small and that it is affecting system performance. In the example, these values indicate a problem:

Mutex wait: 10.3% 
    Hash slots:  101, Hash lengths (min/avg/max):    3/   15/   30 

In the Classic architecture, each process makes its own changes to the lock table. Only one process is allowed to update the lock table at any instant. When updating the lock table, a process holds the table’s mutex. A non-zero mutex wait indicates that processes are blocked by the mutex and forced to wait for access to the lock table. In turn, that indicates a performance problem inside the lock table, typically because looking up a lock is slow.

If the hash lengths are more than min 5, avg 10, or max 30, you need to increase the number of hash slots. The hash function used in Firebird is quick but not terribly efficient. It works best if the number of hash slots is prime.

Change this line in the configuration file:

#LockHashSlots = 101 

Uncomment the line by removing the leading #. Choose a value that is a prime number less than 2048.

LockHashSlots = 499 

The change will not take effect until all connections to all databases on the server machine shut down.

If you increase the number of hash slots, you should also increase the lock table size. The second line of the lock print

Version: 114, Active owner:      0, Length: 262144, Used:  85740 

tells you how close you are to running out of space in the lock table. The Version and Active owner are uninteresting. The length is the maximum size of the lock table. Used is the amount of space currently allocated for the various block types and hash table. If the amount used is anywhere near the total length, uncomment this parameter in the configuration file by removing the leading #, and increase the value

   #LockMemSize = 262144 

to this

LockMemSize = 1048576 

The value is bytes. The default lock table is about a quarter of a megabyte, which is insignificant on modern computers. Changing the lock table size will not take effect until all connections to all databases on the server machine shut down.

Leave a Reply

You must be logged in to post a comment.