Replicating and synchronizing Interbase/FireBird databases using CopyCat
PART 1 : BASICS OF DATABASE REPLICATION
A replicator is a tool for keeping several databases synchronized (either wholly or in part), on a continuous basis. Such a tool can have many applications: it can allow for off-line, local data editing, with a punctual synchronization upon reconnection to main database; it can also be used over a slow connection, as an alternative to a direct connection to the central database; another use would be to make an automatic, off-site, incremental backup, by using simple one-way replication.
Creating a replicator can be quite tricky. Let’s examine some of the key design issues involved in database replication, and explain how these issues are implemented in Microtec CopyCat, a set of Delphi / C++Builder components for performing replication between Interbase and FireBird databases.
Data logging
Before anything can be replicated, all changes to each database must of course be logged. CopyCat creates a log table and triggers for each table that is to be replicated. These triggers insert into the log table all the information concerning the record that was changed (table name, primary key value(s), etc).
Multi-node replication
Replicating to and from several nodes adds another degree of complexity. Every change that is made to one database must be applied to all the others. Furthermore, when one database applies this change, it must indicate to the originating database that the change has been applied, without in any way hindering the other databases from replicating the same change, either before, simultaneously, or after.
In CopyCat, these problems are solved using a simple and flexible system. Each replication node can be have one parent node, and several sub-nodes towards which it replicates its changes. Each node’s list of sub-nodes is stored in a table in the node’s database. (Incidentally, the parent node is configured in the replicator software itself rather than in the database, and therefore, no software is needed on nodes having no parent - which allows these servers to run Linux, or any other OS supported by Interbase/FireBird).
When a data change occurs in a replicated table, one line is generated per sub-node. Thus, each sub-node fetches only the log lines that concern it.
Two-way replication
One obvious difficulty involved in two-way replication is how to avoid changes that have been replicated to one database from replicating back to the original database. Since all the changes to the database are logged, the changes made by the replicator are also logged, and will therefore bounce back and forth between the source and the target databases. How can this problem be avoided?
The solution CopyCat uses is related to the sub-node management system described above. Each sub-node is assigned a name, which is used when the sub-node logs in to the database. When a sub-node replicates its own changes to its parent, the replication triggers log the change for all the node’s sub-nodes except the current user. Thus, only sub-nodes other than the originator receive the change.
Conversely, CopyCat logs in to the nodes local database using the node name of its parent as user name. Thus, any change made to the local database during replication will be logged for all sub-nodes other than the node’s parent, and any change made to the parent node will be logged to other sub-nodes, but not to the originating node itself.
Primary key synchronization
One problem with replication is that since data is edited off-line, there is no centralized way to ensure that the value of a field remains unique. One common answer to this problem is to use GUID values. This is a good solution if you’re implementing a new database (except that GUID fields are rather large, and therefore, not very well suited for a primary key field), but if you have an existing database that needs replication, it would be very difficult to replace all primary or unique key fields by GUID values.
Since GUID fields are, in many cases, not feasible, CopyCat implements another solution. CopyCat allows you to define for each primary key field (as well as up to three other fields for which unicity is to be maintained) a synchronization method. In most cases, this will be either a generator, or a stored procedure call, though it could be any valid SQL clause. Upon replication, this SQL statement is called on the server side in order to calculate a unique key value, and the resulting value is then applied to the local database. Only after the key values (if any) have been changed locally is the record replicated to the server.
When replicating from the parent node to the local node however, this behaviour does not take place: the primary key values on the server are considered to be unique.
Conflict management
Suppose a replication node and its parent both modify the same record during the same time period. When the replicator connects to its parent to replicate its changes, it has no way of telling which of the two nodes has the most up-to-date version of the record: this is a conflict.
CopyCat automatically detects conflicts, logs them to a dedicated table, and disables replication of that record in either direction until the conflict is resolved. The conflicts table holds the user names of both nodes involved in the conflict, as well as a field called “CHOSEN_USER”. In order to solve the conflict, the user simply has to put in this field the name of the node which has the correct version of the record, and automatically, upon the next replication, the record will be replicated and the conflict resolved.
This system was carefully designed to function correctly even in some of the complex scenarios that are possible with CopyCat. For instance, the conflict may in reality be between two nodes that are not directly connected to each other: since CopyCat nodes only ever communicate directly with their parent, there is no way to tell if another node may not have a conflicting update for a certain record. Furthermore, it’s entirely possible that two nodes (having the same parent) should simultaneously attempt to replicate the same record to their parent. By using a snapshot-type transaction, and careful ordering of the replication process, these issues are handled transparently.
Difficult database structures
There are certain database architectures that are difficult to replicate. Consider for example a “STOCK” table, containing one line per product, and a field holding the current stock value. Suppose that for a certain product, the current stock value being 45, node A adds 1 item to stock, setting the stock value to 46. Simultaneously, node B, adds 2 items to stock thereby setting the current stock value to 47. How can such a table then be replicated? Neither A nor B have the correct value for the field, since neither take into consideration the changes from the other node.
Most replicators would require such an architecture to be altered. Instead of having one record hold the current stock value of product, there could be one line per change. This would solve the problem. However, restructuring large databases (and the end-user applications that usually go with them) could be a rather major task. CopyCat was specifically designed to avoid these problems altogether, rather than require the database structure to be changed.
To solve this kind of problem, CopyCat introduces stored procedure “replication”. That is, a mechanism for logging stored procedure calls, and replicating them to other nodes. When dealing with an unreplicatable table (like in the example above) one solution is to make a stored procedure which can be called for updating the table, and using stored procedure replication in order to replicate each of these calls. Thus, continuing the example above, instead of replicating the values of the STOCK table, the nodes would replicate only their changes to these values, thereby correctly synchronizing and merging the changes to the STOCK table.
PART 2 : GETTING STARTED WITH COPYCAT
1.Copycat is available in two distinct forms :
- As a set of Delphi / C++ Builder components
- As a standalone Win32 replication tool
We will now present each one of these products.
CopyCat can be obtained as a set of Delphi / C++ Builder components, enabling you to use replication features painlessly and in many different situations, and sparing you the tedious task of writing and testing custom database synchronization solutions.
Using these components, replication or synchronization facilities can be seamlessly built into existing solutions. As an example, we have used the CopyCat components in an application used by our development team on their laptops to keep track of programming tasks. When the developers need to go on-site to visit a customer, the application runs in local mode on the laptop. When they return and connect up to the company network, any changes they have made on-site are automatically synchronized with the main Interbase database running on a Linux server.
Many more applications are possible since the CopyCat components are very flexible and allow for synchronization of even a single table!
Below is a concise guide for getting started with the CopyCat component suite :
- Download and install the evaluation components from http://www.microtec.fr/copycat
-
Prepare databases
1) Open Delphi, and compile the data provider package(s) for the DAC that you plan to use (currently IBX and FIBPlus are supported). These are components for interfacing between the CopyCat components and the underlying data-access components.
2) Open and run the “Configuration” example project (requires the IBX provider).
3) On the “General” tab, fill in the connection parameters, and press “Connect to Database”.
4) On the “Users” tab, provide the list of sub-nodes for the current database.
5) On the “Tables” tab, for each table that you want to replicate, set a priority (relative to the other tables), and double-click on the “PKn generator” columns to (optionally) fill the primary key synchronization method. Once these settings have been made, set the “Created” field to ‘Y’, so as to generate the meta-data.
6) On the “Procedures” tab, set “Created” to ‘Y’ for each procedure that you want to replicate, after having set a priority.
7) Apply all the generated SQL to all databases that should be replicated.
8) For each database to be replicated, set the list of sub-nodes (in the RPL$USERS table).
-
Replicate
1. In Delphi, open the “Replicator” example project.
2. Drop a provider component on the form, and hook it up to the TCcReplicator’s DBProvider property.
3. Setup the LocalDB and RemoteDB properties of the TCcReplicator with the connection parameters for the local and remote databases.
4. Fill in the user name of the local and remote nodes, as well as the SYSDBA user name and password (needed for primary key synchronization).
5. Compile and run the example.
6. Press the “Replicate now” button.
2. CopyTiger : the CopyCat Win32 standalone replication tool
For those who have replication or synchronization needs but do not have the time or resources to develop their own integrated system, Microtec proposes a standalone database replication tool based on the CopyCat technology called COPYTIGER.
Features include :
- Easy to use installer
- Independent server Administrator tool (CTAdmin)
- Configuration wizard for setting up links to master / slave databases
- Robust replication engine based on Microtec CopyCat
- Fault-tolerant connection mechanism allowing for automatic resumption of lost database connections
- Simple & intuitive control panel
- Automatic email notification on certain events (conflicts, PK violations, etc.)
Visit the CopyTiger homepage to download a time-limited trial version.
http://www.microtec.fr/copycat/ct
SUMMARY
In today’s connected world, database replication and synchronization are topics of great interest among all industry professionals. With the advent of Microtec CopyCat, the Interbase / Firebird community is obtaining a two-fold benefit :
- By encapsulating all the functionality of a replicator into Delphi components, CopyCat makes it easier than ever to integrate replication and synchronization facilities into custom applications,
- By providing a standalone tool for the replication of Interbase / Firebird databases, Microtec is responding to another great need in the community - that of having a powerful and easy-to-use replication tool, and one that can be connected to an existing database without disrupting it’s current structure.
CopyCat is being actively developed by Microtec and many new features are being worked on such as support for replicating between heterogeneous database types (PostgreSQL, Oracle, MSSQL, MySQL, NexusDB, …) as well as a Linux / Kylix version for the components and the standalone tool.
You can find more information about CopyCat at: http://www.microtec.fr/copycat, or by contacting us at copycat@microtec.fr. As as promotional operation, Microtec is offering a 20% discount to IBDeveloper readers for the purchase of any CopyCat product, for the first 100 licenses sold! Click on the BUY NOW icon on our web site and enter the following ShareIt coupon code: ibd-852-150



