Transaction processing fundamentals
Each transaction must preserve the ACID properties (atomicity, consistency, isolation, durability) so that the system reflects a correct, consistent state of reality. In a centralized computing environment, application components and resources are located at a single site, and transaction management involves only a local data manager running on a single machine. In a distributed computing environment, by contrast, the resources are spread across multiple systems, so transaction management must be done at both the local and the global level. A local transaction is one that involves activities in a single local resource manager. A distributed, or global, transaction is executed across multiple systems, and its execution requires coordination between the global transaction management system and the local data managers of all the systems involved.
The two primary elements of any transactional system are the resource manager and the transaction manager (TM); the latter is also known as a transaction processing monitor (TP monitor). In centralized systems, both the TP monitor and the resource manager are integrated into the DBMS server. Supporting the advanced functionality required in a distributed component-based system requires separating the TP monitor from the resource managers.
Transaction management taxonomy
The most common configurations of transactional enterprise systems are the following.
- TP Less: This configuration does not use any separate transaction management middleware. Each SQL statement is treated as a separate transaction, which makes the approach efficient and inexpensive. Its major limitation is that it cannot be used when flat files are involved or when several updates need to be grouped into a single transaction.
- TP Lite: This approach uses database stored procedures to handle updates. Since most RDBMS vendors provide some integrated TP facilities, each transaction is defined as a stored procedure. This approach works well with replication servers: the primary copy is updated through stored procedures, and the replication server propagates the changes to the secondary copies.
- TP Heavy: This is the approach followed in enterprise-class transactional systems that need to interface with legacy systems. It uses a distributed transaction manager, or coordinator, to handle transactional updates. There are two subcategories. In the first, which is historically the earlier one, a separate transaction manager such as CICS, Encina, Tuxedo, or TopEnd coordinates transactions. In the second, the TP monitor is built into an application server such as WebSphere, iPlanet Application Server, or Microsoft Transaction Server.
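The limitation of the TP Less configuration can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` module in autocommit mode as a stand-in for a database that treats each SQL statement as its own transaction; the `account` table and the function names are assumptions made up for this example, not part of any system described above.

```python
import sqlite3

def make_db():
    # isolation_level=None puts the sqlite3 module in autocommit mode:
    # each statement is committed on its own unless BEGIN is issued explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
    return conn

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

def transfer_per_statement(conn, amount, fail_midway=False):
    # "TP Less" style: every UPDATE is a separate, already-committed transaction.
    conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'a'", (amount,))
    if fail_midway:
        raise RuntimeError("simulated failure between the two updates")
    conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'b'", (amount,))

def transfer_grouped(conn, amount, fail_midway=False):
    # Both updates are grouped into one transaction, so a failure undoes both.
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'a'", (amount,))
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'b'", (amount,))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

With a failure between the two updates, the per-statement version leaves the debit applied without the matching credit, while the grouped version leaves both balances unchanged.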
There are two ways to specify transactions, namely, (i) programmatic and (ii) declarative.
- Programmatic: In programmatic transaction specification, the application delineates a group or sequence of operations that constitutes a transaction. The most common approach is to mark the thread executing the operations for transaction processing. A transaction specified this way can be suspended by unmarking the thread and resumed later by explicitly propagating the transaction context from the point of suspension to the point of resumption. The transaction ends with either a commit or a rollback request. The commit request directs all the participating resource managers to record the effects of the transaction's operations permanently. The rollback request makes the resource managers undo the effects of all operations in the transaction.
- Declarative: Component-based transaction processing systems, such as application servers based on the Enterprise JavaBeans specification or COM components hosted by Microsoft Transaction Server, support declarative transaction specification. In this approach, components are marked as transactional at deployment time. This has two implications. First, the responsibility for transaction specification shifts from the application to the container hosting the component (hence the name "container-managed transaction"). Second, the point at which the specification takes effect is postponed from application build time to component deployment time.
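The difference between the two styles can be sketched as follows. The `Store` class is a hypothetical in-memory resource, and the `transactional` decorator only approximates what a container does: in a real component platform the marking lives in deployment metadata rather than in the application code.

```python
import functools

class Store:
    """Hypothetical in-memory resource with staged (uncommitted) writes."""
    def __init__(self):
        self.data = {}
        self._pending = None

    def begin(self):
        self._pending = dict(self.data)   # work on a private copy

    def put(self, key, value):
        self._pending[key] = value

    def commit(self):
        self.data, self._pending = self._pending, None

    def rollback(self):
        self._pending = None              # discard staged writes

# Programmatic: the application code marks the transaction boundaries itself.
def save_order_programmatic(store, order_id, fail=False):
    store.begin()
    try:
        store.put(order_id, "placed")
        if fail:
            raise ValueError("validation failed")
        store.put("last_order", order_id)
        store.commit()
    except Exception:
        store.rollback()
        raise

# Declarative (approximation): the function is only marked as transactional;
# the wrapper, playing the role of the container, begins and ends the transaction.
def transactional(func):
    @functools.wraps(func)
    def wrapper(store, *args, **kwargs):
        store.begin()
        try:
            result = func(store, *args, **kwargs)
        except Exception:
            store.rollback()
            raise
        store.commit()
        return result
    return wrapper

@transactional
def save_order_declarative(store, order_id, fail=False):
    store.put(order_id, "placed")
    if fail:
        raise ValueError("validation failed")
    store.put("last_order", order_id)
```

Both functions behave the same from the caller's point of view; only the place where the boundaries are decided differs.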
A global or distributed transaction consists of several subtransactions and is treated as a single recoverable atomic unit. The global transaction manager is responsible for managing distributed transactions by coordinating with different resource managers to access data at several different systems. Since multiple application components and resources participate in a transaction, the transaction manager must establish and maintain the state of the transaction as it proceeds. This is achieved by using a transaction context, which is an association between the transactional operations on the resources and the components invoking the operations. During the course of a transaction, all the threads participating in the transaction share the same transaction context. The scope of a transaction context logically encapsulates all the operations performed on transactional resources during a transaction. The transaction manager needs to analyze the transaction request, decompose the transaction into subtransactions, and send each subtransaction, together with the propagated transaction context, to the associated resource manager. The transaction context is usually maintained transparently by the underlying transaction manager.
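A transaction context shared by several threads can be sketched as below. This is a simplified illustration, not how any particular transaction manager is implemented: the class names, the `record` method, and the component and resource names are all assumptions. The thread-to-context association is kept in thread-local storage, and the context is propagated to a second thread explicitly.

```python
import threading
import uuid

class TransactionContext:
    """Associates operations on resources with the components that invoke them."""
    def __init__(self):
        self.txid = uuid.uuid4().hex
        self.operations = []
        self._lock = threading.Lock()

    def record(self, component, resource, operation):
        with self._lock:
            self.operations.append((component, resource, operation))

class TransactionManager:
    def __init__(self):
        self._local = threading.local()   # per-thread association with a context

    def begin(self):
        ctx = TransactionContext()
        self._local.ctx = ctx             # mark the calling thread
        return ctx

    def current(self):
        return getattr(self._local, "ctx", None)

    def suspend(self):
        ctx = self.current()
        self._local.ctx = None            # unmark the calling thread
        return ctx

    def resume(self, ctx):
        self._local.ctx = ctx             # re-associate a propagated context

def run_demo():
    tm = TransactionManager()
    ctx = tm.begin()
    tm.current().record("OrderComponent", "orders_db", "insert")

    seen_in_worker = []

    def worker():
        seen_in_worker.append(tm.current())   # a new thread starts with no context
        tm.resume(ctx)                        # explicit propagation
        tm.current().record("BillingComponent", "billing_db", "update")

    t = threading.Thread(target=worker)
    t.start()
    t.join()

    suspended = tm.suspend()
    while_suspended = tm.current()
    tm.resume(suspended)
    return {
        "ctx": ctx,
        "worker_initial": seen_in_worker[0],
        "while_suspended": while_suspended,
        "after_resume": tm.current(),
    }
```

After `run_demo()` returns, the single context holds the operations recorded by both threads, which is the sense in which the participating threads share one transaction context.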
Resource managers inform the transaction manager of their participation in a transaction through a process called resource enlistment. The transaction manager uses the enlistments to keep track of all the resources participating in a transaction and to coordinate the transactional work performed by the resource managers through the two-phase commit and recovery protocols. All enlistments are removed at the end of the transaction, i.e., after it either commits or rolls back. To ensure atomicity, the transaction manager monitors the execution of the transaction and determines whether to commit or roll back the changes it has made.
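Enlistment and the commit decision can be sketched with a toy coordinator. This is a minimal illustration under simplifying assumptions: the class and method names are invented for the example, resource managers vote through a boolean flag, and the logging and recovery steps that a real two-phase commit implementation needs are omitted.

```python
class ResourceManager:
    """Hypothetical resource manager that votes in phase 1 and obeys phase 2."""
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1 vote: True means "ready to commit".
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"

class TransactionManager:
    def __init__(self):
        self.enlisted = []

    def enlist(self, rm):
        # Resource enlistment: the resource manager announces its participation.
        if rm not in self.enlisted:
            self.enlisted.append(rm)
            rm.state = "active"

    def complete(self):
        try:
            # Phase 1: collect a vote from every enlisted resource manager.
            votes = [rm.prepare() for rm in self.enlisted]
            decision = all(votes)
            # Phase 2: apply the same outcome everywhere (atomicity).
            for rm in self.enlisted:
                if decision:
                    rm.commit()
                else:
                    rm.rollback()
            return "committed" if decision else "rolled_back"
        finally:
            # Enlistments are removed once the transaction has ended.
            self.enlisted.clear()
```

A single negative vote in the first phase is enough to roll back every participant, including those that voted to commit.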