Introduction
Traditional transaction concepts (as discussed in Chapters 14, 15, and 16) primarily focused on non-interactive and short-duration transactions. These approaches face challenges when applied to systems involving human interaction, where transactions become significantly longer and exhibit different properties.
Properties of Long-Duration Transactions
Long-duration transactions, often arising from human interaction, have distinct characteristics that make standard transaction management techniques problematic:
- Long Duration: Due to human response times (which are slow relative to computer speeds) and the nature of design and interactive tasks (which can span hours, days, or even longer), these transactions extend over significant periods, both from a human and a machine perspective.
- Exposure of Uncommitted Data: Data generated or displayed during a long-duration transaction is, by definition, uncommitted because the transaction might abort. This means users (and potentially other transactions) may interact with data that is not yet finalized. Cooperating users might even need to exchange this uncommitted data.
- Subtasks: Interactive transactions often involve multiple subtasks initiated by the user. It is crucial to allow users to abort individual subtasks without necessarily aborting the entire transaction.
- Recoverability: Aborting a long-duration transaction due to a system crash is highly undesirable, as it would lead to significant loss of human work. Recovery should restore the transaction to a state very close to the point of failure, minimizing lost work.
- Performance (Response Time): In interactive systems, response time is the primary performance metric (as opposed to throughput in non-interactive systems). Fast and predictable response times are critical for user efficiency and satisfaction.
26.6.1 Nonserializable Executions
Enforcing strict serializability (as in Chapter 15) poses significant problems for long-duration transactions. The standard concurrency control protocols have detrimental effects:
- Problems with Two-Phase Locking (2PL):
- Long Waits: If a lock cannot be immediately granted, the requesting transaction must wait until the lock is released. If a long-duration transaction holds the lock, this wait will also be long, increasing response time and the probability of deadlocks.
- Problems with Graph-Based Protocols:
- Locking More Data: Graph-based protocols impose an ordering on data item access. Transactions might need to lock more data than they strictly require to adhere to this order.
- Long Lock Holding: Transactions must hold locks until it’s guaranteed they won’t be needed again, leading to potentially long waits.
- Problems with Timestamp-Based Protocols:
- Transaction Aborts: While avoiding waits, timestamp-based protocols rely on transaction aborts in certain situations. Aborting a long-duration transaction results in substantial wasted work, impacting both performance and user satisfaction.
- Problems with Validation Protocols:
- Transaction Aborts: Similar to timestamp-based protocols, validation protocols use transaction aborts to ensure serializability, leading to the same issues with long-duration transactions.
- Cascading Rollback:
- Exposure of uncommitted data increases the chance of cascading rollbacks. Avoiding this with strict 2PL (holding exclusive locks until the end) leads back to long waiting times.
Conclusion: Enforcing strict serializability inherently leads to either long-duration waits, aborts of long-duration transactions, or both. Theoretical results support this incompatibility. Snapshot isolation and optimistic concurrency control without read validation are partial solutions, but issues persist when conflicts are more likely (due to longer durations).
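The long-wait problem under 2PL can be illustrated with a small sketch (simulated with Python threads and sleeps, not a real database; the timings are hypothetical):

```python
import threading
import time

lock = threading.Lock()        # stands in for an exclusive lock on a data item
wait_seen = []

def long_transaction():
    with lock:
        time.sleep(0.3)        # simulated human "think time" while the lock is held

def short_transaction():
    t0 = time.monotonic()
    with lock:                 # under 2PL, must wait until the lock is released
        wait_seen.append(time.monotonic() - t0)

t1 = threading.Thread(target=long_transaction)
t2 = threading.Thread(target=short_transaction)
t1.start()
time.sleep(0.05)               # let the long transaction acquire the lock first
t2.start()
t1.join()
t2.join()

# The short transaction's response time is dominated by waiting on the long holder.
assert wait_seen[0] > 0.15
```

The short transaction does almost no work of its own, yet its response time is essentially the long transaction's lock-holding time, which is exactly the pathology described above.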
26.6.2 Concurrency Control
The core goal of concurrency control is to maintain database consistency. Serializability achieves this, but it’s not the only way. Schedules that are not serializable can still preserve consistency.
- Example (Non-Serializable but Consistent):
Consider two accounts, A and B, with the consistency constraint that A + B remains constant. The following schedule (Figure 26.5) is not conflict serializable, but it does preserve the sum:
T1                  | T2
--------------------|--------------------
read(A)             |
A := A - 50         |
write(A)            |
                    | read(B)
                    | B := B - 10
                    | write(B)
read(B)             |
B := B + 50         |
write(B)            |
                    | read(A)
                    | A := A + 10
                    | write(A)
- Key Points about Correctness without Serializability:
- Consistency Constraint Dependence: Correctness depends on the specific consistency constraints defined for the database.
- Operation Properties: Correctness also depends on the properties of the operations performed by each transaction.
- Alternative methods for ensuring database consistency:
  - Splitting the database into subdatabases on which concurrency can be managed separately.
  - Treating some operations besides read and write as fundamental low-level operations and extending concurrency control to deal with them.
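The schedule of Figure 26.5 can be replayed in a few lines to confirm that, although it is not conflict serializable, it preserves the constraint (the initial balances are hypothetical):

```python
# Replay of the Figure 26.5 interleaving: T1 moves 50 from A to B,
# T2 moves 10 from B to A, with their steps interleaved as in the schedule.
def run_schedule():
    db = {"A": 100, "B": 200}          # hypothetical initial balances

    # Each entry is (variable, function) applied as read-modify-write.
    schedule = [
        ("A", lambda v: v - 50),       # T1: read(A); A := A - 50; write(A)
        ("B", lambda v: v - 10),       # T2: read(B); B := B - 10; write(B)
        ("B", lambda v: v + 50),       # T1: read(B); B := B + 50; write(B)
        ("A", lambda v: v + 10),       # T2: read(A); A := A + 10; write(A)
    ]
    for var, step in schedule:
        db[var] = step(db[var])        # read, compute, write
    return db

db = run_schedule()
# Not conflict serializable, yet the consistency constraint holds:
assert db["A"] + db["B"] == 300
print(db)   # {'A': 60, 'B': 240}
```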
26.6.3 Nested and Multilevel Transactions
To improve parallelism and handle failures more granularly, long-duration transactions can be structured as nested or multilevel transactions.
- Nested Transaction: A transaction T is composed of a set T = {t1, t2, ..., tn} of subtransactions and a partial order P on T.
- Subtransaction Aborts: A subtransaction ti can abort without forcing T to abort. T can then either restart ti or choose not to execute it.
- Committing to the Parent: If ti commits, it commits to T. This is not a permanent commit (unlike in standard transactions); ti can still be aborted (or require compensation) if T aborts.
- Partial Order Preservation: The execution of T must respect the partial order P: if an edge ti → tj appears in the precedence graph, then tj → ti cannot be in the transitive closure of P.
- Multilevel Transaction: If a subtransaction of T is allowed to release locks upon completion, T is considered a multilevel transaction.
- Saga: A multilevel transaction representing a long-duration activity is often called a saga.
- Nesting Levels: Nesting can have multiple levels, representing subtasks, sub-subtasks, and so on. The lowest level consists of standard read and write operations.
- Example (Illustrating Higher-Level Operations):
The example from Figure 26.5 can be rewritten using subtransactions and increment/decrement operations to enhance concurrency:
- T1 consists of:
  - T1,1: subtracts 50 from A.
  - T1,2: adds 50 to B.
- T2 consists of:
  - T2,1: subtracts 10 from B.
  - T2,2: adds 10 to A.
No specific ordering is imposed on T1,1, T1,2, T2,1, and T2,2. Any execution of these subtransactions produces a correct result. The original schedule corresponds to <T1,1, T2,1, T1,2, T2,2>.
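One way to see why no ordering is needed: increment and decrement operations commute, so every interleaving of the four subtransactions yields the same final state. A small sketch (the names mirror the subtransactions of the example; initial balances are hypothetical):

```python
from itertools import permutations

# Subtransactions modeled as commutative increment/decrement operations.
subtransactions = {
    "T1,1": ("A", -50),   # subtracts 50 from A
    "T1,2": ("B", +50),   # adds 50 to B
    "T2,1": ("B", -10),   # subtracts 10 from B
    "T2,2": ("A", +10),   # adds 10 to A
}

results = set()
for order in permutations(subtransactions):
    db = {"A": 100, "B": 200}          # hypothetical initial balances
    for name in order:
        var, delta = subtransactions[name]
        db[var] += delta               # each increment/decrement is atomic here
    results.add((db["A"], db["B"]))

# All 4! = 24 orderings produce the same final state, so the subtransactions
# can run concurrently in any order without a correctness problem.
assert results == {(60, 240)}
```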
26.6.4 Compensating Transactions
To minimize long waits, uncommitted updates may be exposed to other transactions. However, this creates the potential for cascading rollbacks. Compensating transactions address this issue.
- Concept: If a transaction T is divided into subtransactions t1, t2, ..., tn, and a subtransaction ti commits (releasing its locks), a compensating transaction cti is defined to undo the effects of ti should T later abort.
- Execution Order: If subtransactions t1, ..., tk have committed and tk+1 is executing when T needs to abort, tk+1 is aborted, and the compensating transactions ctk, ..., ct1 are executed in reverse order.
- Examples:
- Figure 26.5 Example: If T1 fails after T2 has released its locks, compensating transactions would subtract 10 from A (undoing T2,2) and add 10 to B (undoing T2,1).
- Database Insert with B+-tree Index: If a transaction inserts a record and updates a B+-tree index, the compensating transaction is a delete operation. The resulting B+-tree is consistent but might not have the exact same structure as before the insertion.
- Travel Reservation: If a long-duration transaction reserves flights, cars, and hotels, and the hotel reservation fails, compensation might involve deleting the old hotel reservation and making a new one, rather than aborting the entire transaction.
- System Crash: If the system crashes during an outer-level transaction, its active subtransactions must be rolled back during recovery.
- Defining Compensation: For simple operations (e.g., increment, B+-tree insertion), compensation is straightforward. For complex transactions, application programmers may need to define the compensating transactions. For interactive transactions, user interaction might be required to determine appropriate compensation.
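A runner for compensating transactions along these lines can be sketched as follows. The trip-booking steps and their names are illustrative, not from any real reservation API:

```python
# Minimal saga-style runner: each step pairs an action with a compensating
# action; on failure, compensations for committed steps run in reverse order.
def run_saga(steps):
    done = []                                     # committed steps, in order
    try:
        for name, action, compensate in steps:
            action()
            done.append((name, compensate))
    except Exception:
        for name, compensate in reversed(done):   # ct_k, ..., ct_1
            compensate()
        return [name for name, _ in done], False
    return [name for name, _ in done], True

log = []
def book_hotel():
    raise RuntimeError("no rooms available")      # hotel reservation fails

steps = [
    ("flight", lambda: log.append("book flight"), lambda: log.append("cancel flight")),
    ("car",    lambda: log.append("book car"),    lambda: log.append("cancel car")),
    ("hotel",  book_hotel,                        lambda: None),
]

committed, ok = run_saga(steps)
assert not ok
# Compensations ran in reverse order of the commits:
assert log == ["book flight", "book car", "cancel car", "cancel flight"]
```

In practice (as the travel example above notes), an application might instead compensate only the failed step, e.g. by booking a different hotel, rather than undoing the whole activity.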
26.6.5 Implementation Issues
Implementing long-duration transaction concepts presents several challenges:
- Recoverability After System Crashes: Long-duration transactions must survive system crashes. This requires:
- Redoing committed subtransactions.
- Undoing or compensating for active short-duration subtransactions at the time of the crash.
- Restoring internal system data (lock tables, timestamps) related to long-duration transactions, which are typically stored in volatile memory. This necessitates logging changes to these internal data structures, in addition to database changes.
- Logging of Updates for Large Data Items: When dealing with large data items (CAD designs, documents, etc.), storing both old and new values in the log is inefficient. Two main approaches are used:
  - Operation Logging (Logical Logging): Only the operation performed and the data item name are logged. Each operation must have a defined inverse operation. Undo uses the inverse operation; redo uses the original operation. Recovery is more complex because redo and undo are not idempotent. Using physical redo and logical undo (as in Section 16.7) combines the benefits of logical logging with simpler recovery.
  - Logging and Shadow Paging: Logging is used for smaller data items, while large data items are made recoverable using shadowing (copy-on-write) techniques. Overhead can be reduced by copying only the modified pages.
- Noncritical Data: It is often worthwhile to exempt some noncritical data from logging.
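The operation-logging idea described above, where each logged operation carries an inverse for undo rather than old and new item values, can be sketched as follows (the operation set and item names are illustrative):

```python
# Sketch of operation (logical) logging: the log records the operation name
# and its argument, not the old/new values of a (possibly large) data item.
operations = {
    "increment": (lambda v, n: v + n, "decrement"),   # inverse of increment
    "decrement": (lambda v, n: v - n, "increment"),   # inverse of decrement
}

db = {"counter": 100}
undo_log = []

def apply_op(item, op, arg):
    fn, inverse = operations[op]
    db[item] = fn(db[item], arg)
    undo_log.append((item, inverse, arg))   # log the inverse operation, not values

apply_op("counter", "increment", 7)
apply_op("counter", "decrement", 3)
assert db["counter"] == 104

# Undo: apply the logged inverse operations in reverse log order.
for item, inverse, arg in reversed(undo_log):
    fn, _ = operations[inverse]
    db[item] = fn(db[item], arg)
assert db["counter"] == 100
```

Because increment is undone by decrement rather than by restoring a saved value, the log entries stay small even when the data items themselves are large, which is the point of the technique.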