14 Transactions
14.1 Transaction Concept
TRANSACTION
- a unit of program execution that accesses and possibly updates various data items.- usually initiated by a data-manipulation language, or programming language w/embedded database accesses
- transactions are grouped btw function calls, begin transaction and end transaction
- one transaction is every step btw the begin and end calls
- a transaction must execute in it entirety or not at all
\[\\[.1cm]\]
ATOMICITY
- this ‘all-or-none’ property of transactionISOLATION
- ensuring that transactions operate properly w/o interference from concurrently executing database statementsDURABILITY
- ensuring that the transaction’s actions remain even after system crashesCONSISTENCY
- execution of a transaction in isolation preserves the consistency of the database
14.2 A Simple Transaction Model
consistency is the responsibility of the application programmer who codes the transaction
atomicity and durability is handled by a component of the database called the recovery system
isolation is maintained by the concurrency-control system component of a databse.
- isolation makes sure that concurrenct executions results in the same resulting state as the end state where all transactions were ran in a serial order, one after another
14.3 Storages
- types of storage:
- volatile storage - main memory and cache memory do not usually survive system crashes but has fast access to data
- non-volatiel storage - secondary and tertiary storage survives system crashes but has slower access to data
- stable storage - storage that replicates info. in several non-volatile disks with independent failure modes; info in stable storage is theoretically never lost
\[\\[.1cm]\]
for a transaction to be durable, changes need to be written to stable storage
for a transaction to be atomic, log records need to be written to stable storage before any changes are made to database
14.4 Atomicity and Durability
ABORTED
- to halt execution of a transcationROLLED BACK
- when changes caused by an aborted transaction have been undone- this is typically done by recovery system and a log
- first, record the identifier of the transaction performing the change; the identifier of the data item being modified; the old values; and then the new values
- after everything is recorded, the database is modified
\[\\[.1cm]\]
c.COMMITTED
- a successfully execution of a transaction
d.COMPENSATING TRANSACTION
- the only way to undo the effects of a committed transaction
- not always possible to create such a compensating transaction so usually the user writes and executes a compensating transaction
\[\\[.1cm]\]
- a successful execution is one where the transaction enters the committed state and not the other 4 states
14.5 Isolation
allowing multiple transactions to update data at once can cause consistency problems
but concurrency has two big benefits:
- Improved throughput and resource utilization - the num. of transactions executed in a given time is higher and the processor and disk spends less time idle
- Recudes waiting time - some transactions operate on different parts of the databases and are short, so its better to let them share CPU cycles and disk accesses
\[\\[.1cm]\]
CONCURRENCY-CONTROL SCHEMES
- variety of mechanism employed by the database system to control the interaction among concurrent transactions and to prevent them from destroying the consistency of the database- individually, each transaction may be correct but when ran concurrently, they still can violated the isolation property and cause inconsistency
\[\\[.1cm]\]
Schedules - execution sequences that helps identify isolation-abiding transactions and database consistency
Serial v. Concurrent Schedules:
- serial schedules are one where the instructions belonging to one single transaction appear together in that schedule
- concurrent schedules are ones where instructions are executed concurrently
- Ex.
\[\\[.1cm]\]
\[\\[.1cm]\]
\[\\[.1cm]\]
SERIALIZABLE
- if a concurrent schedule can be equivalent to s serial schedule
14.6 Serializability
- Different forms of schedule equivalence give rise to
the notions of:
- conflict serializability
- view serializability
\[\\[.1cm]\]
conflict - two instruction commands are in conflict iff their transactions are different and at least one of the commands is a write(…) instruction
Ex. 1) i = read(Q), j = read(Q). l and j don’t conflict. 2) i = read(Q), j = write(Q). They conflict. 3. i = write(Q), j = read(Q). They conflict 4. i = write(Q), j = write(Q). They conflict
If i and j are consecutive instructions are from different transactions and do not conflict, we can swap the order of i and j to produce a new schedule
Ex. • Swap the read(B) instruction of T1 with the read(A) instruction of T2. • Swap the write(B) instruction of T1 with the write(A) instruction of T2. • Swap the write(B) instruction of T1 with the read(A) instruction of T2.
\[\\[.1cm]\]
CONFLICT EQUIVALENT
- when a schedule can be transformed into another schedule by a series of swaps on nonflicting instructionsa schedule S is conflict serializable if it is conflict equivalent to a serial schedule
Schedule 3 can be transformed into Schedule 6, a serial schedule where T2 follows T1, by series of swaps of non-conflicting instructions. Therefore Schedule 3 is conflict serializable.
PRECEDENCE GRAPH
- a direct graph used for testing serializability where the vertices are the transactions
- Ex.
14.7 Isolation and Atomicity
If T1 fails, we need to undo the effects of T1 to ensure atomicity
IF T2 is dependent on T1 (i.e. T2 reads data written by T2), then T2 must also be aborted
c.RECOVERABLE SCHEDULE
- a schedule where for each pair of transactions T1 and T2, where T2 reads data previously written by T1, the commit for T1 must appear before the commit for T2
- EX.
CASCADING ROLLBACK
– a single transaction failure leads to a series of
transaction rollbacks. Consider the following schedule where none of
the transactions has yet committed (so the schedule is recoverable)
- EX.
\[\\[.1cm]\]
CASCADELESS SCHEDULES
— schedules where cascading rollbacks cannot occur- For each pair of transactions T1 and T2 such that T2 reads a data item previously written by T1, the commit of T1 must appear before the read operation of T2
- every cascadeless schedule is also recoverable
- best to restrict the schedules to those that are cascadeless
14.8 Transaction Isolation Levels
- database systems offer isolation levels weaker than serializability to allow less re- striction of concurrency and thus improved performance
- this introduces a risk of inconsistency that some applications find acceptable
\[\\[.1cm]\] b. SQL allows a transaction to specify the level it may executed in
- the isolation levels in SQL are
- serializable is the default & only allows serializable execution
- repeatable read allows only commited data to be read and requires that repeated reads of the same record must return the same value
- read committed allows only commited data to be read & successive reads of record may return different (but committed) values
- read uncommitted allows uncommitted data to be read; lowest isolation level allowed by SQL