Transactions
A "transaction" is defined as any one execution of a user
program in a DBMS. It is a set of actions on some data object.
"Data
objects" are the units in which programs read or write information. The units
could be pages, records, and so on, but this is dependent on the DBMS.
There are four important properties of transactions that a DBMS must ensure
to maintain data in the face of concurrent access and system failures:
1. Users should be able to regard the
execution of each transaction as atomic: do whole or do nothing. Either all
actions are carried out or none are. Users should not have to worry about the
effect of incomplete transactions (say, when a system crash occurs).
2. Each transaction, run by itself
with no concurrent execution of other transactions, must preserve the
consistency of the database. This property is called consistency, and the DBMS assumes that it holds for each
transaction. Keeping transaction consistent is the responsibility of the person
writing the transaction query. F you do not write a proper transaction (i.e. if
the logic used is incorrect) then though the transaction will be atomic but
will not be consistent.
3. Users should be able to understand
a transaction without considering the effect of other concurrently executing
transactions, even if the DBMS interleaves the actions of several transactions
for performance reasons. This property is referred to as isolation: Transactions are isolated, or protected, from the
effects of concurrently scheduling other transactions.
4. Once the DBMS informs the user
that a transaction has been successfully completed, its effects should persist
even if the system crashes before all its changes are reflected on disk. This
property is called durability.
Schedule
A transaction is seen by the DBMS as
a series, or list, of actions. The actions that can be executed by a
transaction include reads and writes of database objects.
In addition to reading and writing,
each transaction must specify as its final action either commit (i.e., complete
successfully) or abort (i.e., terminate and undo all the actions carried out
far).
A schedule is a list of actions (reading, writing, aborting, or
committing) from a set of transactions, and the order in which two actions of a
transaction T appear in a schedule must be the same as the order in which they
appear in T. E.g. Let transaction T1 consist of set of actions (a, b, c, d)
then b will always come after a and before c.
A schedule that contains either an
abort or a commit for each transaction whose actions are listed in it is called
a complete schedule.
If the actions of different
transactions are not interleaved i.e. transactions are executed from start to
finish, one by one then the schedule is called as a serial schedule.
We want to
have effect of serial schedule in the light of concurrency. We want
serializable schedule.
Due to OS’s short-term scheduler we
cannot decide which process will be given priority, what will be the time slot
how many statements will be executed and how much time each statement will take
for execution. Briefly, you cannot control the pace with which transactions
move ahead. Any schedule S is acceptable if it gives effect same as executing
transactions in isolation.
A serializable schedule over a set S is a schedule whose effect on
any consistent database instance is guaranteed to be identical to that of some
complete serial schedule over S, (where S is a set of committed transactions).
That is, the database instance that results from executing the given schedule
is identical to the database instance that will result from executing the
transactions in some serial order.
Anomalies
1. Dirty Read (WR)
The first source of anomalies is that
a transaction T2 could read a database object A that has been modified by
another transaction T1, which has not yet committed. Such a read is called a
dirty read.
E.g. consider two transactions
T1:
Transfer Rs. 100/- from account A to account B.
T2: Calculate 1% interest on each
account.
Suppose T1 was invoked by a low
priority user and Y2 is run by a user with high priority. When T2 came, T1 was
evicted and T2 was given full control. The complete schedule is as follows.
T1
|
T2
|
Comment
|
R(A)
|
T1
reads values of A. A = 1000
|
|
W(A)
|
T1
updates the value of A.
A = 900
|
|
R(A)
|
A = 900
|
|
W(A)
|
A = 909
|
|
R(B)
|
B = 400
|
|
W(B)
|
B = 404
|
|
Commit
|
||
R(B)
|
B = 404
|
|
W(B)
|
B = 504
|
|
Commit
|
And if T1 aborts then -
T1
|
T2
|
Comment
|
R(A)
|
T1
reads values of A. A = 1000
|
|
W(A)
|
T1
updates the value of A.
A = 900
|
|
R(A)
|
A = 900
|
|
W(A)
|
A = 909
|
|
R(B)
|
B = 400
|
|
W(B)
|
B = 404
|
|
Commit
|
||
Abort
|
A =
1009
|
The result of this schedule is
different from any result that we would get by running one of the two
transactions first and then the other. The problem can be traced to the fact
that the value of A written by T1 is read by T2 before T1 has completed all its
changes.
2. Unrepeatable Reads (RW Conflicts)
The second way in which anomalous
behavior could result is that a transaction T2 could change the value of an
object A that has been read by a transaction T1, while T1 is still in progress.
This situation could not arise in a serial execution of two transactions; it is
called an unrepeatable read.T1 is
still in progress. If T1 tries to read the value of A again, it will get a
different result, even though it has not modified A in the meantime.
Another problem is illustrated by
this example.
Let T1: A = A + 1 and T1: A = A -1 1
T1
|
T2
|
Comment
|
R(A)
|
A = 5
|
|
R(A)
|
A = 5
|
|
W(A)
|
A = 6
|
|
W(A)
|
A = 4
|
|
Commit
|
||
Commit
|
If we execute T1 and T2 in any serial
order then effectively the value of A remains the same but here it is
different. Hence the database is not inconsistent state.
3. Overwriting Uncommitted Data (WW Conflicts)
This problem occurs when you loose
data written by one of the transactions. It is lost and never recorded. Suppose
that A and B are two employees, and their salaries must be kept equal.
Transaction T1 sets their salaries to
$1,000 and transaction T2 sets their salaries to $2,000. Note that neither
transaction reads a salary value before writing it, such a write is called a
blind write.
If we execute these in the serial
order T1 followed by T 2, both receive the salary $2,000; the serial order T2
followed by T 1 gives each the salary $1,000. Either of these is acceptable
from a consistency standpoint.
Now, consider the following
interleaving of the actions of T1 and T2.
T1
|
T2
|
Comment
|
W(A)
|
A =
1000
|
|
W(B)
|
B = 2000
|
|
W(B)
|
B = 1000
|
|
W(A)
|
A = 2000
|
|
Commit
|
||
Commit
|
The result is not identical to the
result of either of the two possible serial executions, and the interleaved
schedule is therefore not serializable. It violates the desired consistency
criterion that the two salaries must be equal.
No comments:
Post a Comment
Your comments are very much valuable for us. Thanks for giving your precious time.