Relaxing correctness criteria in real-time DBMSs

June 12, 2017 | Author: Samia Bouzefrane | Categories: Protocol Design, Database Management Systems, Temporal Constraints, Real Time, Concurrency Control, Cata



Relaxing Correctness Criteria in Real-Time DBMSs

Samia SAAD-BOUZEFRANE & Bruno SADEG
Laboratoire d'Informatique du Havre (LIH), Université du Havre
25, rue Philippe Lebon, 76058 Le Havre (France)
{Samia.Bouzefrane, Bruno.Sadeg}@univ-lehavre.fr

Abstract

When manipulating traditional databases, Database Management Systems (DBMSs) must maintain the logical consistency of data while attempting to enhance the concurrent execution of transactions. In real-time DBMSs, transactions have deadlines and data often have validity durations, so real-time DBMSs must also allow transactions to respect their temporal constraints. However, these objectives are difficult to reach simultaneously. Therefore, some proposed work attempts to bring new concepts for relaxing, in a real-time context, the ACID properties that transactions must respect in traditional DBMSs. To this purpose, we introduce in this paper data-imprecision levels and transaction-deadline levels, materialized respectively by two notions: epsilon-data (ε-data), which takes into account possible data imprecision, and delta-deadline (∆-deadline), which takes into account possible data or transaction deadline overruns. We then present transaction concurrency-control and scheduling (CC&S) protocols designed to control temporal and logical inconsistencies in real-time DBMSs, and we show that these protocols bring added value to real-time DBMSs and may outperform classical transaction CC&S protocols, i.e., more transactions meet their deadlines.

1. Introduction

Database Management Systems (DBMSs) are useful to manage applications which manipulate large quantities of data. For decades, other kinds of DBMSs have appeared in order to deal with the characteristics of new applications: data versions (temporal DBMSs), active data (active DBMSs), temporal constraints (real-time DBMSs), multimedia applications (multimedia DBMSs), and so on. We are interested in real-time DBMSs, which are designed to manage applications where transactions have temporal constraints and manipulate large quantities of different kinds of data, which must be logically and temporally consistent in the database [1, 11, 10, 13]. These applications are typically multimedia applications. The objective of maintaining both logical and temporal consistency in the database being difficult to reach [10], some proposed work attempts to bring new concepts in order to relax the transaction isolation property, i.e., serializability, usually considered as the transaction correctness criterion in traditional DBMSs [5, 4]. Indeed, there exist many kinds of applications where strict serializability is not desirable, e.g., applications that may tolerate data imprecision and overrun transaction deadlines. This paper deals with DBMSs designed to manage such applications. For instance, in multimedia applications, the loss of some packets (composing an audio or a video stream) generally has no serious consequences. It is also tolerable that transactions arrive slightly late, thus creating a small shift between the receipt of the sound and the image.

In section 2, we present related work in this area. Section 3 is devoted to the presentation of the ε-data and ∆-deadline notions, used to materialize the imprecisions cited above. In section 4, we show how multimedia applications can benefit from these concepts. In section 5, we show that the ε-data notion introduces new cases that modify the classical transaction concurrency-control and scheduling protocols. In order to apply the ε-data and ∆-deadline notions, we describe, in section 6, protocols designed to bound logical and temporal inconsistencies in the database. An illustrative example is presented in section 7. In section 8, we give some elements of a formal proof which demonstrate the correctness of the protocols introduced and the added value they bring compared to previously proposed protocols. We conclude by giving some perspectives on this work.

2. Related work

For more than ten years, work has addressed new systems that efficiently manage current applications whose characteristics are (1) to manipulate large quantities of data of various types and (2) to manage transactions which have temporal constraints. Multimedia applications might be the best example of such applications [3, 8]. These systems are real-time DBMSs, where much research has concerned the topic of transaction concurrency control and scheduling [1, 2, 10, 11, 13]. In such systems, the database correctness criterion depends not only on the respect of integrity rules while manipulating data, but also on the respect of data and transaction deadlines [13]. This objective being difficult to reach, research has been done in order to find criteria other than strict serializability to characterize the correctness of real-time DBMSs. Indeed, in a lot of applications, the transaction serializability criterion and/or timeliness may be relaxed [10]. According to the kind of transactions manipulated [10], the global objective of real-time DBMSs is (1) to guarantee absolutely that hard-deadline¹ transactions execute before their deadlines, (2) to minimize the number of firm-deadline² transactions that miss their deadlines and (3) to minimize the latency of soft-deadline³ transactions, while maintaining the logical consistency of the database. However, transactions in real-time DBMSs need not have the ACID properties [5] (Atomicity, Consistency, Isolation, Durability), judged too restrictive in a real-time context [10]. Researchers have therefore proposed techniques that relax these properties by taking data semantics into account [13]. These works have led to the development of the Forced Wait and Data Similarity policies [6, 13] and of the epsilon-serializability criterion [9, 12]. In systems that apply the Forced Wait policy, a transaction that is close to its deadline is forced to wait for the availability of the next data value in order to resume its execution. In the Data Similarity policy, a transaction manipulating a data item whose validity interval is almost over is allowed to continue its processing on this value, provided two successive values of this data item are similar [6].

These two concepts are related to periodic transactions only, whereas ε-serializability [9, 12] deals with the manipulation of imprecise data and may be applied to periodic and non-periodic transactions. In the same way, we propose two concepts that apply the imprecision idea to both data values and transaction deadlines. Indeed, we relax both data and transaction properties in order to help real-time applications better reach their objectives.

¹ Missing a hard deadline leads to catastrophic consequences.
² A firm-deadline transaction that misses its deadline becomes useless. It is then aborted.
³ When a soft-deadline transaction misses its deadline, it is allowed to pursue its execution, with a lower quality of service.

3. Data imprecision and deadline levels

ε-data is a notion close to epsilon-serializability [9, 12]. It introduces levels of data imprecision in order to deal with real-time applications that trade off consistency for timeliness, provided the amount of inconsistency introduced can be controlled. For example, in some applications, approximate results obtained early may be more useful than accurate results obtained late. The other notion, ∆-deadline, deals with applications that trade off timeliness for consistency, provided the amount of deadline overrun can be limited. Actually, a lot of current applications may tolerate both bounded data imprecision and controlled transaction deadline overruns. In order to deal with such applications, both the ε-data and ∆-deadline concepts are applied. The ε-data concept deals with imprecision on a data item's absolute value, expressed as a percentage of the current value of the data. ∆-deadline represents the time overrun tolerated for a transaction when it manipulates a real-time data item. Let us give an illustrative example: let a data item d = 20. If d tolerates an imprecision ε, then it is called an ε-data. Let ε = 10%; then every transaction that tries to modify the value of d can do it provided the new value of d lies in the interval [20-2, 20+2] = [18, 22]. The ε-data and ∆-deadline concepts alleviate the frequent restarts of transactions, which then have better chances to meet their deadlines. We then denote by:
- ε-data, a data item whose exact value is v and for which each value in [v - ε, v + ε] is tolerated;
- ∆-deadline, a transaction deadline which is not strict, i.e., if the transaction has not terminated its execution at its deadline d, it can pursue its execution until d + ∆.
Taking the ε and ∆ parameters into account, we obtain a new classification of real-time applications, which refines the classical one [10]:
- ε-∆ applications: applications that may tolerate inaccurate results obtained late, i.e., after deadlines expire (e.g., multimedia applications);
- 0-∆ applications: applications that do not tolerate any inaccurate result, but whose initial deadlines may be overrun by ∆ units of time (e.g., transport reservations, bank transactions, etc.);
- ε-0 applications: applications whose results must be obtained on time, even though these results may be inaccurate or incomplete (e.g., emergency-situation applications);
- 0-0 applications: applications where accurate results must be obtained on time (e.g., aeronautic control systems and robotic factories).
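The interval check of the numerical example above (d = 20 with ε = 10%) can be sketched as follows; the function name is ours, not the paper's:

```python
def within_epsilon(current_value, new_value, eps_fraction):
    """Return True if new_value lies in the tolerated imprecision interval
    [v - eps, v + eps], where eps = eps_fraction * |v| (a percentage of v)."""
    eps = eps_fraction * abs(current_value)
    return current_value - eps <= new_value <= current_value + eps

# d = 20 with eps = 10% tolerates any value in [18, 22]
assert within_epsilon(20, 21, 0.10)      # 21 is inside [18, 22]
assert not within_epsilon(20, 23, 0.10)  # 23 is outside [18, 22]
```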

The remainder of this paper focuses on the ε-∆ application category, whose best representatives are multimedia applications.

4. ε-data and ∆-deadline notions in multimedia applications

Multimedia (MM) applications such as videoconferencing and video-on-demand impose constraints on all system components (large quantities of data, severe temporal constraints due to the continuous nature of audio and video streams, …). Among the temporal constraints, some result from synchronization requirements:
- intra-stream synchronization: latency imposes an upper bound on the maximum delay that can be tolerated; data that exceeds this delay becomes useless and is ignored;
- inter-stream synchronization: deals with the temporal relationships which may exist between different streams, often called lip synchronization.
For example, in videoconferencing, the information is created in real time, e.g., by a camera, and should be consumed in real time, e.g., by a monitor. The data produced is valid for a period of time, and once this time has elapsed the data often becomes useless. However, many of these applications may tolerate some bounded imprecision (a received image is still useful even if some pixels are lost). Furthermore, if some shift occurs between the receipt of the sound and the image due to missed transaction deadlines, and if one can control this shift, i.e., the deadline overruns, the scene will still be exploitable. Therefore, in such applications the ε-data and ∆-deadline notions may be very useful. Indeed, an image where some pixels are lost can be considered as an ε-data. On the other hand, when a shift occurs between the arrivals of the audio and video streams (lip synchronization), we can consider that the transactions which transport these streams have ∆-deadlines, where ∆ is fixed. DBMSs on which such applications rely must be designed differently from traditional ones. Indeed, in traditional DBMSs, the most important problem to solve is to allow transactions to execute concurrently while ensuring the logical consistency of the database. Transactions are then assigned the ACID properties.
In real-time DBMSs, these properties, notably the isolation property (i.e., serializability), might be relaxed in order to deal with imprecise data. In some specific real-time applications such as multimedia applications, not only data might be imprecise but transaction deadlines might be relaxed too; that is, transactions are allowed to execute beyond their deadlines. The introduction of the ε-data and ∆-deadline notions leads us to modify the traditional transaction compatibility matrix in order to deal with imprecise data, and to modify traditional scheduling algorithms in order to deal with transaction deadline overruns. This is the purpose of the next sections.

5. The ε-∆ concurrency control and scheduling protocol

5.1. Concurrency-control techniques

In DBMSs, transaction concurrency-control techniques are mainly based on two policies:
- optimistic techniques: in the systems where these techniques are implemented, all transactions that want to access the database are allowed to do so, but write operations are delayed until the transaction-validation step. Before its commit, every transaction must pass a certification test in order to detect whether it conflicts with transactions that have already committed;
- pessimistic techniques: algorithms based on these techniques avoid the execution of concurrent transactions when potential conflicts exist. This method is called pessimistic because its underlying hypothesis is that any pair of transactions that execute concurrently may potentially conflict; one of the two transactions waits until the validation of the other.

Our ε-∆ protocol used to resolve transaction conflicts may be considered as a tradeoff between pessimistic and optimistic techniques. Indeed, it is pessimistic because it uses blocking techniques to resolve conflicts; on the other hand, it is optimistic since it allows some transactions, a priori incompatible, to execute simultaneously under certain conditions.

5.2. Compatibility matrix

When using the ε-data and ∆-deadline notions, the isolation level tolerated is not maximal. The isolation level may correspond to Repeatable Read of the SQL-92 standard; that is, each read (respectively write) transaction may access a data item while another transaction is writing (respectively reading) it. Therefore, the lock management used by transactions is somewhat different from the one used in classical DBMSs: a transaction T which wants to get a lock on a data item already locked by another transaction can obtain it under certain conditions (see cases C1 and C2 below).
We describe, in table 1, a compatibility matrix between query transactions Q and update transactions U (composed of two parts: R, which reads the current data value, and W, which writes a new data value).


Table 1. Lock compatibility table

         Q     Ru    Wu
   Q     OK    OK    C1
   Ru    OK    OK    X
   Wu    C2    X     X

In table 1, a transaction on a column corresponds to the one that holds a lock on an object, and a transaction on a row corresponds to the one that requests a lock on the same object. Three cases are possible: cases 'X', where a concurrent execution is not tolerated; cases 'OK', where no conflict occurs; and cases C1 and C2, which correspond to transactions that may hold locks under certain conditions.
- Case C1: if a write transaction holds a lock on an object O and a query transaction requests the lock on O, it can get it, provided that the quantity of inconsistency it will introduce is less than a specified bound ε and its extended deadline is not missed.
- Case C2: if a query transaction Q holds a lock on O and a write transaction W requests the lock on O, it can get it, provided its extended deadline is not missed and the sum of the inconsistency quantities, introduced by W and by the other transactions that request to write-lock O, is less than a specified limit ε.
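As an illustration, the matrix of table 1 can be encoded as a lookup keyed by (requesting lock, held lock); this is a sketch in our own notation, not code from the paper:

```python
# Lock compatibility from table 1: COMPAT[requester][holder].
# "OK" = compatible, "X" = incompatible,
# "C1"/"C2" = compatible only under the epsilon/deadline conditions above.
COMPAT = {
    "Q":  {"Q": "OK", "Ru": "OK", "Wu": "C1"},
    "Ru": {"Q": "OK", "Ru": "OK", "Wu": "X"},
    "Wu": {"Q": "C2", "Ru": "X",  "Wu": "X"},
}

def lock_decision(requester, holder):
    """Return the table 1 entry for a lock request against a held lock."""
    return COMPAT[requester][holder]

assert lock_decision("Q", "Wu") == "C1"   # query requests, writer holds
assert lock_decision("Wu", "Q") == "C2"   # writer requests, query holds
assert lock_decision("Wu", "Wu") == "X"   # two writers never share a lock
```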

6. Protocol description

6.1. Concurrency control and scheduling

The concurrent execution of transactions is controlled by a transaction manager using the scheme described in table 1. Unlike traditional transaction managers, it allows two transactions which are a priori incompatible (read/write or write/read) to be executed simultaneously. Indeed, it integrates a mechanism that controls the possible inconsistency introduced in the database. The concurrency-control protocol may be a blocking protocol (for example 2PL-HP [1], an extension of 2PL to the real-time database context), which is enhanced to control inconsistencies. In this case, the control takes place when the conflict occurs. Besides, the transaction manager cooperates with the operating-system scheduler in order to eliminate useless transactions (firm transactions that miss their initial deadlines or soft transactions that miss their ∆-deadlines). Intuitively, this protocol allows better performance than existing protocols [6, 10]; that is, more transactions will meet their deadlines. Indeed, it incorporates two additional cases (C1 and C2 locking conditions) that allow conflicting transactions to execute concurrently (see table 1).

6.2. Algorithms

We assume that the system has a single processor with a resident database and that all transactions are periodic and are scheduled with the Earliest Deadline First (EDF) algorithm [5]. Moreover, the transaction model is the update-in-workspace one; that is, before a transaction commits, it reads and updates data items in its private workspace: data items are written into the database only if the transaction commits. With the ε-∆ protocol, when a conflict occurs, the two conflicting transactions may continue their execution provided, on one hand, that the amount of inconsistency introduced by the conflict is lower than ε; on the other hand, if a transaction's initial deadline expires before it commits, it is allowed to resume its execution until the initial deadline is overrun by a quantity of time ∆. We use the following notations:
- Wε∆: the write part of an update transaction Uε∆;
- Qε∆: a query transaction;
- val_to_write: a value to be written by a Wε∆ transaction which conflicts with Qε∆ transactions;
- val_to_read: a value to be read by a Qε∆ transaction which conflicts with Wε∆ transactions;
- ε(d): the quantity of imprecision tolerated on the value v of a data item d; tolerated values belong to the interval [v - ε, v + ε];
- εw(d): the quantity of imprecision introduced by an update transaction that wants to write-lock a data item d already read-locked by a query transaction;
- deadl(Ti): the initial deadline of Ti;
- ∆-deadline(Ti): the initial deadline of Ti extended by a quantity of time ∆.

6.2.1. Locking conditions

We recall that the compatibility matrix (see table 1) shows two special cases:
- case C1 concerns a situation where a data item d is write-locked by a Wε∆ transaction and a query transaction Qε∆ requests to read-lock d.
Instead of blocking or aborting the Qiε∆ transactions as in classical protocols, in our protocol all Qiε∆ transactions pursue their execution concurrently with Wε∆, provided that the difference between the value written by Wε∆ and the value read by the Qiε∆ transactions does not exceed ε. Otherwise, if the value written by Wε∆ is out of range, the transaction manager blocks all the Qiε∆ transactions that conflict with Wε∆ and whose ∆-deadlines are greater than that of Wε∆. Hence, when a Qiε∆ transaction is awakened by the manager (because Wε∆ commits or misses its deadline), the Qiε∆ deadline may not yet have expired. At this instant, the execution interval of the Qiε∆ transactions is relatively large, hence increasing the number of Qiε∆ transactions that meet their deadlines.
- case C2 concerns a situation where a data item d is read-locked by a Qε∆ transaction and an update transaction Wε∆ requests to write-lock d. To handle this case, in our protocol all Wiε∆ transactions pursue their execution provided the sum of the fluctuations introduced by these transactions does not exceed ε. Otherwise, as soon as a fluctuation introduced by a Wiε∆ transaction makes the sum greater than ε, that transaction is blocked, provided its ∆-deadline is greater than that of Qε∆. Hence, when a Wiε∆ transaction is awakened by the manager (because Qε∆ commits or misses its deadline), the ∆-deadline of Wiε∆ may not yet have expired. At this instant, the execution interval of Wiε∆ is relatively large. This protocol hence increases the number of Wiε∆ transactions that meet their deadlines.

6.2.2. Scheduling algorithm

Due to the notion of a ∆-deadline associated with transactions, the EDF scheduling protocol is based on two deadline levels:
- initial-deadline (0-deadline) level: the scheduling protocol works as a classical EDF algorithm; that is, the transaction with the lowest deadline is assigned the processor;
- extended-deadline (∆-deadline) level: this is the originality of our EDF algorithm. It allows a transaction which is close to its initial deadline to be reconsidered with its ∆-deadline by the EDF algorithm instead of being rejected. The transaction is then re-queued in the ready-transactions list in the appropriate order (see figure 1).
Note that the EDF algorithm privileges the initial deadlines of transactions because the ideal QoS of the application (especially in multimedia applications) is reached when all transactions commit before their initial deadlines.
Otherwise, an acceptable QoS is obtained when transactions execute before their ∆-deadlines.

[Figure 1 depicts the cooperation between the scheduler and the transaction manager: arriving transactions are placed in the ready-transactions queue; a transaction that may miss its strict deadline can be placed in the blocked-transactions queue; a transaction that has reached neither its strict nor its extended deadline is dispatched to the processor; transactions that miss their extended deadlines are removed by the transaction manager.]

Figure 1. The scheduler-manager cooperation

The following extended EDF algorithm cooperates with the transaction manager in order to grant locks to transactions, and to commit or abort transactions according to the situation that occurs. This cooperation is done through procedure entries, presented in the transaction-manager algorithm.

EDF_Scheduler (transactions)
/* let t be the current instant and active_transaction the transaction currently processed by the CPU */
/* the variable deadline(T) stores the deadline of T, which may correspond to the initial or to the extended deadline */
/* the procedures called in this algorithm are described in the transaction-manager algorithm */
BEGIN
  ∀ Ti an arrived transaction:
    insert Ti in the ready queue with deadline(Ti) := deadl(Ti);
  IF no active transaction THEN            /* no transaction processed by the CPU */
    active_transaction := head(ready queue);
  ENDIF
  IF active_transaction requests a lock on a data item d THEN
    transaction_manager.Allocate(active_transaction, d, access mode);
  ENDIF
  IF active_transaction is finished THEN   /* the transaction must commit */
    transaction_manager.Commit(active_transaction);
  ENDIF
  IF ∃ T : deadl(T) = t THEN               /* the initial deadline of T is reached */
    deadline(T) := ∆-deadline(T);          /* the deadline of T is extended by ∆(T) */
  ENDIF
  IF ∃ T : ∆-deadline(T) = t THEN          /* the extended deadline of T is reached */
    transaction_manager.Abort(T);
  ENDIF
  Let T = head(ready queue);
  IF deadline(active_transaction) > deadline(T) THEN
    insert active_transaction in the ready queue;
    active_transaction := T;
  ENDIF
END
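The two-level deadline handling of this scheduler can be sketched as a small executable model; the class and function names below are ours, under the assumption that a transaction is re-queued once with its ∆-deadline and dropped (aborted) after the extended deadline passes:

```python
import heapq

class Transaction:
    """Minimal transaction record for the sketch (names are ours)."""
    def __init__(self, name, deadline, delta):
        self.name = name
        self.initial_deadline = deadline   # deadl(T)
        self.delta = delta                 # ∆(T)
        self.deadline = deadline           # current scheduling deadline
        self.extended = False              # True once re-queued with its ∆-deadline

def edf_step(ready, now):
    """Return the next transaction to run under the two-level EDF policy.
    A transaction whose initial deadline has passed is re-queued once with
    its ∆-deadline; one whose extended deadline has also passed is dropped."""
    while ready:
        deadline, _, t = heapq.heappop(ready)
        if deadline > now:
            return t                       # earliest-deadline schedulable transaction
        if not t.extended:
            # initial deadline reached: extend to deadl(T) + ∆(T) and re-queue
            t.extended = True
            t.deadline = t.initial_deadline + t.delta
            heapq.heappush(ready, (t.deadline, id(t), t))
        # else: extended deadline also missed -> transaction is dropped (aborted)
    return None
```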

6.2.3. Transaction manager algorithm

The transaction manager (TM) is called by the scheduler in three cases:
- when a transaction attempts to hold a lock on a data item; in this case, TM serves the transaction according to the situations described in table 1,
- when a transaction commits, or
- when a transaction misses its deadline.
TM activates possibly waiting transactions either when a transaction commits or when it misses its deadline.

Entry point: Allocate(transaction_T, d, access mode)

Case C1: a conflict of type C1 occurs when a data item d is write-locked by a transaction Wε∆ and a query transaction Qε∆ requests to read-lock it.
IF transaction_T = Qε∆ THEN
BEGIN
  IF (ε(d) > 0) and (val_to_write(d) ∈ [val_to_read(d) - ε(d), val_to_read(d) + ε(d)]) THEN
    /* ε-data */
    Qε∆ pursues its execution concurrently with Wε∆;
  ELSE  /* ε = 0, or ε-data with a value out of range */
    IF ∆-deadline(Wε∆) < ∆-deadline(Qε∆) THEN
      insert Qε∆ in the blocking queue;
    ELSE
      transaction_manager.Abort(Wε∆);
    ENDIF
  ENDIF
END
ENDIF

Case C2: let d be a data item read-locked by a transaction Qε∆ and Wε∆ an update transaction that requests to write-lock d.
IF transaction_T = Wε∆ THEN
BEGIN
  Let εw = |val_to_write - val_to_read| / |val_to_read|;
  sum_eps := sum_eps + εw;  /* sum_eps is set to 0 when Qε∆ starts its execution */
  IF (ε(d) > 0) and (sum_eps ≤ ε(d)) THEN
    Wε∆ pursues its execution concurrently with Qε∆;
  ELSE
    IF ∆-deadline(Qε∆) < ∆-deadline(Wε∆) THEN
      insert Wε∆ in the blocking queue;
    ELSE
      transaction_manager.Abort(Qε∆);
    ENDIF
  ENDIF
END
ENDIF

Case OK: since the requested lock is shared, it is granted to transaction_T; hence, transaction_T pursues its execution.

Case X: forbidden cases.
IF (transaction_T = Wε∆) and (∃ Tw' (active or ready) that write-locks d) THEN
  IF ∆-deadline(transaction_T) > ∆-deadline(Tw') THEN
    insert transaction_T in the blocking queue;
  ELSE
    transaction_manager.Abort(transaction_T);
  ENDIF
ENDIF

Entry point: Commit(transaction_T)  /* when a transaction commits */
BEGIN
  IF transaction_T = Wε∆ THEN
    ∀ Qiε∆ ∈ blocking queue (due to a conflict of type C1):
      insert Qiε∆ in the ready queue;  /* TM unblocks Qiε∆ */
  ENDIF
  IF transaction_T = Qε∆ THEN
    ∀ Wiε∆ ∈ blocking queue (due to a conflict of type C2):
      insert Wiε∆ in the ready queue;  /* TM unblocks Wiε∆ */
  ENDIF
END

Entry point: Abort(transaction_T)  /* when a transaction is aborted */
BEGIN
  IF transaction_T = Wε∆ THEN
    Abort_transaction(Wε∆);
    ∀ Qiε∆ ∈ blocking queue (due to a conflict of type C1):
      insert Qiε∆ in the ready queue;  /* TM unblocks Qiε∆ */
  ENDIF
  IF transaction_T = Qε∆ THEN
    Abort_transaction(Qε∆);
    ∀ Wiε∆ ∈ blocking queue (due to a conflict of type C2):
      insert Wiε∆ in the ready queue;  /* TM unblocks Wiε∆ */
  ENDIF
END

Abort_transaction(T) is a procedure that extracts transaction T from its queue and releases all locks held by T.
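The Commit and Abort entry points both end by moving the transactions blocked on the finished transaction back to the ready queue. A minimal sketch of that unblocking step (the data structures and names are our assumption, not the paper's):

```python
from collections import deque

def unblock_waiters(blocked, ready, finished_kind):
    """When a transaction finishes (commit or abort), move the transactions
    blocked by the corresponding conflict type back to the ready queue.
    finished_kind "W" wakes C1-blocked queries; "Q" wakes C2-blocked writers."""
    conflict = "C1" if finished_kind == "W" else "C2"
    still_blocked = deque()
    while blocked:
        t, reason = blocked.popleft()
        if reason == conflict:
            ready.append(t)               # TM unblocks the waiter
        else:
            still_blocked.append((t, reason))
    blocked.extend(still_blocked)         # the rest remain blocked
```

For example, when a Wε∆ transaction commits, only the queries blocked by a C1 conflict are re-queued; writers blocked by a C2 conflict keep waiting on their query.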

7. An illustrative example

Let d = 15 be a data item in the database, with ε(d) = 0.3 (i.e., 30% of the current value of d). The example of figure 2 illustrates two situations:
- a conflict of type C2, which occurs when a query transaction TQ1 read-locks d and, during its execution, three update transactions denoted TW1, TW2 and TW3 with higher priorities preempt TQ1 and request to write-lock d with values 16, 18 and 20 respectively;
- a conflict of type C1, which occurs when the update transaction TW3 write-locks d and, during its execution, two query transactions TQ2 and TQ3 with higher priorities preempt TW3 and want to read-lock d.

The initial deadlines of the transactions are set respectively to deadl(TQ1) = 6, deadl(TW1) = 6, deadl(TW2) = 7, deadl(TW3) = 11, deadl(TQ2) = deadl(TQ3) = 12, and let ∆ be equal to 1 for all transactions; that is, each transaction is allowed to pursue its execution beyond its initial deadline by 1 time unit (in practice, it would be better to set ∆ to a percentage of each transaction's duration). We obtain:
- at time 0, the variable sum_eps, which accumulates the fluctuations εWi for data item d, is set to 0, and the query transaction TQ1 begins to execute. During its execution, TQ1 requests to read-lock d. It can do so because no transaction is accessing d. Therefore, TQ1 reads the value 15.
- at time 1, an update transaction TW1 with a higher priority than that of TQ1 preempts TQ1 and begins to execute. During this time, TW1 requests to write-lock d, so a conflict of type C2 occurs. Hence, TM computes the fluctuation brought by TW1, εW1(d) = (16-15)/15 ≈ 0.06 (6%), and checks whether sum_eps + εW1(d) = 0.06 is ≤ ε(d). As 0.06 < 0.30, TW1 is allowed to write-lock d and to resume its execution.
- at time 2, another update transaction TW2, with a higher priority than that of TW1, preempts TW1. During its execution, TW2 requests to write-lock d. The principle in case C2 is to accumulate the fluctuations brought by each write transaction (actually, each update transaction) as long as the read transaction has not committed (or aborted). TW2 is allowed to write-lock d because sum_eps + εW2(d) = 0.06 + 0.2 = 0.26 ≤ ε(d) = 0.30.
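The C2 arithmetic of this example can be checked mechanically. The sketch below assumes, as the C2 rule prescribes, that a write request is granted only while the accumulated fluctuation stays within ε(d) = 0.3; the function name is ours:

```python
def fluctuation(current, new):
    """Relative fluctuation εw = |val_to_write - val_to_read| / |val_to_read|."""
    return abs(new - current) / abs(current)

eps_d = 0.30        # ε(d): 30% of the current value of d
val_read = 15       # the value read by TQ1
sum_eps = 0.0
for proposed in (16, 18):              # TW1 then TW2 request write locks on d
    ew = fluctuation(val_read, proposed)
    assert sum_eps + ew <= eps_d       # both requests are granted
    sum_eps += ew
# TW3 proposes 20: its fluctuation 5/15 ≈ 0.33 would push the sum past ε(d),
# so under the C2 rule this third write request cannot be granted
assert sum_eps + fluctuation(val_read, 20) > eps_d
```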