ADM: An Active Deductive XML Database System

June 19, 2017 | Autor: Guillermo Luna | Categoria: Semantic Web, XML Database, Active Database, Conceptual Model, ECA Rules, Distributed Coordination
Share Embed


Descrição do Produto

ADM: An Active Deductive XML Database System Oscar Olmedo-Aguirre, Karina Escobar-V´ azquez, Giner Alor-Hern´ andez, and Guillermo Morales-Luna Computer Science, CINVESTAV-IPN, Av. IPN 2508, Mexico City, Mexico, {oolmedo,gmorales}@cs.cinvestav.mx, {kescobar,galor}@computacion.cs.cinvestav.mx

Abstract. As XML is becoming widely accepted as a mean of storing, searching and extracting information, a larger number of Web applications will require conceptual models and administrative tools to organize their collections of documents. Recently, event-condition-action (ECA) rules have been proposed to provide reactive functionality into XML document databases. However, logical inference mechanisms to deliver multiagent-based applications remain unconsidered in those models. In this paper, we introduce ADM, an active deductive XML database model that extends XML with logical variables, logical procedures and ECA rules. ADM has been partially implemented in an open distributed coordination architecture written in Java. Besides of coupling the rational and reactive behavioral aspects into a simple and uniform model, a major contribution of this work is the introduction of sequential and parallel rule composition as an effective strategy to address the problem of scheduling rule selection and execution. Keywords: XML, Semantic Web, Deductive Databases, Active Databases.

1

Introduction

The motivation behind this work is in the increasing use of the Extensible Markup Language, XML, as a mean of storing structured information that cannot be conveniently stored using the available database technology. Examples of handling XML document collections are recent Web-based recommendation systems that have increasingly adopted the publish/subscribe model to disseminate relevant new information among an users community. Actually, in the active database community it has been proposed to extend the persistent storage functionality of databases to support rule production systems [2]. By extending the relational database model with expert systems technology, a number of applicationindependent tasks can be undertaken by the Active Database Management System, active DBMS, e.g. enforcement of data integrity constraints and data protection, versioning maintenance and data monitoring for knowledge discovery and acquisition. Recently, XML active DBMS have been suggested to mirror major results of active DBMS to the XML document management domain [1]. R. Monroy et al. (Eds.): MICAI 2004, LNAI 2972, pp. 139–148, 2004. c Springer-Verlag Berlin Heidelberg 2004 

140

O. Olmedo-Aguirre et al.

Another substantial amount of research has been devoted to deductive database systems [6]. Those systems are similar to relational database systems in that both are passive, responding only to user queries. Deductive DBMS extend a relational DBMS with a Prolog-like inference engine to answer possibly recursive queries in order to derive conclusions logically entailed in the database content. As the expressive power of both the query model and the language becomes richer then the deductive DBMS is a more convenient mean to describe the complex conditions used in production rules. The DARPA Agent Markup Language, DAML [4], has been developed as an extension of XML by using ontologies to describe objects and their relations to other objects. We follow, instead, a theorem prover approach. In this work, we introduce ADM to address the problem of coupling XML active databases with deductive databases to support multiagent-based applications. The adopted approach consists on extending XML with logical variables, logical procedures and ECA rules. The extensions allow a user to annotate XML documents with logical constraints among other related features to enforce integrity. Event annotations allow an user to describe where the insertion or deletion events take place, making available the content of the document involved to test the integrity constraints. The experimental system developed so far can be considered an open distributed coordination architecture that enables thirdparty applications to use the already created XML documents. It provides basic language constructs to coordinate document content flow in Web-based applications. ADM is built upon LogCIN-XML, an experimental deductive XML database developed by the first two authors of the current paper, that in turn has been written in Prolog-Cafe [3], a WAM-like Prolog engine that sits on the Java Virtual Machine. While the Prolog-Cafe provides the basic automated inferencing capabilities, a set of Java servlets provide the core functionality for coordination and communication among clients and servers.

2

Extensional Databases

XML extensional databases are collections of standard XML documents maintained on a distributed directory structure across the Internet. ADM preserves the open character of extensional databases by keeping apart language extensions from the data created by third-party applications. From a logic programming point of view, XML elements roughly correspond to ground terms. Nonetheless, XML elements have a more complex structure than those of first order logical languages due to the presence of a list of attribute-value pairs. Assuming that such a list is in ascending order by the name of the attribute, the structure of an XML element can be rewritten into a syntactically equivalent Prolog term. As an example, Table 1 shows a collection of XML documents containing family relations information in which all documents share the same simple structure. The root element fatherof contains two child elements father and son describing the role of the participants. Thus, bob and tom are two persons related by the fatherof relation, having the roles father and son respectively.

ADM: An Active Deductive XML Database System

141

Table 1. A family relations extensional XML database. fatherof father bob/father son joe/son/fatherof fatherof father bob/father son tom/son/fatherof fatherof father joe/father son doe/son/fatherof

3

Intentional Databases

Intentional databases are collections of logical procedures and ECA rules called programs. The reason for maintaining language extensions isolated from data is twofold: firstly, to create a low-level conceptual layer of the fundamental concepts that underlie the application, and secondly, to create an upper-level conceptual layer to describe the integrity constraints and consistency-preserving rules that can be logically derived from the extensional database contents. XML elements are extended with logical variables to introduce the notion of XML terms. Logical variables are used to define the integrity constraints imposed on the document contents and to formulate queries to bind values to variables. Logical variables can be of either type string (prefixed by a ’$’) or term (prefixed by a ’#’) and they may occur in elements and in attributes. The function var : XMLTerm → P XMLVariable is defined to obtain the set of variables occurring in a term (here P stands for the power set operator). Integrity constraints are introduced by means of logical procedures. A logical procedure has the form b b1 = T1 · · · bn = Tn B1 · · · Bm /b comprising a head and a body. The head b b1 = T1 · · · bn = Tn  consists of a procedure name b and a list of parameter names b1 · · · bn associated with their respective terms T1 · · · Tn . The body B1 · · · Bm consists of a sequence of procedure calls. A procedure call has either the form b b1 = S1 · · · bn = Sn / or the form b b1 = S1 · · · bn = Sn A1 · · · Ak /b where S1 · · · Sn are the XML terms passed to the procedure and A1 , . . . , Am are XML terms. The former case corresponds to a call to a defined procedure, whereas the latter case corresponds to a consult to the extensional database. In any case, unification is used for parameter passing, and equality comparison between documents with complex structure. The consistency-preserving and reactive behavior are described by ECA rules. Any ECA rule has the structure given in Table 2. When a collection of XML documents changes by inserting or deleting a document, an event E occurs, and then one or more rules may be selected. In this case, condition C is checked and if it is satisfied within the database, then the actions of deleting B or inserting A the argument document are executed. Ordering of actions can be defined by composing rules R either sequentially or in parallel. As an example of an intentional database, Table 3 shows the logical procedures fatherof and cousinof and the ECA rule partyInvitation. The logical procedures describe the father-of and cousin-of relations between two persons, called in the procedure the subject and the object, respectively. In the logical

142

O. Olmedo-Aguirre et al. Table 2. ECA rule structure. rule on E/onif C/if do  delete B/deleteopt insert A/insertopt seq R/seqopt par R/paropt /do /rule

procedure brotherof, logical variables $X and $Y are respectively associated to parameter names subject and object. The call to the system-defined procedure notequal tests whether the ground terms bound to variables $X and $Y are not identical. Table 3 also shows the ECA rule partyInvitation that notifies the good news to all cousins of having a new member. Table 3. A family relations document in the XML intensional database. brotherof subject="$X" object="$Y"  notequal subject="$X" object="$Y" / fatherof father $Z/fatherson $X/son/fatherof fatherof father $Z/fatherson $Y/son/fatherof /brotherof cousinof subject="$X" object="$Y"  fatherof father $Z1/fatherson $X/son/fatherof fatherof father $Z2/fatherson $Y/son/fatherof brotherof subject="$Z1" object="$Z2" / /cousinof rule name="partyInvitation" on insert  fatherof father $X/fatherson $Y/son/fatherof /insert/on if cousinof subject="$Y" object="$Z" //if do insert  invitation host $Y/hostguest $Z/guest/invitation /insert/do/rule

3.1

Rule Activation

A rule is activated when a new document is inserted. Table 4 shows the new document that has the structure defined by the rule’s event section. An alerter agent notifies the system that an insertion event occurs. The alerter sends the document content to the ADM rule manager to select an appropriate rule.

ADM: An Active Deductive XML Database System

143

Table 4. Inserted XML document. ··· fatherof father tom/father son tim/son/fatherof

3.2

Rule Response

After ADM receives the document involved in the event from the alerter, it selects a rule and checks the condition. If condition holds, the rule is executed. Table 5. Document generated by rule partyInvitation. ··· invitation host tim/host guest doe/guest/invitation

In the rule, the event section uses a document template to retrieve the required information. The logical variables $X and $Y are used to extract the pieces of information that match their position with the event document. A substitution that include the bindings $X = "tom" and $Y = "tim" is applied to the rule to obtain an instance of the rule. Then, a solution $Z = "doe" of the query cousinof subject="$Y" object="$Z" / is deduced. As the query succeeds, the action of inserting the document shown in Table 5 is performed.

4 4.1

Operational Semantics Basic Definitions

In this section a few basic definitions for XML terms are given to formally introduce the operational semantics of ADM. The XML term a a1 = T1 · · · am = Tm  · · · /a is normalized if its list of attribute-value pairs is lexicographically ordered by the name of the attribute in increasing order: ai ≤ aj if i ≤ j, i, j ∈ {1, . . . , m}. The set of substitutions Σ consists of the partial functions XMLVariable → XMLTerm defined recursively in Table 6. Computed answers to queries are obtained by the composition of substitutions. The null substitution  is the identity for substitution composition σ = σ = σ. A ground substitution does not introduce any variable. The natural extension σ : XMLTerm → XMLTerm of a substitution defined over variables to XML terms is denoted by the same name. The instance of an XML term x under substitution σ is denoted by xσ and its inductive definition on the structure of the XML term is given in Table 6. Instances of XML terms under ground substitutions are called ground instances. An unifier σ for the XML terms T and S is a substitution that produces syntactically identical instances Tσ = Sσ. The most general unifier (mgu) is an

144

O. Olmedo-Aguirre et al. Table 6. Variable substitution on XML terms.

a a1 = T1 · · · am

xσ = x x ∈ XMLString ∪ XMLText xσ = σ(x) x ∈ XMLVariable = Tm /σ = a a1 = T1 σ · · · am = Tm σ /

a a1 = T1 · · · am = Tm  a a1 = T1 σ · · · am = Tm σ ···T··· σ = · · · Tσ · · · /a /a

unifier that cannot be obtained as the composition of any other. The unification algorithm, shown in Table 7, essentially corresponds to that introduced by Martelli-Montanari [7], producing either a most general unifier if it exists or a failure otherwise. Table 7. Unification of XML terms. a a1 = S1 . . . am = Sm  A = E1 · · · En /a

a a1 = T1 . . . am = Tm  B = F1 · · · Fn /a

(C ∪ {A=B}, σ)  (C ∪ {S1 = T1 , . . . , Sm = Tm , E1 = F1 , . . . , En = Fn }, σ) A = a a1 = S1 . . . am = Sm / B = a a1 = T1 . . . am = Tm / (C ∪ {A=B}, σ)  (C ∪ {S1 = T1 , . . . , Sm = Tm }, σ) (C ∪ {T = T}, σ) (C ∪ {T = x}, σ) (C ∪ {x = T}, σ) (C ∪ {x = T}, σ)

   

(C, σ) (C ∪ {x = T}, σ) (Cσ, σ{x → T}) x ∈ / vars(T) failure x ∈ vars(T)

({T = S}, ) ∗ (, σ) mgu(T, S) = σ

The most general unifier is defined as the reflexive and transitive closure of the binary relation  ⊆ (XMLTerm × Σ) × (XMLTerm × Σ). Unification of XML terms provides an uniform mechanism for parameter passing, construction and selection of the information contained in the XML document. 4.2

Deductive Database Model

The meaning of a logical procedure is given by the interpretation of an SLDresolution1 step of a procedure call [5]. Conversely, the declarative reading of a logical procedure establishes the validity of the call to the procedure head whenever all the conditions forming the procedure body are valid. An SLD resolution step of the logical procedure b b1 = T1 · · · bn = Tn B1 · · · Bm /b in the query query A1 · · · An /query is obtained by replacing an instance of the body 1

Linear resolution for Definite programs with Selection rule

ADM: An Active Deductive XML Database System

145

B1 σ  · · · Bm σ  under the most general unifier σ of the first call A1 in the query and the head b b1 = T1 · · · bn = Tn / of the procedure. Table 8. SLD inference relation. b b1 = T1 · · · bn = Tn B1 · · · Bm /b ∈ XMLProgram ∃σ  . σ  = mgu(head(A1 ), b b1 = T1 · · · bn = Tn ) (query A1 A2 · · · An /query, σ) ⇒ (query B1 σ  · · · Bm σ  A2 σ  · · · An σ  /query, σσ  )

An SLD resolution step is thus defined as the relation ⇒⊆ (XMLTerm × Σ) × (XMLTerm × Σ) that transforms the query (query A1 A2 · · · An /query, σ) into the query (query B1 σ  · · · Bm σ  A2 σ  · · · An σ  /query, σσ  ) by replacing the procedure call A1 σ  by the instance of the procedure body B1 σ  · · · Bm σ  under σ  . Table 9. Correctness of SLD inference. (query C1 · · · Ck /query, ) →∗ (query  /query, σ) |=σ query C1 · · · Ck /query

The computed answer of a query consists of zero, one or more SLD resolution steps beginning from the initial query. The query execution terminates when there are no more goals to be solved. The computed answer is obtained from the composition of the substitutions used in each resolution step. Table 6 states that the computed answer is indeed the correct answer, i.e. the substitution that is the solution of the query. A well known result from logic programming [5] asserts that a computed answer σ obtained by the SLD resolution calculus is a model for both the program P and the query query C1 · · · Ck /query. Therefore, the computed answers are those that satisfy the logical constraints of the program. 4.3

Active Database Model

ECA rules integrate the event-directed rule processing into the knowledgebased model of deductive databases. The ECA rule model closely follows the perception-deduction-action cycle of multiagent-based systems. The operational semantics of rule execution, given in Table 10, defines the reduction relation −→⊆ (M XMLTerm × Σ) × (M XMLTerms × Σ ∪ M XMLTerms) where M XMLTerm denotes multisets of XML terms, including logical procedures and ECA rules. After observing event E, if there is a computed answer σ  for both the extended program with the event P ∪ {E} and the condition query C1 · · · Ck /query, then the collection of documents in both the extensional and intensional databases are modified by deleting the instance of documents B1 σ  , . . . , Bn σ  and inserting the instance of documents A1 σ  , . . . , Am σ  under σ  . The difference and union ⊕ operators for multisets of documents are used instead of the corresponding operators over sets due to the importance of documents multiplicity. The order in which documents are deleted or inserted is

146

O. Olmedo-Aguirre et al. Table 10. Reduction relation of active rules. rule  on E/on if C1 · · · Ck /if do  delete B1 · · · Bm /delete insert A1 · · · An /insert /do /rule

∈P

∃σ  ∈ Σ.P ∪ {E} |=σ query C1 · · · Ck /query (P, σ) −→ (P {B1 σ  , . . . , Bn σ  } ⊕ {A1 σ  , . . . , Am σ  }, σσ  )

not specified. By using instance of documents under substitutions, the semantics captures the propagation of values among logical variables across documents.

Table 11. Termination condition. rule  on E/on if C1 · · · Ck /if ∈ P do  · · · /do /rule ¬∃σ  ∈ Σ.P ∪ {E} |=σ query C1 · · · Ck /query (P, σ) −→ P

The operational semantics for the termination condition of the execution of a rule is given in Table 11. After observing the event E that triggers a rule, the rule does not execute if the current contents of the XML databases does not entail the rule condition. In this case, the reduction relation −→ leads to the program singleton P . Though ECA rules are powerful, an unpredictable behavior may arise since the rules may interfere with each other preventing their execution. In order to reduce the non-determinism in the ordering of actions, the sequential and parallel composition of rules are introduced to schedule their execution. The operational semantics of sequential composition is given in Table 12. The first rule describes the termination condition of two rules under sequential composition. If rule P terminates at state σ, then sequential composition of rules P and Q behaves like Q at the same state σ. The second rule describes the progress condition of two rules under sequential composition. If at state σ rule P reduces to rule P  , then the sequential composition of P and Q reduces to the sequential composition of P  and Q.

ADM: An Active Deductive XML Database System

147

Table 12. Reduction relation for the sequential composition of rules. (P, σ) −→ P (seq P Q/seq, σ) −→ (Q, σ) (P, σ) −→ (P  , σ  ) (seq P Q/seq, σ) −→ (seq P  Q/seq, σσ  )

Table 13 shows the rules of parallel composition of programs. The function docset : M XMLTerm×Σ → M XMLTerms gets the multiset consisting of document instances (possibly including logical variables) under a given substitution. The first two rules apply only if they share a non-empty multiset of documents. In that case, either rule P or Q may develop its behavior but not simultaneously. Table 13. Reduction relation for parallel composition of rules. (P, σ) −→ (P  , σ  ) docset(P, σ) ∩ docset(Q, σ  ) = ∅ (par P Q/par, σ) −→ (par P  Q/par, σσ  ) (Q, σ) −→ (Q , σ  ) docset(P, σ) ∩ docset(Q, σ  ) = ∅ (par P Q/par, σ) −→ (par P Q /par, σσ  ) (P, σ) −→ (P  , σ1 ) (Q, σ) −→ (Q , σ2 ) docset(P, σ) ∩ docset(Q, σ) = ∅ (par P Q/par, σ) −→ (par P  Q /par, σσ1 σ2 ) (P, σ) −→ P (Q, σ) −→ Q (par P Q/par, σ) −→ P ∪ Q

Third rule for parallel composition applies only if rules P and Q do not share a collection of documents. In that case, if both rules independently exhibit some progress, then under parallel composition they do not interfere. Finally, last rule describe the termination condition under parallel composition of rules. If two programs P and Q terminate at state σ, the they also terminate under parallel composition at the same state.

5

Conclusions

A simple and uniform model for active and deductive XML databases has been proposed in this paper. The ADM language extends the XML language by introducing logical variables, logical procedures and ECA rules. An experimental distributed system with layered architecture has been implemented that maintains the openness of the XML data by keeping apart the language extensions. As future work, we plan to develop analysis methods and tools to face the most difficult aspects of controlling and predicting rule execution.

148

O. Olmedo-Aguirre et al.

References 1. J. Bailey, A. Poulovassilis, P. T. Wood, Analysis and optimization of eventcondition-action rules on XML. Computer Networks, 39:239-259. 2002. 2. N. W. Paton, O. Diaz, Active database systems. ACM Computing Surveys, 31(1): 64-103. 1999. 3. M. Bambara, N. Tamura, Translating a linear logic programming language into Java. In Proceedings of ICLP’99 Workshop on Parallelism and Implementation Technology for (Constraint) Logic Programming Languages. pp. 19-39, Dec 1999. 4. The DARPA Agent Markup Language Homepage, http://www.daml.org/ 5. J. W. Lloyd, Foundations of Logic Programming. Springer-Verlag, 1987. 6. M. Liu, G. Dobbie, T. W. Ling, A logical foundation for deductive object-oriented databases, ACM Transactions on Database Systems, 27(1), pp. 117-151. 2002 7. A. Martelli, U. Montanari, An efficient unification algorithm, ACM Transactions on Programming Language and Systems, 4(2): 258-282, April, 1982.

Lihat lebih banyak...

Comentários

Copyright © 2017 DADOSPDF Inc.