Propos: A Dynamic Web Tool for Managing Possibilistic and Probabilistic Temporal Constraint Networks

Share Embed


Descrição do Produto

Propos: A Dynamic Web Tool for Managing Possibilistic and Probabilistic Temporal Constraint Networks Francisco Guil1 , Ivan Gomez1 , Jose M. Juarez2 , and Roque Marin2 1

2

Departamento de Lenguajes y Computacion Universidad de Almeria 04120 Almeria [email protected], [email protected] Dept. Ingenieria de la Informacion y las Comunicaciones Universidad de Murcia 30071 Espinardo (Murcia) {jmjuarez, roque}@dif.um.es

Abstract. In this paper we present Propos, a novel dynamic Web tool for managing probabilistic and possibilistic temporal constraint networks. These networks are a special sort of temporal constraint satisfaction problems, useful for representing and reasoning with uncertain temporal relations between temporal points. They can be applied as an effective formalism in a very different kind of domains. Propos has been developed using dynamic Web technology for mainly, make easier the complex process of interpreting, evaluating, and finally, extracting useful knowledge from the network. It can be viewed as a novel knowledge acquisition tool and, for practical purposes, it will be used for work with temporal patterns extracted from temporal data mining processes.

1

Introduction

Representing and reasoning about time is an important area of Artificial Intelligent (AI). Temporal formalism are applied in a lot of different tasks, like temporal planning, diagnosis or prognosis, realized in different dynamic domains like Web, agriculture or medicine. Temporal points and intervals are the main ontological primitives used by temporal formalism. In this paper, we are centered in models dealing with temporal points-based events. Many temporal representation and reasoning approaches assume that precise and certain temporal information is available, and very few works give support for situations in which we are dealing with imperfect temporal information. Uncertainty is one kind of imperfect information, which arises from the lack of information about the state of the world. As we can see in [2], the topic of handling uncertainty in temporal knowledge was considered as one of the major, newly, and growing area of temporal representation and reasoning. Dealing with this topic, we want to highlight two different works which proposed models for representing and reasoning with uncertain temporal relations. The first ´ J. Mira and J.R. Alvarez (Eds.): IWINAC 2007, Part II, LNCS 4528, pp. 551–560, 2007. c Springer-Verlag Berlin Heidelberg 2007 

552

F. Guil et al.

one, proposed by V. Ryabov and S. Puuronen [13], defined a model where the probabilistic approach is used to deal with uncertain temporal relations between temporal points. Based in this work, in [9], A. HadjAli, D. Dubois, and H. Prade presented the same model but with the proposal of using the possibility theory as a mathematical tool for dealing with uncertain temporal relations. In the works, the authors specified the way in which temporal information can be represented and the way in which we can reason with them. Both, the probabilistic and the possibilistic model belong to a special class of model called binary temporal constraint networks [3]. Temporal constraint satisfaction is an information technology useful for representing and answering queries about the times of events and the temporal relations about them. In our case, information is represented as temporal constraint network, and constraints represent the uncertain temporal relations between temporal points. Usually, the expert build the networks practically starting from scratch. They combine binary pieces of information obtaining a global model which represents the knowledge about the domain. In our case, we proposed a method for building this kind of model using temporal data mining techniques [8]. In particular, the main goal of the proposed method is to obtain an enumeration of temporal constraint networks that better summarizes the temporal information existing in the dataset. In this work we present a dynamic Web-based tool for representing and managing the mined temporal constraint networks (probabilistic and possibilistic). This tool will assist the expert in the knowledge discovery process, in particular, in the visualization, interpretation, and evaluation of the patterns extracted from temporal data mining processes, tasks necessaries for complete the main goal, discover useful knowledge. The remainder of the paper is organized as follows. Section 2 describes briefly our proposal for temporal data mining and its place in the global Knowledge Discovery in Databases Process. The Section 3 describes the Probabilistic and Possibilistic Temporal Constraint Networks Model. Section 4 presents the Propos tool and shows the basic functionality via a practical example. Conclusions and future work are finally drawn in Section 5.

2

Temporal Data Mining

Data mining is an essential step in the process of knowledge discovery in databases that consists of applying data analysis and discovery algorithms that produce a particular enumeration of structures over the data [5]. There are two types of structures: models and patterns. So, we can talk about local and global methods in data mining [11]. Since the problem of mining association rules was introduced by Agrawal in [1], a large amount of work has been done in several directions, including improvement of the Apriori algorithm, mining generalized, multi-level, or quantitative association rules, mining weighted association rules, fuzzy association rules mining, constraint-based rule mining, efficient long patterns mining, maintenance of

Propos: A Dynamic Web Tool

553

the discovered association rules, etc. Temporal data mining can be viewed as an extension of this work. Temporal data mining is an important extension of data mining techniques. Temporal data mining can be defined as the activity of looking for interesting correlations (or patterns) in large sets of temporal data accumulated for other purposes. It has the capability of mining activity, inferring associations of contextual and temporal proximity, that could also indicate a cause-effect association. This important kind of knowledge can be overlooked when the temporal component is ignored or treated as a simple numeric attribute [12]. In non-temporal data mining techniques, there are usually two different tasks: the description of the characteristics of the database (or analysis of the data), and the prediction of the evolution of the population. However, in temporal data mining this distinction is less appropriate, because the evolution of the population is already incorporated in the temporal properties of the analyzed data. We can found in the literature a large quantity of temporal data mining techniques. We want to highlight some of the most representative ones. So, we can talk about sequential pattern mining, episodes in event sequences, temporal association rules mining, discovering calendar-based temporal association rules, patterns with multiple granularities mining, and cyclic association rules mining. However, there is an important form of temporal associations which are useful but could not be discovered with this techniques. These are the inter-transaction associations presented in [10]. In [6] we presented an algorithm, named T SET , based on the inter-transactional framework for mining frequent sequences from several kind of datasets, mainly transactional and relational datasets. The improvement of the proposed solution was the use of a unique structure to store all frequent sequences. The data structure used is the well-known set-enumeration tree, widely used in the data mining area, in which the temporal semantic is incorporated. The result is a set of frequent sequences describing partially the dataset. This set forms a potential base of temporal information that, after the experts analysis, can be very useful to obtain valuable knowledge. However, the overwhelming number of discovered frequent sequences may make such task absolutely impossible in practice. This problem can be viewed as a second-order data mining problem, which consists in the necessity of obtaining a more understandable and useful sort of knowledge from a huge volume of temporal associations resulting after the data mining process. In [8], we proposed a method for building a special model of temporal network formed by a set of uncertain relations amongst temporal points. The temporal model, proposed by HadjAli, Dubois and Prade in [9], is based on the Possibility Theory as expressive tool for the representation and management of uncertainty in point-based temporal relations. The uncertainty is represented by a vector describing three possibility values, expressing the relative plausibility of the three basic relations between two temporal points, that is, ”before”, ”at the same time” and ”after”. Thus, the authors define the basic operations (inversion, composition, combination and negation) that allow to infer new temporal information and to propagate uncertainty in a possibilistic way.

554

F. Guil et al.

Fig. 1. The Knowledge Discovery Process

Another related work is presented in [13], where the authors proposed a model for representing and reasoning with uncertain temporal relations using the probabilistic approach to deal with uncertainty. The building of this kind of networks from databases is currently in the experimentation step. As we can see in Figure 1, the next step in the global Knowledge Discovery in Databases is the interpretation/visualization and evaluation of the patterns extracted in the temporal data mining process. In this case, the pattern is a temporal constraint network, and it can be viewed as a graph abstract data type, a very difficult structure to deal with. Moreover, the operations for reasoning involves very complex calculus. A simple addition implies a lot of operations for reconstructing the network. Another problem we find in this step is the understanding about how the temporal model works in order to infer new knowledge. The necessity of a tool to deal with this type of patterns is clear. In the next section we describe deeper the sort of temporal pattern we mined using temporal data mining techniques.

3

The Temporal Patterns Model

Both models, the Possibilistic Temporal Constraint Networks [9] and the Probabilistic Temporal Constraint Networks [13], belong to a special sort of binary temporal constraint networks used to represent and to infer (to reason) with uncertain temporal relations between point-based events. The main difference between them is the approach to deal with temporal uncertainty. In the former, the authors proposed the Possibility Theory [4], whereas in the later, the authors proposed the well-know probabilistic approach. The possibilistic model appeared in the literature as an improvement of the probabilistic one in the sense that classical probability theory can not model ignorance (even partial ignorance) in a natural way. In [4], the authors show how the possibility theory copes with the situation of complete ignorance in a non-biased way, that is, without making any prior assumption. However, in this paper we do not want to argue what of the two approaches is better to deal with temporal uncertainty. For us, both models are designed with different applicability, depending of the problem we want to

Propos: A Dynamic Web Tool

555

deal with. It will be the responsibility of the user to choose the appropriate theory for a particular problem. Independently of the chosen approach, both models (in the sequel we call them Temporal Patterns Model) belong to the general temporal constraint networks model, and therefore, can be modeled as a tuple < X, D, C >, where X is a set of variables, D the set of variable domains and C the set of constraints restricting the values that the variables can simultaneously take. In particular, the variables represent point-based events, and the constraints represent uncertain relations between temporal points. The constraint is represented as a vector with three possibility/probability values denoting the plausibility/probability of the three basic relations: ”” (after). Given this three basic relations, an uncertain relation between two temporal points is expressed as any possible disjunction of these basic relations: ≤ ⇐⇒ < or = ≥ ⇐⇒ > or = = ⇐⇒ < or > ? ⇐⇒ The last case represents total ignorance, that is, any of the three basic relations is possible. In the probabilistic model, this case is impossible to represent. Let a and b two temporal points, and rab the uncertain relation between them. In the probabilistic model, < = > rab = (Pab , Pab , Pab ) < > < > = = + Pab + Pab = 1.0, where Pab (respectively Pab , Pab ) is the probability with Pab of a < b (respectively a = b, a > b). In the possibilistic model, < = > , Πab , Πab ) rab = (Πab < > < > = = with Πab +Πab +Πab = 1.0, where Πab (respectively Πab , Πab ) is the possibility of a < b (respectively a = b, a > b). Here, and using the duality between possibility and necessity, namely N (A) = 1 − Π(Ac ), where Ac is the complement of A, we can derive the possibility and necessity degrees of each basic relations and their disjunctions. Both models complete the proposal with the reasoning mechanism that, as we can see in the Figure 2, includes inversion, composition, addition, and negations operations. This operations define the rules that enable us to infer new temporal information and to propagate uncertainty in a probabilistic/possibilistic way. These rules complete the definition of a model for representing and reasoning with temporal uncertainty, handled in the settings of point algebra.

inversion composition combination negation

⇐⇒ ⇐⇒ ⇐⇒ ⇐⇒

rab = rba rac = rab ⊗ rbc rab = r1ab ⊕ r2ab ¬

Fig. 2. Operations for reasoning with uncertain temporal relations

556

4

F. Guil et al.

The Propos Tool

In this section, we present P ropos, the Web dynamic tool designed for managing temporal patterns. In order to show both the interface and the functionality, we will follow an example obtained after a temporal data mining process over Diagnostic Evolution Database. The example is a very representative temporal possibilistic pattern extracted in practical experience in an Intensive Care Unit (ICU) [7]. The pattern describes patients to whom physicians diagnosed at income day: d6 (Acute Miocardial Infarction of the Pared Inferior), d7 (haemorrage complica tions), and d169 (Acute Miocardial Infarction). Also, d169 is diagnosed again the third day of stay in the ICU. The uncertain relation that the temporal mining process extract are: Πd6 ,d7 = (0.44, 1, 0.44), Πd7 ,d169 = (0.7, 1, 0.7),

Πd6 ,d169 = (0.7, 1, 0.7) Πd169 ,d

169

= (1, 0.87, 0.87)

In ICU domains, as well as the final diagnosis (like other hospital services), there are evolutive diagnoses that state the diagnostic hypotheses. These hypotheses are daily made by physicians during patient’s stay at the ICU service. Furthermore, they can be considered high-level medical information since it is obtained from physician’s knowledge and medical observations (like EKGs, tests, or nursing care data). Despite the importance of other clinical information within the health record, such as treatments or demographic data, we consider in our experiment that the evolution of these diagnosis are a good representation of patient problems and the discovery of temporal pattern diagnosis could be useful in many AI systems for temporal diagnosis or prognosis. In our experiment, each patient was represented in the database by a temporal sequence of diagnoses (temporal points) and the data mining process results were frequent temporal patterns (or frequent sequences) of diagnosis evolution. In the analysis of this data, different parameters had been empirically stated (maxspan = 24 , and support value = 3, 5, 9) depending of the dataset of 144 patients. Let us see how Propos works. Once P ropos is started, and after the authentication process, we can create a new possibilistic project using the dymanic forms that we can see in Figure 3. With the project loaded, we can start to work within the network. First we have to accomplish is to create all the temporal points (nodes) of the network. By clicking ”Add Node” in the control panel we will get a new temporal point in the desktop. The node properties can be modified opening the ”Node Properties Inspector” clicking twice on the node. Once we got the nodes created, it’s time to connect them creating relations and bringing life to the network. For creating a new relation, first we have to select, by clicking once on a node, two nodes and then click the ”add link” button to establish the relation. After create the relation, we have to modify the possibility vector values, and it is done with the ”Link Properties” inspector. In Figure 4 we can see the global network formed by binary pieces of temporal information.

Propos: A Dynamic Web Tool

557

Fig. 3. Loading a project

Fig. 4. The complete network

After the network has been created, we can use the tool as a Knowledge Acquisition tool. We can interact with it (combining two or more relations, adding new uncertain relations to the network, and so on) and we can obtain

558

F. Guil et al.

Fig. 5. Queries form

Fig. 6. Inversion operation form

new knowledge from it by opening the ”Queries Form”, located in the ”Graph Control Panel” and represented by the symbol ”?”. When we open this form, the network is recombined in order to obtain its minimal representation, where every node is connected with the others. We can ask for two nodes and obtain

Propos: A Dynamic Web Tool

559

the possibility values associated. We can see in the Figure 5 the ”Queries Form” asking for the possibility degrees between the nodes d6 and d169 . In this case, rd6 ,d169 = (0.87, 0.87, 1). When editing networks we can use the ”Reasoning Operators” to avoid calculate possibility vectors when we get new information. In Figure 6 we can see ”Inversion Operator Form”:

5

Conclusions and Future Work

In this paper, we have presented a novel tool for representing and managing temporal constraint networks formed by uncertain temporal relations between point-based events. The tool allows to represent the uncertainty from two different point of view, the possibilistic and the probabilistic one. We propose the use od dynamic Web technology in order to obtain better mechanisms for representing the networks and understanding how they work they infer new knowledge. The Web-based technology will also allow us to use the tool from very different areas of the world. As future work, we propose first to integrate the tool with an existing information system designed for clinical guidelines managing. The basic idea is to use Propos as an assistant for representing and reasoning with mined temporal patterns in a visual way. Secondly, we propose to extend the tool for representing uncertain temporal relations between temporal intervals.

Acknowledgments This work is supported in part by MEC grant TIN2006-15460-C04-01 and MEC grant TIN2004-05694.

References 1. R. Agrawal, T. Imielinski, and A. N. Swami. Mining association rules between sets of items in large databases. In P. Buneman and S. Jajodia, editors, Proc. of the ACM SIGMOD Int. Conf. on Management of Data, Washington, D.C., May 26-28, 1993, pages 207–216. ACM Press, 1993. 2. L. Chittaro and A. Montanari. Temporal representation and reasoning in artifical intelligence: Issues and approaches. In Annals of Mathematics and Artificial Intelligence, number 28, pages 47–106. IEEE Computer Society, 2000. 3. R. Dechter, I. Meiri, and J. Pearl. Temporal constraint networks. Artificial Intelligence, 1-3(49):61–95, 1991. 4. D. Dubois and H. Prade. Possibility Theory. Plenum Press, 1988. 5. U. Fayyad, G. Piatetky-Shapiro, and P. Smyth. From data mining to knowledge discovery in databases. AIMagazine, 17(3):37–54, 1996. 6. F. Guil, A. Bosch, and R. Mar´ın. TSET: An algorithm for mining frequent temporal patterns. In Proc. of the First Int. Workshop on Knowledge Discovery in Data Streams, in conjunction with ECML/PKDD 2004, pages 65–74, 2004.

560

F. Guil et al.

7. F. Guil, J. M. Ju´ arez, and R. Mar´ın. Mining possibilistic temporal constraint networks: A case study in diagnostic evolution at intensive care units. In Intelligen Data Anlisis in Biomedicine and Pharmacology (IDAMAP 2006), 2006. 8. F. Guil and R. Mar´ın. Extracting uncertain temporal relations from mined frequent sequences. In Proc. of the 13th Int. Symposium on Temporal Representation and Reasoning (TIME 2006), pages 152–159, 2006. 9. A. HadjAli, D. Dubois, and H. Prade. A possibility theory-based approach for handling of uncertain relations between temporal points. In 11th International Symposium on Temporal Representation and Reasoning (TIME 2004), pages 36– 43. IEEE Computer Society, 2004. 10. H. Lu, L. Feng, and J. Han. Beyond intra-transaction association analysis: Mining multi-dimensional inter-transaction association rules. ACM Transactions on Information Systems (TOIS), 18(4):423–454, 2000. 11. H. Mannila. Local and global methods in data mining: Basic techniques and open problems. In P. Widmayer, F. Triguero, R. Morales, M. Hennessey, S. Eidenbenz, and R. Conejo, editors, In Proc. of the 29th Int. Colloquium on Automata, Languages and Programming (ICALP 2002), Malaga, Spain, July 8-13, 2002, volume 2380 of Lecture Notes in Computer Science, pages 57–68. Springer, 2002. 12. J. F. Roddick and M. Spiliopoulou. A survey of temporal knowledge discovery paradigms and methods. IEEE Transactions on Knowledge and Data Engineering, 14(4):750–767, 2002. 13. V. Ryabov and S. Puuronen. Probabilistic reasoning about uncertain relations between temporal points. In 8th International Symposium on Temporal Representation and Reasoning (TIME 2001), pages 1530–1511. IEEE Computer Society, 2001.

Lihat lebih banyak...

Comentários

Copyright © 2017 DADOSPDF Inc.