Semantic Knowledge Transfer Agent

June 5, 2017 | Author: Udara Samaratunge | Category: Software Engineering, Augmented Reality, Semantic Web, Enterprise Application Development

I.M.H.J. Bandara, N.B.A. Dahanayake, S.A.U.S. Samaratunge, C.A. Karunanayake, J.A.C.N. Jayasinghe and N. Kodagoda

Department of Information Technology, Sri Lanka Institute of Information Technology

Abstract - Semantic Knowledge Transfer Agent (SEKTA) is a software system that creates ontologies from natural language text. It also provides a simple and efficient question answering facility for querying the created ontologies. SEKTA consists of two subsystems: the Ontology Editor and the Question Answering System. The Ontology Editor is an IDE backed by the Protégé framework that lets the user create an OWL file by entering a natural language (English) description of the domain. It creates the OWL file in a specified destination by applying standard OWL principles and axioms, and it can also import an OWL file into the repository. The Question Answering System uses the created ontologies to answer natural language queries raised by the user. It can search multiple OWL files in a single domain to provide an accurate answer. It can also explain a particular entity/resource in the given domain and show how the answer was generated by listing the classes, individuals and other resources in the selected ontology.

Index Terms - SEKTA, Ontology, OWL, Question Answering, IDE

I. INTRODUCTION

The World Wide Web (WWW) has revolutionized information sharing. Since it was proposed by Tim Berners-Lee, it has been one of the most important steps in the recent history of information technology. Although the WWW has radically changed information sharing, it still has some severe limitations. Lack of consistency across the web is one such problem. Because there is no relationship between similar pieces of information or resources, the same content can appear in contradictory forms, which is inconvenient for users. For information sharing this becomes an even bigger problem. The Semantic Web concept was proposed by Tim Berners-Lee in 2001 with the intention of overcoming this problem. Although the Semantic Web is still in its infancy, many applications have been built to fulfill its intended purpose. Many of these applications focus on the concept of an ontology. An ontology is a formal definition of a set of concepts in a particular domain, together with the relationships between those concepts. It is an important concept in the Semantic Web for representing knowledge or information, which is the focus of this paper. In practice, ontology-based knowledge sharing suffers from several problems.

1. It is hard to model an ontology from scratch. The person who creates the ontology must not only be a domain expert but also be proficient in an ontology modeling language such as OWL. Most software systems are based on existing ontologies rather than creating new ones.

2. Acquiring the domain knowledge needed to model an ontology is difficult. As described in the problem above, the ontology modeler/engineer must have solid knowledge of the domain that he/she is going to model. Another option is to obtain the domain expertise from a third party, but then another problem arises: coordination between the ontology engineer and the domain expert. Without that coordination, the intended purpose of the ontology may not be achieved.

3. No system has been built to create an ontology from natural language so that it can model the real world. Many existing systems create ontologies from a highly structured format such as XML or a database schema. Ontologies modeled in that way may not be a perfect fit for the intended purpose, because such high-level sources are not flexible enough to cater to real-world needs.

4. There is no specific method for sharing the knowledge stored as OWL files. Although this may seem out of scope, in practice sharing the knowledge gathered as ontologies is also very important; it is the sole purpose of creating ontologies despite the many difficulties. Several systems, mainly question answering systems, have been proposed to address this issue, but most of them do not interact directly with ontologies to provide an answer.

SEKTA is a system developed to address the above problems by providing a way to create ontologies and query them within the same system. It is effectively two systems in one: the output of one system is used as the input to the other. SEKTA is built to cater to two types of users: domain experts and normal users. It enables domain-specific users to create ontologies by entering natural language text through an IDE. Every step has been taken to make this process as simple as possible, so that a domain expert with little knowledge of ontologies can use the system efficiently. It is also capable of updating existing ontologies. The Question Answering System, on the other hand, works as a web-based system through GUIs. It is built for normal users who do not have any understanding of the underlying functionality, including how ontologies are built and how they work. Domain-specific users can still work with this system as well, since the QA system provides special functionality that is useful to them when evaluating generated answers. The system is designed to work closely with its users. Since ontology generation is an iterative process, the system interacts with the user throughout, before creating the OWL file. Similar interaction is provided in the QA system. SEKTA is built entirely with Java-related technologies, and all third-party tools and components used are open source.

II. METHODOLOGY

SEKTA broadly comprises two components that work on their own.

A. Ontology Editor

As its name implies, the Ontology Editor is the tool used to create ontologies within SEKTA. It is built into an existing Integrated Development Environment (IDE), the Protégé IDE, which is very popular in the ontology development community. There are several reasons for using this IDE. One is that creating an ontology is not a single-click action; it needs a lot of user interaction. Another is that Protégé provides the underlying framework for processing OWL syntax. In the editor, the domain-specific user enters information sentence by sentence. Such a sequence of natural language text is called a snippet. The snippet editor treats each user input terminated by a full stop as one snippet.
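The snippet rule described above (one snippet per input terminated by a full stop) can be illustrated with a minimal Python sketch. The function name `split_into_snippets` is hypothetical, and the sketch is not SEKTA's Java implementation. It assumes that every period ends a sentence, so abbreviations and decimal numbers would be split incorrectly, and that trailing text without a full stop is not yet a snippet.

```python
import re


def split_into_snippets(text: str) -> list[str]:
    """Split editor input into snippets, one per full-stop-terminated sentence."""
    # Each candidate is a run of non-period characters followed by one full stop.
    # Trailing text that lacks a full stop is not matched and is therefore ignored.
    candidates = re.findall(r"[^.]+\.", text)
    snippets = []
    for candidate in candidates:
        # Collapse internal whitespace and trim both ends.
        snippet = " ".join(candidate.split())
        if snippet != ".":  # skip a stray full stop with no words before it
            snippets.append(snippet)
    return snippets
```

For example, the input `"Every dog is an animal.  Rex is a dog. unfinished"` yields two snippets, and the unterminated fragment is left out until the user completes it.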

Figure 1: IDE of Snippet

The user adds the sentence by pressing the relevant button. This is a preliminary phase before the actual ontology is created. As the illustration shows, the user can also update an existing ontology from this screen. The important part of this process is hidden from the user. Every syntactically correct sentence is treated as a successful translation and passed to the next phase; before that, it is mapped to a relevant OWL axiom. If the translation is not successful, the sentence is still passed on but is not bound to an OWL rule.
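As a rough illustration of binding a sentence to an OWL axiom, the following Python sketch maps two simple sentence shapes to axioms written in OWL functional-style syntax and leaves every other sentence unbound. The two patterns, the names `to_axiom` and `translate`, and the `:name` prefix style are assumptions made for this example; the paper does not list the rules SEKTA actually applies.

```python
import re
from typing import Optional

# Hypothetical sentence patterns, chosen only for illustration.
SUBCLASS = re.compile(r"^Every (\w+) is an? (\w+)\.$")
INSTANCE = re.compile(r"^([A-Z]\w*) is an? (\w+)\.$")


def to_axiom(snippet: str) -> Optional[str]:
    """Return an OWL functional-style axiom for the snippet, or None if no rule applies."""
    match = SUBCLASS.match(snippet)
    if match:
        sub, sup = match.groups()
        return f"SubClassOf(:{sub} :{sup})"
    match = INSTANCE.match(snippet)
    if match:
        individual, cls = match.groups()
        return f"ClassAssertion(:{cls} :{individual})"
    return None


def translate(snippets):
    """Pair every snippet with its axiom; unmatched snippets are kept but left unbound."""
    return [(snippet, to_axiom(snippet)) for snippet in snippets]
```

In this sketch, `"Every dog is an animal."` becomes a subclass axiom, `"Rex is a dog."` becomes a class assertion, and a sentence such as `"Dogs bark loudly."` is passed through with `None` in place of an axiom.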

The next phase involves another IDE called the Text IDE. This is where the actual ontology is modeled. The snippets passed from the snippet IDE are displayed in this view. The user is not expected to make any changes here; the sole purpose of this IDE is to show the given contents before the OWL file is created. Simply pressing the update button generates the ontology.

Figure 2: IDE of Snippet Editor

The limitation of this approach is that the user always has to enter text sentence by sentence, since the parser only accepts text in that form, and the inserted sentences should be related to each other. A special field is provided in the IDE for the user to enter comments about the resources/entities in the ontology. These comments are used in the QA system for explanatory purposes.

B. System Implementation and Functionality

First, the domain-specific user has to feed the system with a Controlled Natural Language based on English. Controlled Natural Languages (CNLs) are subsets of natural languages, obtained by restricting the grammar and vocabulary in order to reduce or eliminate ambiguity and complexity.

The entered sentences are then sent to the parser engine through a web service called APE. The web service consists of two parts: the parser engine and the OWL verbalizer. The parser breaks the sentences into verbs, nouns, adverbs and so on, and the verbalizer applies the OWL rules to them.

Figure 3: Ontology Creation Overview

These processed terms are stored as XML by the system. The verbalizer compares the sentences to derive the subclasses and superclasses by exploring their semantic relationships. In addition, the verbalizer is capable of discovering properties and individuals. All of these components are used to construct the final OWL file. The overall process is illustrated in Figure 3.

C. Question Answering System

This system provides the knowledge extraction facility to the end user.

Figure 4: Question Answering System

It is designed as a normal web-based system that is easy to use and familiar to many end users. The system also has an administration panel for domain-specific users to log in. Once logged in, they can upload ontologies to the repository; this feature, unlike asking questions, is available only to them. Normal users can simply type a question in the given space without logging in. The question is a natural language query, and the system processes it and identifies which predefined category it belongs to. When the question is submitted, the system identifies the domain it belongs to and searches for a suitable ontology. This is a URI-based search. Once it finds a suitable ontology in the repository, the system returns the relevant resource of that ontology, and the answer is presented in a meaningful way.

Unlike other similar systems, SEKTA’s QA system can describe a particular resource, using the comments entered in the IDE at design time by the user who created the ontology. It also has a unique feature for explaining the answer. A special button shows how the answer was generated: once the user clicks it, the system shows the superclasses, subclasses and instances of the selected ontology. This feature is very useful to domain-specific users when evaluating the answer.

Another important feature of the QA system is its WordNet 2.1 integration, which lets the system generate synonyms for more than 75,000 words. When a question is submitted, the system finds the related words for the specific term given in the question from WordNet. This is very important for expanding the search space when searching multiple OWL files. If WordNet returns no related term, the system simply uses the original term to do a keyword search.
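The synonym expansion and keyword fallback described above can be sketched as follows. This is a simplified Python illustration, not the system's Java code: the small `SYNONYMS` dictionary stands in for WordNet, the `ONTOLOGIES` index of resource labels per ontology URI stands in for the OWL repository, and all names and URIs are hypothetical.

```python
# Hypothetical stand-in for WordNet: term -> related words.
SYNONYMS = {"car": ["automobile", "auto"]}

# Hypothetical stand-in for the repository: ontology URI -> resource labels it defines.
ONTOLOGIES = {
    "http://example.org/vehicles.owl": {"automobile", "truck"},
    "http://example.org/transport.owl": {"car", "bus"},
}


def expand_term(term, synonyms):
    """Return the term followed by its synonyms, or the term alone if none are known."""
    base = term.lower()
    related = [word.lower() for word in synonyms.get(base, [])]
    # The original term always comes first, so the keyword search is the fallback.
    return [base] + [word for word in related if word != base]


def search_ontologies(term, ontologies, synonyms):
    """Return (ontology URI, matched label) pairs for the term and its synonyms."""
    hits = []
    for candidate in expand_term(term, synonyms):
        for uri, labels in ontologies.items():
            if candidate in labels:
                hits.append((uri, candidate))
    return hits
```

With this toy data, searching for "Car" matches `car` in one ontology and `automobile` in the other, while searching for "bus", which has no synonyms in the table, falls back to a plain keyword match.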

III. RESULTS AND DISCUSSIONS

SEKTA was built successfully with minimal changes to the initially proposed method. As stated in the project proposal, SEKTA is a system with two deliverables: the Ontology Editor and the QA system. The Ontology Editor is essentially an IDE built on the Protégé framework. It represents an evolving field of ontology-related research. The QA system, on the other hand, is the real application that uses the ontologies created with the editor. It is the product with commercial value derived from this research.

IV. CONCLUSION AND FUTURE WORK

Since SEKTA has an active research component, it can be enhanced in many ways. The system is designed so that it can be expanded easily.

1. The Ontology Editor can be integrated into a web-based GUI by incorporating a supporting framework.

2. The editor can allow the user to upload a text file containing the knowledge domain description rather than entering it sentence by sentence.

3. WordNet can be integrated into the Ontology Editor as well.

4. Support can be provided for ontology modeling languages other than OWL, e.g. DAML+OIL and RDFS.

5. Searching can be enhanced by incorporating ranking algorithms.

6. The QA system can be expanded to answer many types of questions, including ambiguous questions.

