PathCase: Pathways Database System


Z.M. Ozsoyoglu, J. Nadeau, G. Ozsoyoglu, A. Cakmak, B. Elliott, M. Kirac, G. Strnad, G. Yavas

Department of Electrical Engineering and Computer Science, Case Western Reserve University

INTRODUCTION


Living organisms behave as complex systems that are flexible and adaptive to their surroundings. At the molecular level, organisms consist of intricate networks of molecular reactions, often called “biochemical pathways”. PathCase is being developed to maintain, visualize, and ultimately analyze the organism functions that result from biochemical pathways. The system comprises a pathways database and associated tools to store, compare, query, and visualize biochemical pathways through a web client. The aim is an integrated database with tools that support computational analysis and visualization of biochemical pathways; the ultimate goal is to describe, utilize, and predict the system-level functions and behaviors of living organisms.

[Figure: sample process with annotated components (substrate, product, cofactor in, cofactor out, regulator, activator, inhibitor); interface callouts: Save Graph as Image File, Layout and Graph Display Options]

OVERVIEW

Pathways are the sequential and cumulative action of genetically distinct but functionally related molecules. Each reaction in a pathway is a biochemical step from specific substrates (input molecules) to products (output molecules), which are chemically modified substrates. Each step may also use various combinations of molecules as cofactors, activators, inhibitors, and regulators, and usually involves at least one genetically unique gene product that catalyzes the reaction step.

Pathways, in general, illustrate the functional relations between molecules. Functional annotations include, for example, the identity of the substrate(s), product(s), cofactors, activators, inhibitors, enzymes or other processing molecules, RNA and protein expression patterns, reaction kinetics, and associated phenotypic variation and diseases. Ultimately, many other kinds of information (or knowledge) can be incorporated. Such information forms a rich research resource that integrates genomic and biological information, and that can be managed, analyzed, queried, and displayed dynamically at various levels of biological and genetic detail to provide insight into diverse biological processes in health and disease.

[Pathway Viewer figure callouts: organism selector; minimap (graph overview)]

PathCase is an integrated software system for storing, managing, analyzing, and querying biological pathways at different levels of genetic, molecular, biochemical and organismal detail. At the computational level, PathCase allows users to visualize pathways at multiple levels of abstraction and to pose predetermined as well as ad hoc queries through a graphical user interface. Pathways are represented as graphs and stored in a relational database. The PathCase Web Client is an online, web-based interface to the PathCase Database that provides its toolset via a Java applet loaded within a browser window. It is designed to be intuitive and easy to use, with no need to study user manuals. The PathCase Web Client includes:

• Pathway Browser – a tree viewer to browse metabolic pathways at pathway, process and molecular entity levels
• Pathway Viewer Applet – an online graphical tool to visualize and edit pathway graphs
• Gene Viewer – an online graphical tool that maps pathway genes onto chromosomes
• Advanced Query Interface – a powerful interactive tree-structured query editor
• Built-in Queries – a set of predefined queries on the pathways data
• Ontology Viewer – a tool for querying and visualizing Gene Ontology and MeSH
• Pathway Web Services – an XML web service for querying the pathway database and communicating with the PathCase server.

Figure 5. Gene Viewer

BUILT-IN QUERIES

In addition to the Advanced Query Interface (AQI), PathCase provides a set of commonly used built-in queries, such as neighborhood and path queries. For instance, Figure 6 displays the results of the built-in query “Find the paths (if any) between L-galactonolactone and L-ascorbate”. Such queries are invoked from the Pathway Browser, or directly from the Pathway Viewer while exploring a pathway graph visualization. The results are presented in both tabular and graphical form.
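One way to picture a built-in path query is as a breadth-first search over a directed graph in which nodes are molecular entities and each edge is a process turning a substrate into a product. The sketch below is only an illustration under that assumption: the toy edge list, the node names `M1` to `M5`, and the function `find_path` are hypothetical and do not reflect PathCase's actual data or implementation.

```python
from collections import deque


def find_path(edges, source, target):
    """Return one shortest path from source to target as a list of nodes, or None."""
    # Build an adjacency list: substrate -> products reachable in one process.
    graph = {}
    for substrate, product in edges:
        graph.setdefault(substrate, []).append(product)

    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # no path exists


# Hypothetical toy network: each pair is (substrate, product) of one process.
toy_edges = [("M1", "M2"), ("M2", "M3"), ("M1", "M4"), ("M4", "M3"), ("M3", "M5")]
print(find_path(toy_edges, "M1", "M5"))
```

Breadth-first search returns a path with the fewest processes; a query that must list every path, as the "paths (if any)" wording suggests, would need an exhaustive traversal instead.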

Figure 3. Pathway Viewer Applet

ADVANCED QUERY INTERFACE (AQI)

The Advanced Query Interface (AQI) provides a generic user interface for posing a large class of ad hoc queries about pathways using tree-structured views. Entities in the pathways database are naturally viewed through generalization/specialization hierarchies, e.g., classes of pathways, super pathways, and functional classifications at arbitrary granularity. In addition, pathway components, i.e., the processes in a pathway and the molecular entities involved in each process, can naturally be viewed as a tree (hierarchy). Based on this observation, AQI lets users query the relational database through tree-structured views of it. As an example, consider the tree-structured (hierarchical) schema Pathway-Process-Molecular entity, where a pathway consists of several processes and each process is associated with several molecular entities. AQI generates SQL queries dynamically as the user interacts with the interface.

Using AQI, users build ad hoc queries by dynamically adding new input fields in any order and linking them in various ways. Initially, the AQI interface contains no input fields, as shown in Figure 4.a. Users start by selecting any biological object (e.g., pathway, process, molecular entity, organism) as the primary focus of the query. From there, additional filters and predicates are created by adding further input fields. As an example, the query constructed interactively in Figure 4.b searches for pathways that include processes with cortisone as an activator. The user chooses the fields to include in the output by highlighting the field names (e.g., pathway name is the only output field in Figure 4.b). In a second example (Figure 4.c), the query finds the pathways that include a process named diamine transaminase or tryptophanase.
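To make the idea of generating SQL from a tree-structured view concrete, here is a minimal sketch using Python's built-in `sqlite3` module. Everything in it is an assumption made for illustration: the three-table schema, the column names, the toy rows, and the `build_query` helper are hypothetical and are not PathCase's actual schema or query generator.

```python
import sqlite3

# Hypothetical schema: one table per level of the Pathway-Process-Molecular entity tree.
SCHEMA = """
CREATE TABLE pathway (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE process (id INTEGER PRIMARY KEY, pathway_id INTEGER, name TEXT);
CREATE TABLE process_entity (process_id INTEGER, entity_name TEXT, role TEXT);
"""


def build_query(process_names=None, entity=None, role=None):
    """Build a parameterized SQL query returning names of matching pathways.

    `role` is only applied when `entity` is also given.
    """
    sql = ["SELECT DISTINCT pw.name FROM pathway pw",
           "JOIN process pr ON pr.pathway_id = pw.id"]
    where, params = [], []
    if entity is not None:
        # Descend one more level of the tree only when an entity filter is present.
        sql.append("JOIN process_entity pe ON pe.process_id = pr.id")
        where.append("pe.entity_name = ?")
        params.append(entity)
        if role is not None:
            where.append("pe.role = ?")
            params.append(role)
    if process_names:
        placeholders = ", ".join("?" for _ in process_names)
        where.append(f"pr.name IN ({placeholders})")
        params.extend(process_names)
    if where:
        sql.append("WHERE " + " AND ".join(where))
    sql.append("ORDER BY pw.name")
    return "\n".join(sql), params


# Toy data (made up for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO pathway VALUES (?, ?)",
                 [(1, "Pathway A"), (2, "Pathway B")])
conn.executemany("INSERT INTO process VALUES (?, ?, ?)",
                 [(10, 1, "process x"), (20, 2, "tryptophanase")])
conn.executemany("INSERT INTO process_entity VALUES (?, ?, ?)",
                 [(10, "cortisone", "activator"), (20, "cortisone", "substrate")])

# Pathways that include a process with cortisone as activator.
sql, params = build_query(entity="cortisone", role="activator")
rows_activator = [row[0] for row in conn.execute(sql, params)]

# Pathways that include a process named diamine transaminase or tryptophanase.
sql2, params2 = build_query(process_names=["diamine transaminase", "tryptophanase"])
rows_by_name = [row[0] for row in conn.execute(sql2, params2)]

print(rows_activator, rows_by_name)
```

Each filter the user adds maps to one extra join or predicate, which is why the query text can be rebuilt incrementally as fields are added.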

Figure 4.a Initial state of AQI

Figure 6. Example Built-in query

PathCase FEATURING KEGG DATA

PathCase is an open-source, web-based software tool for metabolic pathways, designed to switch easily between multiple pathway databases. PathCase now features KEGG data corresponding to KEGG’s free FTP release as of Oct. 15, 2006. The PathCase system with KEGG data is available at: http://dblab.case.edu/PathwaysKegg.

Figure 1. PathCase Web Client Overview (callouts: Pathway Browser, Pathway Viewer Applet, Pathway Related Built-in Queries)

ONTOLOGY VIEWER

Ontology Viewer visualizes Gene Ontology (GO) so that relationships between GO terms are easy to see and multiple, distantly related terms can be visualized concurrently. Basic statistics are provided for individual GO terms and sub-trees, such as the child terms, the number of PubMed articles containing the term, and the GeneIDs and pathways associated with the term. The visualization system is implemented and available on the web, integrated with PathCase. A particular term can be visualized together with the network of terms connecting it to the root term.
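The relationship between a term and the root can be sketched as a traversal of a directed acyclic graph, since a GO term may have more than one parent. The example below is a minimal sketch: the term names, the `PARENTS` mapping, and both helper functions are hypothetical and stand in for whatever structure the Ontology Viewer actually uses.

```python
# Hypothetical toy ontology: child term -> list of parent terms (a DAG, as in GO).
PARENTS = {
    "root": [],
    "t1": ["root"],
    "t2": ["root"],
    "t3": ["t1", "t2"],  # a term may have several parents
    "t4": ["t3"],
}


def ancestors(term, parents=PARENTS):
    """Return the set of all terms on any path from `term` up to the root."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def children(term, parents=PARENTS):
    """Return the direct child terms of `term`, sorted by name."""
    return sorted(child for child, ps in parents.items() if term in ps)


print(sorted(ancestors("t4")))
print(children("root"))
```

Displaying several distantly related terms at once then amounts to drawing the union of their ancestor sets, which always meet at the root.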

PATHWAY BROWSER

Pathway Browser provides a tree-structured view of the pathways available in the database. Pathways, processes, molecular entities and organisms can be browsed using this tool. Pathways are organized into categories by function. Details of the selection made in the Pathway Browser are shown on the right-hand side, together with a set of advanced functions, including visualizing the selection and running queries on it.

Figure 4.b AQI Example 1

Figure 7. Ontology Viewer

Figure 2. PathCase Pathway and Organism Browser

PATHWAY WEB SERVICES

The Pathway Web Services provide many querying functions that are available over the Internet. This component has a web method for each query it implements. The web services can be accessed over SOAP (Simple Object Access Protocol) or HTTP. We have implemented over forty web methods. Querying functions fall into five categories: molecular entity queries, reaction (process) queries, pathway queries, organism queries, and path computations.
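For readers unfamiliar with SOAP, the sketch below shows what a request to one web method might look like on the wire. Only the SOAP 1.1 envelope namespace is standard; the service namespace, the method name `ExampleMoleculeQuery`, and the parameter `moleculeName` are placeholders invented for this illustration and are not actual PathCase web methods. No network call is made.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/pathways"  # placeholder, not the real service namespace


def build_soap_request(method, arguments):
    """Serialize a SOAP 1.1 envelope that calls `method` with the given arguments."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in arguments.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical method and parameter names, for illustration only.
request_xml = build_soap_request("ExampleMoleculeQuery", {"moleculeName": "L-ascorbate"})
print(request_xml)
```

A real client would take the method names and parameter types from the service's published description rather than hard-coding them as done here.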

PATHWAY VIEWER APPLET

Pathway Viewer presently visualizes a single metabolic pathway or multiple metabolic pathways, in an organism-specific or organism-independent manner, and provides a large number of functions, including collapsing-zooming, hiding-expanding, editing, and node property-setting. The tool supports ad hoc querying of the pathways data as well as menu-driven queries.

A process has either no catalyzing enzymes (i.e., it is a non-enzymatic process) or a set of them, a set of substrates and products, and a set of cofactor-ins, cofactor-outs, inhibitors, or activators. The accompanying figure shows a sample process with its components annotated.

Even large pathways can be viewed easily using the mini map component of the Pathway Viewer Applet. The mini map shows a reduced version of the whole pathway and indicates where the current view area lies within it. The organism-specific version of a pathway can be viewed using the organism selector, which displays the hierarchical structure of organisms.
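The process structure described above maps naturally onto a small record type whose roles become labeled edges of a pathway graph. The following is a minimal sketch under that assumption; the class name `Process`, its field names, and the sample values are hypothetical and are not taken from the PathCase code base.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Process:
    """One reaction step and the molecules attached to it, grouped by role."""
    name: str
    substrates: List[str]
    products: List[str]
    enzymes: List[str] = field(default_factory=list)  # empty means non-enzymatic
    cofactors_in: List[str] = field(default_factory=list)
    cofactors_out: List[str] = field(default_factory=list)
    inhibitors: List[str] = field(default_factory=list)
    activators: List[str] = field(default_factory=list)

    @property
    def is_enzymatic(self) -> bool:
        return bool(self.enzymes)

    def edges(self) -> List[Tuple[str, str, str]]:
        """Return (source, target, role) triples for drawing the process as a graph."""
        incoming = [("substrate", self.substrates), ("enzyme", self.enzymes),
                    ("cofactor in", self.cofactors_in),
                    ("inhibitor", self.inhibitors), ("activator", self.activators)]
        outgoing = [("product", self.products), ("cofactor out", self.cofactors_out)]
        result = [(m, self.name, role) for role, ms in incoming for m in ms]
        result += [(self.name, m, role) for role, ms in outgoing for m in ms]
        return result


# Made-up sample process.
step = Process(name="step 1", substrates=["S"], products=["P"],
               enzymes=["E1"], cofactors_in=["C_in"], cofactors_out=["C_out"])
print(step.is_enzymatic, step.edges())
```

Keeping the role on each edge is what lets a viewer style substrates, cofactors, inhibitors, and activators differently around the same process node.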

Figure 4.c AQI Example 2

GENE VIEWER

The Gene Viewer component of PathCase allows users to view the genes that encode the enzymes of a given pathway. Geneticists have developed various addressing schemes to specify the location of a gene on its chromosome (i.e., the gene locus). The PathCase database currently accommodates three types of gene address: (i) molecular location, (ii) cytogenetic address, and (iii) genetic linkage distance. By default, the Gene Viewer uses the most precise address, molecular location, if available. If the molecular location of a gene is not available in the PathCase database, the cytogenetic address and then the genetic linkage distance are checked, in that order. Figure 5 shows the Folate Pathway genes for mouse, with the gene serine hydroxymethyl transferase 1 highlighted in red.
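The fallback order among the three address types can be sketched as a simple priority lookup. In the sketch below, the `GeneAddress` record, its field names, the value types, and the sample values are assumptions chosen for illustration; only the order of preference comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass
class GeneAddress:
    """The three address types; any of them may be missing for a given gene."""
    molecular_location: Optional[int] = None          # assumed: a base-pair position
    cytogenetic_address: Optional[str] = None         # assumed: a band label
    genetic_linkage_distance: Optional[float] = None  # assumed: a map distance


# Most precise scheme first, as described in the text.
PRIORITY = ("molecular_location", "cytogenetic_address", "genetic_linkage_distance")


def best_address(addr: GeneAddress) -> Optional[Tuple[str, Union[int, str, float]]]:
    """Return (scheme, value) for the most precise address available, or None."""
    for scheme in PRIORITY:
        value = getattr(addr, scheme)
        if value is not None:
            return scheme, value
    return None


# Made-up sample values.
full = GeneAddress(molecular_location=1200, cytogenetic_address="band-x",
                   genetic_linkage_distance=3.5)
partial = GeneAddress(genetic_linkage_distance=3.5)
print(best_address(full), best_address(partial), best_address(GeneAddress()))
```

Checking `is not None` rather than truthiness matters here, because a position or distance of zero is still a valid address.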

Acknowledgements: This research is supported by the National Science Foundation (DBI 0218061) and a grant from the Charles B. Wang Foundation to the Center for Computational Genomics, CWRU. The equipment and the SQL Server database system used to develop the PathCase system were donated by Microsoft. The authors also acknowledge the contributions of Yu Mei, Murat Tasan, Scott Newman, Wanhong Xu, Nattakarn Ratprasartpron, Toshimori Kitami, Greg Schaefer, Lakshmi Krishnamurthy, Michael Starke, Marc Reynolds, Brandon Evans, and Fatih Akgul, who worked on earlier stages of this project.

