A clinical decision support system for cancer diseases


André Cid Ferrizzi¹, Toni Jardini², Leandro Rincon Costa³, Jucimara Colombo⁴, Edmundo Carvalho Mauad⁵, Lígia Maria Kerr⁶, Geraldo Santiago Hidalgo⁷, Paula Rahal⁸, Carlos Roberto Valêncio⁹
1, 2, 3, 4, 8, 9 - UNESP, São José do Rio Preto, São Paulo, Brazil
5, 6, 7 - Hospital do Câncer de Barretos, Barretos, São Paulo, Brazil

ABSTRACT
Cancer is the second leading cause of death in Brazil, and according to statistics released by the Brazilian National Cancer Institute (INCA), 466,730 new cases of cancer are forecast for 2008. The analysis of tumour tissues of various types, together with patients' clinical data, genetic profiles, disease characteristics and epidemiological data, may lead to more precise diagnoses and more effective treatments. In this work we present a clinical decision support system for cancer diseases, which manages a relational database containing information on tumour tissue samples and their location in freezers, patients and medical forms. We also discuss some problems encountered, such as database integration and the adoption of a standard to describe topography and morphology, and the dynamic report generation functionality, which shows data in table and graph format according to the user's configuration.

1, 2, 3 - {ferrizzi, tonijardini, leandro.rincon.c}@gmail.com, [email protected], [email protected], 8, 9 - {prahal, valencio}@ibilce.unesp.br

Categories and Subject Descriptors
J.3 [Life and Medical Sciences]: Medical information systems.

General Terms
Algorithms, Management, Documentation, Design.

Keywords
Medical Information System, Cancer Database, Bioinformatics.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. EATIS '0, September -2, 200, Aracaju, Brazil © ACM 200 ISBN: 978-1-55--/0/0...$10.00

1. INTRODUCTION
Cancer is the second leading cause of death in Brazil, and according to statistics released by the Brazilian National Cancer Institute (INCA), 466,730 new cases of cancer are forecast for 2008 [1]. Of these, 231,860 are expected in male patients and 234,870 in female patients. Cancer is a diverse disease, and its multiple genetic and epigenetic changes make prevention, diagnosis and therapy difficult. Studies that establish the molecular genetic profile of tumours are essential to understand the complexity of the disease, establish its biological basis and identify the best therapeutic strategies. Despite developments in chemotherapy, surgical techniques and drug combinations, there are types of neoplasia for which prognosis has shown practically no improvement in the last ten years [2] [3].

The Barretos Cancer Hospital [4] ranks among the largest cancer hospitals in Brazil: every year over 400 thousand consultations take place and over 6 thousand new cases of cancer are diagnosed, with patients from all Brazilian states. The Barretos Hospital Tumour Bank consists of hierarchically organized freezers that store the following types of material: tumour tissue, normal tissue, blood, leukocytes, serum, RNA, DNA and ascitic fluid.

In this work we present a clinical decision support system for cancer diseases (SCGBT), which manages the Barretos Cancer Hospital relational database containing information on patients, medical forms, tumour tissue samples and their location in freezers. The system makes it possible to develop studies on prognostic, diagnostic and therapeutic markers in representative samples of the Brazilian population.

We also discuss some problems encountered, such as the integration of the Barretos Cancer Hospital database with another cancer database and the adoption of a standard to describe topography and morphology, as well as the dynamic report generation functionality, which shows data in table and graph format according to the user's configuration.

2. RELATED WORK
This section presents some related work. GLOBOCAN 2002 is a desktop system that provides estimates of cancer incidence, mortality and prevalence, by sex and cancer site, for the countries of the world [5]. Another system, the Cancer Population Registry, owned by INCA [6] and used by some Brazilian states, stores information about cancer defined in a standard form. The work presented in this paper involves a wide variety of data, such as race, smoking and obesity, among other characteristics recorded in forms specific to each department, such as the breast, kidney, testicular, bladder and neurology forms.

In [7], a web-based system was developed to show cancer statistics for the population of the United States of America on graphical maps. SCGBT deals with data on the Brazilian population, providing current statistics as tables and graphs. The presentation of statistical data distributed over a geographic map of Brazil is planned as future work.

3. METHODOLOGY
The software development process employed is the Unified Process (UP) [8]. Some aspects of the system were modelled with the Unified Modeling Language (UML), and Subversion (SVN) is used for configuration management [9].

The programming language employed is PHP (PHP: Hypertext Preprocessor) and the Database Management System (DBMS) is MySQL. The project was supported by the project management models and processes of the Project Management Body of Knowledge (PMBOK) [10], focused on four main areas: requirements management, configuration management, risk management and test management.

While the requirements list was being put together, several brainstorming sessions were held among the Barretos Cancer Hospital doctors and staff, university professors and the students of the development team, to discuss which data really needed to be identified and stored in the system. Configuration management was employed to support all document handling and source code creation; SVN was used for this purpose. As soon as a new step of the project is developed, the whole system undergoes a series of tests to assure quality and guarantee that all requirements identified and agreed upon are in fact properly covered by SCGBT. The whole test management process was carried out in the project, encompassing white box and black box tests across the following test phases: unit test (UT), functional verification test (FVT), system verification test (SVT), regression verification test (RVT), security test (ST) and performance test (PT).

4. DATABASE
As stated in Section 3, MySQL, a relational DBMS, stores the data managed by SCGBT. The conceptual model of the database was built using UML to represent the entities and their relationships. In the database schema, referential integrity is enforced in order to keep the references between the tuples of the various tables consistent. The main tables in the database are: patient; city and state; sample; topography and morphology, for which the International Classification of Diseases for Oncology (ICD-O) [11] is used; freezer, the location where the samples are stored, comprising a set of shelves, racks, drawers and boxes arranged according to a hierarchy; researcher, since a sample may be removed from the freezer for research purposes, in which case the person who removed the sample and the justification for the removal must be recorded; user; forms, which hold the medical record data associated with the samples and show information on the patient's history, diagnosis, clinical state, treatment and prognosis; and doctor, the doctor in charge of the patient.

5. SCGBT COMPONENTS
SCGBT is a Web system that manages all data associated with the samples, their locations in the freezers, patients, medical forms, doctors and users, among other data. The system was developed under a client-server architecture: clients use the system through their Web browser, and the data are stored on the server by a relational DBMS. Client-server communication is encrypted by means of the HTTPS protocol. The diagram in Fig. 1 shows the components that make up SCGBT, organized in three main layers: Presentation Layer, Application Logic Layer and Database Layer.

Fig. 1. SCGBT components diagram.
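The freezer hierarchy and the referential integrity described in Section 4 can be illustrated with a minimal sketch. It is not the SCGBT schema: all table and column names are hypothetical, the nesting order freezer, shelf, rack, drawer, box is assumed from the order in which the paper lists these elements, and SQLite (through Python's standard sqlite3 module) stands in for the MySQL server so that the example is self-contained.

```python
import sqlite3

# Hypothetical, simplified schema. Names are illustrative only, and SQLite
# replaces the MySQL server that SCGBT actually uses.
SCHEMA = """
CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT NOT NULL, sex TEXT);
CREATE TABLE freezer (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE shelf   (id INTEGER PRIMARY KEY,
                      freezer_id INTEGER NOT NULL REFERENCES freezer(id));
CREATE TABLE rack    (id INTEGER PRIMARY KEY,
                      shelf_id INTEGER NOT NULL REFERENCES shelf(id));
CREATE TABLE drawer  (id INTEGER PRIMARY KEY,
                      rack_id INTEGER NOT NULL REFERENCES rack(id));
CREATE TABLE box     (id INTEGER PRIMARY KEY,
                      drawer_id INTEGER NOT NULL REFERENCES drawer(id));
CREATE TABLE sample (
    id INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patient(id),
    box_id INTEGER NOT NULL REFERENCES box(id),
    sample_type TEXT NOT NULL
);
"""


def open_database():
    """Create an in-memory database with foreign key checks switched on."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def sample_location(conn, sample_id):
    """Return (freezer label, shelf, rack, drawer, box) for a sample, or None."""
    return conn.execute(
        """
        SELECT f.label, sh.id, r.id, d.id, b.id
        FROM sample s
        JOIN box b ON b.id = s.box_id
        JOIN drawer d ON d.id = b.drawer_id
        JOIN rack r ON r.id = d.rack_id
        JOIN shelf sh ON sh.id = r.shelf_id
        JOIN freezer f ON f.id = sh.freezer_id
        WHERE s.id = ?
        """,
        (sample_id,),
    ).fetchone()
```

With the foreign keys enabled, an attempt to store a sample in a box that does not exist is rejected by the database itself rather than by application code, which is the kind of guarantee referential integrity is meant to give.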

6. DATA MANAGEMENT AND VISUALIZATION
SCGBT comprises a number of functionalities that provide full handling of the data on each registered patient, as well as on tumour samples, forms, freezers, users and doctors. To retrieve these data, the system offers an interactive interface, known as filtering, in which users define their own searches. Through this interface, the user builds a search by adding or removing parameters, which include sample location, topography, morphology, patient's age, type of sample and name, among others.

The system also offers features that make data entry easier. For instance, when a user starts typing the description of the topography or morphology of a sample, matching suggestions are shown on the screen. Likewise, if the user types the corresponding code instead of the description, the system recognizes it and suggests topographies or morphologies based on the code entered.
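One way the filtering interface could translate user-selected parameters into a database query is sketched below. This is an assumption about the technique, not a description of the SCGBT code: the filter names, the column names and the flat `sample` table are all hypothetical. The sketch keeps a whitelist of permitted filters and passes the values as bound parameters, so user input never becomes part of the SQL text.

```python
# Hypothetical whitelist mapping a user-facing filter name to a SQL condition.
# Column names are illustrative and do not come from the SCGBT schema.
ALLOWED_FILTERS = {
    "sample_type": "sample_type = ?",
    "topography": "topography_code = ?",
    "morphology": "morphology_code = ?",
    "min_age": "patient_age >= ?",
    "max_age": "patient_age <= ?",
}


def build_search(filters):
    """Return (sql, params) for the filters the user has currently added."""
    clauses, params = [], []
    for name, value in filters.items():
        if name not in ALLOWED_FILTERS:
            raise ValueError(f"unknown filter: {name}")
        clauses.append(ALLOWED_FILTERS[name])
        params.append(value)
    sql = "SELECT id FROM sample"
    if clauses:
        # Every added filter narrows the search, so conditions are ANDed.
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY id", params
```

Adding or removing a parameter in the interface then corresponds to adding or removing a key in the `filters` dictionary; an empty dictionary yields an unrestricted search.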

6.1 FORMS
The forms are the main items in the database, since they hold most of the information on the patients' diseases. Each form contains information about the patient's history, diagnosis, treatment and prognosis, among others. There are ten types of forms, formulated according to the needs of each department: the adrenal, prostate, bladder, testicles, penis and kidney forms of the urology department; the head and neck form; the breast form; the neurology form; and the general form, used by the gynecology, prevention, upper digestive, hematology, orthopedics, chest, lower digestive and pediatric departments. The forms are designed to provide data for the publication of papers on cancer, such as the paper on penile cancer [12].

Initially, few forms were filled out by the doctors, because a considerable time elapses between the surgery and the moment the samples are collected and registered in the system. To raise the rate at which doctors fill in forms, the system now allows a form to be completed before its samples are registered, so that a doctor can fill in the form as soon as the surgery finishes. When the team later registers the collected samples, the system shows a warning if the patient already has forms, allowing each sample to be linked to a form.
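The decoupling of form completion from sample registration can be sketched as follows. The `Form` class, its fields and the function names are hypothetical and serve only to illustrate the workflow described above: a form may exist with no samples, and registering a sample returns the patient's existing forms so that the interface can warn the user and offer a link.

```python
from dataclasses import dataclass, field


@dataclass
class Form:
    """Hypothetical form record; a form may exist before any sample does."""
    form_id: int
    patient_id: int
    sample_ids: list = field(default_factory=list)


def forms_of_patient(forms, patient_id):
    """Forms already filled in for this patient."""
    return [form for form in forms if form.patient_id == patient_id]


def register_sample(forms, sample_id, patient_id, form_id=None):
    """Register a sample and optionally link it to one of the patient's forms.

    Returns the patient's existing forms so the caller can warn the user
    that a link is possible.
    """
    candidates = forms_of_patient(forms, patient_id)
    if form_id is not None:
        for form in candidates:
            if form.form_id == form_id:
                form.sample_ids.append(sample_id)
                break
        else:
            # The requested form is not among this patient's forms.
            raise ValueError("form does not belong to this patient")
    return candidates
```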

6.2 REPORTS
The system provides reports as tables and as line, bar and pie charts. Each report is generated dynamically, according to the user's settings. Reports are available by sample type, form, age, department, place of origin of the sample, freezer, topography, morphology, nosology, sex, colour and reporting period. Each report can show the numbers of samples, patients and collections, and its results can be restricted by the following parameters: sample type, place of origin of the sample, department, topography, morphology, nosology, start date and end date.

Data collection and management with the system began in 2006. At present, after about two years, the database holds about ten thousand samples from about three thousand patients, of whom 1605 (52%) are male and 1479 (48%) are female. Of the samples, 2669 are normal tissue, 3772 are tumour tissue, 353 are serum, 324 are leukocytes, 2 are ascitic fluid and 4683 are blood. Fig. 2 shows the number of tumour samples collected per patient organ, and Fig. 3 shows the number of forms. As the sample counts show, there is a large number of blood and leukocyte samples. Collecting these samples is of great importance, since the patients' DNA and genetic characteristics, such as the xenobiotic metabolism profile, can be extracted from this material.
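The tabular form of such a report amounts to counting records per category and computing each category's share. The sketch below applies this to the figures quoted in this section; the function name and the row layout are assumptions, and the counts are typed in by hand instead of being read from the database. The per-type sample counts listed above add up to 11,803.

```python
# Counts quoted in Section 6.2; entered by hand for illustration.
SAMPLES_BY_TYPE = {
    "blood": 4683,
    "tumour tissue": 3772,
    "normal tissue": 2669,
    "serum": 353,
    "leukocyte": 324,
    "ascitic fluid": 2,
}
PATIENTS_BY_SEX = {"male": 1605, "female": 1479}


def report(counts):
    """Return rows (category, count, rounded percentage), largest count first."""
    total = sum(counts.values())
    rows = [(name, n, round(100 * n / total)) for name, n in counts.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, n, pct in report(PATIENTS_BY_SEX):
    print(f"{name:<8}{n:>6}{pct:>5}%")
```

Run on the patient counts, this reproduces the 52% and 48% shares given in the text.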

Fig. 3. Number of forms.

7. TOPOGRAPHY AND MORPHOLOGY CONVERSION
Initially, topographies and morphologies were entered into the system manually, without any validation. Although ICD-O had been adopted as the standard for topography and morphology values, it took a long time to build a database of ICD-O values containing the subset of codes used by the hospital. Sample registration had to remain open during this period, because the number of samples was very high and blocking registration until the ICD-O database was deployed was impracticable. After the ICD-O deployment, the topographies and morphologies stored until then had to be converted to ICD-O codes. For this purpose a simple integration application was developed, which correlates what was entered manually with the ICD-O codes.
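The paper does not say how the conversion tool correlates free text with codes. One plausible approach, shown here purely as an assumption, is to normalize the manually entered description and look it up in a table of ICD-O terms, falling back to approximate string matching from Python's standard difflib module. The table holds only a handful of ICD-O topography terms for illustration, and the function names and the similarity cutoff are hypothetical.

```python
import difflib
import unicodedata

# Small illustrative subset of ICD-O topography terms; a real table would
# hold the full subset of codes adopted by the hospital.
ICDO_TOPOGRAPHY = {
    "breast, nos": "C50.9",
    "penis, nos": "C60.9",
    "prostate gland": "C61.9",
    "kidney, nos": "C64.9",
    "bladder, nos": "C67.9",
}


def normalize(text):
    """Lower-case the text, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(unaccented.lower().split())


def to_icdo(free_text, cutoff=0.6):
    """Return the ICD-O code for a manually entered description, or None."""
    key = normalize(free_text)
    if key in ICDO_TOPOGRAPHY:
        return ICDO_TOPOGRAPHY[key]
    # Fall back to the single closest term above the similarity cutoff.
    close = difflib.get_close_matches(key, list(ICDO_TOPOGRAPHY), n=1, cutoff=cutoff)
    return ICDO_TOPOGRAPHY[close[0]] if close else None
```

In a clinical database, an approximate match would reasonably be queued for human review rather than written back automatically, and entries that return `None` would be listed for manual coding.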

8. DATABASE INTEGRATION
Integration with other tumour banks is vital for a more comprehensive study of cancer cases in the country. SCGBT will therefore make it possible to exchange information with other management systems using protocols that are simple and easily understood by the parties involved, since it will face a fairly heterogeneous environment comprising different programming languages and different data structures. In this scope, the Barretos Cancer Hospital Tumour Bank is being connected to the A.C. Camargo Hospital Central Tumour Bank, in São Paulo [13]. The basis of the integration system is YAML (YAML Ain't Markup Language) [14], a data structuring language that is similar in purpose to the widely known XML but easier for humans to read and understand. The data exchanged between the two systems will be formatted according to this language.
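To give an idea of what a YAML-formatted exchange record might look like, the sketch below serializes a nested dictionary to block-style YAML using only the Python standard library. The record layout and field names are invented for illustration and are not the format agreed between the two hospitals. Scalars are written with `json.dumps`, relying on the fact that YAML 1.2 accepts JSON scalars; keys are assumed to be plain identifiers that need no quoting.

```python
import json


def to_yaml(mapping, indent=0):
    """Serialize a nested dict of scalars to block-style YAML text.

    Assumes keys are simple identifiers and values are dicts, strings,
    numbers, booleans or None.
    """
    lines = []
    pad = "  " * indent
    for key, value in mapping.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 1).splitlines())
        else:
            # JSON scalars are valid YAML, so json.dumps gives safe quoting.
            lines.append(f"{pad}{key}: {json.dumps(value, ensure_ascii=False)}")
    return "\n".join(lines)


# Hypothetical record; field names and values are made up for this example.
record = {
    "sample": {"id": 1, "type": "tumour", "topography": "C61.9"},
    "patient": {"sex": "M", "age": 63},
}
print(to_yaml(record))
```

A production system would more likely use an established YAML library on both sides; the hand-written emitter only keeps the example free of third-party dependencies.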

Fig. 2. Number of tumour tissue samples per topography.

9. CONCLUSIONS
This paper discussed SCGBT, a clinical decision support system for cancer diseases that manages patients' clinical data, genetic profiles, disease characteristics and epidemiological data in order to support more precise diagnoses and more effective treatments, with a higher likelihood of cure for cancer-related conditions.

The use of human tissue in such studies is vital, since the last few decades have seen a decrease in the use of animal cell lines and models in the study of cancer. This trend has taken place concurrently with the development of molecular studies and with the conception of the neoplastic phenomenon as a heterotypical process in which both the neoplastic cell and the tissue environment in which it develops play a key role, since, in addition to the genetic factors associated with the tumour, individual-related factors interfere with the tumour and its response to treatment [15].

10. FUTURE WORK
Integration with other cancer databases is fundamental for a more comprehensive study of cancer cases in the country. Thus, the Barretos Cancer Hospital database will be integrated with the A.C. Camargo Cancer Hospital database.

Another task is to improve the tool developed to convert topography and morphology codes. At the moment the tool is very simple, and both its interface and its performance can be improved.

The Barretos Cancer Hospital cancer database is rich in data. Data warehouse and data mining techniques [16] can be applied to it to extract knowledge, supporting a better medical analysis of cancer-related diseases.

At present the system stores textual data. In future work, functionalities will be developed to handle images, extracting characteristics that can help in the diagnosis of cancer. Charts showing the distribution of the various cancer-related data by geographic region will also be included, as in the system of [7].

11. ACKNOWLEDGMENTS
We express our appreciation to FAPESP and PROPG (Pró-Reitoria de Pós-Graduação) for providing financial support for this work; to Fundação Pio XII, IBILCE - UNESP and Tamara Colaiacovo for their collaboration on this project; and to Dr. Geraldo Santiago Hidalgo, the pathologist responsible for starting this project.

12. REFERENCES
[1] Brazilian National Cancer Institute, www.inca.gov.br
[2] O'Connor, R.: The Pharmacology of Cancer Resistance. Anticancer Res, vol. 27, pp. 1267-1272. (2007)
[3] He, M., Rosen, J., Mangiameli, D., Libutti, S. K.: Cancer Development and Progression. Adv Exp Med Biol, vol. 593, pp. 117-133. (2007)
[4] The Barretos Cancer Hospital, www.hcancerbarretos.com.br/
[5] CANCERMondial, International Agency for Research on Cancer, www-dep.iarc.fr/
[6] Câncer no Brasil, Dados dos Registros de Base Populacional, INCA. www.inca.gov.br/
[7] Carr, D. B., Bell, S., Pickle, L., Zhang, Y., Li, Y.: The State Cancer Profiles Web Site and Extensions of Linked Micromap Plots and Conditioned Choropleth Map Plots. Proceedings of the 2003 Annual National Conference on Digital Government Research, Boston. (2003)
[8] Larman, C.: Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development. Bookman, 2nd ed.
[9] Subversion, subversion.tigris.org
[10] Project Management Institute: A Guide to the Project Management Body of Knowledge: PMBOK Guide. Project Management Institute, 3rd ed. (2004)
[11] Pan-American Health Organization, World Health Organization: ICD-O: International Classification of Diseases for Oncology. EDUSP, Portuguese Edition. (2005)
[12] Babeto, E., Pires, L. C., Valsechi, M. C., Ferrizzi, A. C., Valencio, C. R., Kerr, L. M., Faria, E. F., Seabra, D., Soares, F. A., Peitl, P. J., Rahal, P.: Differential Gene Expression Analysis in Penile Carcinoma. VIII São Paulo Research Conference: "Câncer 2007: da Biologia Molecular ao Tratamento". (2007)
[13] AC Camargo Cancer Hospital Website, www.hcanc.org.br
[14] YAML, www.yaml.org
[15] Marahatta, S. B.: Cancer: Determinants and Progression. Nepal Med Coll J, vol. 7, pp. 65-71. (2005)
[16] Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, San Francisco. (2006)
