DOT: A New Distributed Data Management System


International Conference on Information and Communication Technology ICICT 2007, 7-9 March 2007, Dhaka, Bangladesh

Aysha Akther, Kazi Shah Nawaz Ripon, and Kazi Masudul Alam
Computer Science and Engineering Department, Khulna University, Khulna, Bangladesh
E-mail: masudul_alam [email protected], [email protected], manee_cseO2gyahoo.com

ABSTRACT
In this paper we propose a new distributed data management system (DDMS) called DOT, which provides high-performance, heterogeneous-host, and high-data-mobility capabilities. The storage sites of DOT are divided into groups, where each group of client sites is under an administrator site. Intra-group and inter-group communications in DOT are maintained by peer-to-peer (P2P) connections. DOT follows a decentralized-centralized communication architecture. Each client site of a group manages its data autonomously. Each administrator has a rule-processing sub-system in which rules are defined in a high-level language. The unit of data in the DOT system is the fragment. The fragments and replicas of a file are not fixed to particular sites; rather, they can move within their own group. DOT ensures network bandwidth utilization, file location transparency, effective resource utilization, efficient data sharing, high failure tolerance, and the possibility of collaboration over a wide area network.

1. INTRODUCTION
Distributed data is information belonging to an organization that resides on portable media and non-local devices such as home computers, laptop computers, CD-ROMs, personal digital assistants, wireless communication internet devices, repositories, etc. [1]. A distributed data management system is a computing system in which data is received, processed, stored, and then distributed among multiple workstations. Data can be received and processed on the same machines that store and serve it [2].

The primary advantage of centralized DDMS is their simplicity. Because all data is concentrated in one place, centralized systems are easily managed and have no questions of data consistency or coherence. Decentralized DDMS have almost exactly the opposite characteristics. The far-flung nature of these networks means that the systems tend to be difficult to manage and that data in the system is never fully authoritative [3].

A system combining centralized [4, 5] and decentralized [6] architectures enjoys some of the advantages of both. Decentralization contributes to the extensibility, fault tolerance, and lawsuit-proofing of the system. The partial centralization makes the system more coherent than a purely decentralized system, as there are relatively fewer hosts holding authoritative data. Manageability is about as difficult as in a decentralized system (Table 1).

Table 1: Comparison of different DDMS topologies

Property         Centralized Topology   Decentralized Topology   Centralized Decentralized Topology
Manageable       yes                    no                       no
Coherent         yes                    no                       partially
Extensible       no                     yes                      yes
Fault-Tolerant   no                     yes                      yes
Secure           yes                    no                       no
Lawsuit-Proof    no                     yes                      yes
Scalable         ?                      maybe                    apparently

2. ARCHITECTURE OF THE PROPOSED SYSTEM
In this section we present the overall architecture of our proposed system. DOT divides the whole system into groups, where each group is administered by a peer called the group administrator (GA) (Fig. 1). Each GA is assumed to have a high system configuration. Communication among client peers of the same group, without going through the GA, is called intra-group communication. In inter-group communication, client peers of two different groups communicate using the remote GA as an intermediary. Because many groups are available in the system and no group interferes with the operation of another, many simultaneous communications are possible in the complete DDMS.
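The grouping described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Group` and `route` are hypothetical, and the only behaviour it encodes is what the text states, namely that peers of the same group talk directly while peers of different groups go through the remote group's GA.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """One DOT group: a group administrator (GA) and its client peers."""
    administrator: str
    clients: set = field(default_factory=set)


def route(groups, sender, receiver):
    """Return the peers a transfer passes through, sender first (hypothetical helper)."""
    # Map every client peer to the group it belongs to.
    home = {client: g for g in groups for client in g.clients}
    if home[sender] is home[receiver]:
        # Intra-group: direct P2P link, the GA is not used as a relay.
        return [sender, receiver]
    # Inter-group: the remote group's GA acts as the intermediary.
    return [sender, home[receiver].administrator, receiver]


groups = [Group("GA1", {"c1", "c2"}), Group("GA2", {"c3"})]
print(route(groups, "c1", "c2"))
print(route(groups, "c1", "c3"))
```

Because each group is modelled independently, transfers inside different groups do not share any state in this sketch, which mirrors the claim that many communications can proceed simultaneously.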

Fig. 1: Architecture of the proposed system.

2.1 DOT Communication
In this paper we describe our proposed system in terms of file distribution. When a file is distributed in the system, it is assigned a global, unique File Identifier (FID). Each file has fragment(s) and replica(s), which are mobile within the file's home group. Information about the files of a group is stored in the Group File Directory (GrFD) of the corresponding administrator site. Once the file information has been uploaded to the administrator site, the files are available to the whole system, and any client of the same group or of a remote group can access them.

The system has a Global Name Space (GNS), which stores each FID and the IP address of its administrator in its database. When an administrator site is not busy, it uploads its newly added file information to the GNS. When the data of a FID is requested by a client peer, the request is first sent to the GNS. If the FID is not available in the GNS, the request is sent to the client's GA along with the FID. The GA then broadcasts this request to all GAs available in the system, in the hope that the GrFD has been updated but the GNS has not. Only if all GAs fail to find the file information does DOT conclude that the file information is unavailable. If, however, the GNS possesses the FID, the corresponding GA is obtained. Next, the client site establishes a P2P connection with that administrator peer, i.e. the GA.

If the administrator site is the client's home administrator peer, the administrator looks up its GrFD for the peers that contain the file fragment(s) and informs the client site to establish P2P connection(s) with the site(s) holding the file; this is intra-group communication (Fig. 2). In this case the "query to data" approach, i.e. processing at the possessor, is used, because the "data to query" approach, i.e. processing at the requesting site, would place a very high load on the client peers, whose resources are assumed to be limited. When the administrator peer is not the client's home GA, the remote administrator peer looks up its GrFD and establishes P2P connections with its member site(s) that hold the file fragment(s). This inter-group communication (Fig. 3) uses the "data to query" approach, i.e. file fragments are brought into the GA cache for the file distribution service, based on knowledge-base decision making.

Fig. 2: Intra-group communication.

Using the P2P connections, data is transferred from the remote administrator peer or the home-group client peer(s) to the requesting client peer. A load management negotiation takes place between a client peer and the remote administrator peer using the Load Management Protocol (LMP), because a slow receiver can be inundated with information from a fast server.
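The lookup procedure of Section 2.1 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the DOT implementation: the GNS and the GrFDs are modelled as plain dictionaries, the broadcast to all GAs is modelled as a loop over the directories, and the function name `locate` and its return shape are hypothetical.

```python
def locate(fid, home_ga, gns, grfds):
    """Resolve a FID to (administrator, holder peers, approach), or None.

    gns   -- dict: FID -> administrator id (stand-in for the Global Name Space)
    grfds -- dict: administrator id -> {FID: [holder peers]} (one GrFD per GA)
    """
    ga = gns.get(fid)
    if ga is None:
        # GNS miss: the home GA asks every GA, in case a GrFD already
        # knows the file but has not yet uploaded it to the GNS.
        ga = next((a for a in grfds if fid in grfds[a]), None)
        if ga is None:
            return None  # no GA knows the file: reported as unavailable
    holders = grfds.get(ga, {}).get(fid)
    if not holders:
        return None
    # Home group: the client connects to the holders ("query to data").
    # Remote group: fragments are pulled into the GA cache ("data to query").
    approach = "query to data" if ga == home_ga else "data to query"
    return ga, list(holders), approach


gns = {"f1": "GA2"}
grfds = {"GA1": {"f2": ["p1"]}, "GA2": {"f1": ["p7", "p8"], "f3": ["p9"]}}
print(locate("f1", "GA1", gns, grfds))  # found through the GNS, remote group
print(locate("f2", "GA1", gns, grfds))  # GNS miss, found in the home GrFD
print(locate("zz", "GA1", gns, grfds))  # unknown everywhere
```

The sketch keeps the order of the text: the GNS is consulted first, the broadcast is only a fallback, and the choice between the two processing approaches depends solely on whether the resolved administrator is the client's home GA.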

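The text does not specify the messages of the Load Management Protocol, so the following is only a hedged illustration of the underlying idea that a fast server must not overrun a slow receiver. It assumes a simple receiver-limited rate agreement, a generic flow-control technique that is not claimed to be LMP itself; the function names and the units (kilobytes, kilobits per second) are hypothetical.

```python
def negotiate_rate(server_rate_kbps, client_capacity_kbps):
    """Agree on the highest rate both sides can sustain (assumed policy)."""
    if server_rate_kbps <= 0 or client_capacity_kbps <= 0:
        raise ValueError("rates must be positive")
    # The slower side bounds the transfer, so the receiver is never inundated.
    return min(server_rate_kbps, client_capacity_kbps)


def transfer_seconds(fragment_kb, server_rate_kbps, client_capacity_kbps):
    """Time to send one fragment at the negotiated rate."""
    rate = negotiate_rate(server_rate_kbps, client_capacity_kbps)
    return fragment_kb * 8 / rate  # kilobytes -> kilobits, then divide by kbps


print(negotiate_rate(1000, 250))
print(transfer_seconds(500, 1000, 250))
```

Under this assumption a remote administrator peer able to send at 1000 kbps would throttle to 250 kbps for a client that can only absorb that much.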
