ASIS MDL: A prototype electronic content service


Sin-Yuan Iap, Hung-Yu Kao, Chin-Fu Ku, Yau-Tsung Lee, Shian-Hua Lin, Yu-Chung Pan, Chi-Sheng Shih, Chia-Hui Wang, Yu-Chung Wang, Meng-Chang Chen, Jan-Ming Ho, and Ming-Ta Ko
Institute of Information Science, Academia Sinica, Taipei, Taiwan
{iap, bobby, chinfu, ytsung, shlin, peter, csshih, chwang, wycc, mcc, hoho, mkto}@iis.sinica.edu.tw

Abstract

In this paper, we present the design and implementation of ASIS MDL, a prototype electronic content service for an academic library environment. Besides bibliographic information, the service includes an archive of digital audio and video streams, CD titles, and WWW contents. The system not only supports end-user operations such as searching, streaming, and browsing, but also facilitates admission control, content production, distribution processes, and license control. We address several design issues, e.g., the need for accounting support, for integrating third-party system components, and for a computer-aided tool to automatically categorize WWW documents. Our solutions to these problems are presented.

1. Introduction

Conventional libraries provide the services of collecting, cataloging, finding, and disseminating information in the form of paper and other materials. Due to the rapid growth of digital media and network technology, the development of digital libraries has become a research focus [1][2][3]. A digital library provides conventional and extended library services in electronic format. ASIS MDL provides software tools that help library users search for desired information, browse the entire classification hierarchy, e.g., the ACM Computing Classification System, and view each electronic content. It also provides tools for librarians to interact with remote online users and to aid in their daily maintenance, e.g., adding new contents to the digital library system and analyzing the statistics of content usage.

In this paper, we present our design of a multimedia digital library, ASIS MDL, as a prototype electronic content service. It consists of bibliographic data as well as an archive of digital audio and video streams, CD titles, and WWW contents. It supports end users in searching for information, browsing the entire classification hierarchy (the ACM Computing Classification System in our current implementation), and viewing electronic contents in the digital library. It also facilitates admission control, content production, distribution processes, and license control. Our design emphasizes several important issues, e.g., the need for accounting support, for integrating third-party components, and for advanced computer-aided tools such as automatic categorization of WWW documents.

In the following, we start with a discussion of system requirements in section 2. The architecture of ASIS MDL is presented in section 3. In section 4, our design and implementation experiences are briefly summarized. Concluding remarks are given in section 5.

2. System Requirements

In designing ASIS MDL, we emphasize providing electronic multimedia content service over a broadband network. In the following, we present design requirements for three major system components of ASIS MDL: the multimedia server, media preprocessing and management, and the integration of multimedia retrieval and playback tools for end users.

2.1. Type of Digital Media Contents and Integrated Library Services Interface

Besides bibliographic data, four types of digital content are currently stored in ASIS MDL: audio, video, WWW, and CD titles. To help users locate desired information, ASIS MDL supports two basic operations: browsing in bibliographic order and searching by keyword. When a user locates an item and chooses to play back a multimedia title, the system has to invoke the corresponding interactive playback tool. We expect the system to provide an integrated interface such that invocation of the playback tool is transparent to the user. We also expect the system to be scalable, in that a new title or a new media type can be added without much effort.

2.2. Storage and Transport Control of Real-time Continuous Media

Large storage space is usually required to store multimedia data. For example, a 100-minute movie digitized and compressed in MPEG-1 format requires 1.1 Gbytes of disk space. An archive containing 1,000 such MPEG-1 video streams thus takes 1.1 Tbytes of storage space. Besides the space requirement, a multimedia content server must guarantee quality of service (QoS). Typical indices of quality of service include system delay, delay jitter, and probability of packet loss. To fulfill these requirements, resource management and scheduling for real-time data retrieval and transport are critical design considerations.

In a content server, the stored media is usually variable bit rate (VBR) in nature. Furthermore, the magnetic surface of many modern hard disks is partitioned into several disk zones, each with a different data transfer rate and size. Deciding where to store the media data on the hard disk and how to schedule the retrieval of each block of data is not a trivial problem [4][5].

Besides data layout and I/O scheduling, network transport control, e.g., packet scheduling [6] and error control, is also important in optimizing service quality subject to the buffer size available at the client side. The problem is further complicated because the amount of network bandwidth dedicated to a particular multimedia session is unpredictable, since current IP-based network technology is based on a best-effort service model. Research on providing QoS [7] has been very active recently. Several working groups of the IETF, including Resource Reservation Setup Protocol (RSVP) [8], Integrated Services [9], Integrated Services over Specific Link Layers [10], and QoS Routing [11], are developing standards toward this aim. The idea is to have multimedia sessions request admission control and resource reservation before transporting data packets. Each intermediate node in the network then performs policing and packet scheduling functions for each packet passing through it.

Until QoS guarantees become common practice, we can only rely on careful planning of network topology, selection of network devices, and a conservative allocation and control of the bandwidth allocated to each multimedia session. It is also necessary to constantly monitor the traffic load of each node and each link of the network in order to adjust system admission control parameters dynamically.
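The storage figures quoted in this subsection can be checked with a short calculation. The sketch below assumes a constant MPEG-1 rate of about 1.5 Mbit/s, a commonly quoted figure that the text itself does not state, and decimal units (1 Gbyte = 10^9 bytes); the function name is illustrative.

```python
def stream_size_bytes(minutes: float, bitrate_bps: float) -> float:
    """Size in bytes of a stream of the given length at a constant bit rate."""
    return minutes * 60 * bitrate_bps / 8


# Assumption: ~1.5 Mbit/s is a typical MPEG-1 rate, not a value taken from the paper.
MPEG1_BPS = 1.5e6

movie_bytes = stream_size_bytes(100, MPEG1_BPS)   # one 100-minute title
archive_bytes = 1000 * movie_bytes                # an archive of 1,000 such titles

print(f"one title: {movie_bytes / 1e9:.3f} Gbytes")
print(f"archive:   {archive_bytes / 1e12:.3f} Tbytes")
```

Under these assumptions the result is consistent with the roughly 1.1 Gbytes per title and 1.1 Tbytes per 1,000 titles given in the text.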

2.3. Back-end Editing and Preprocessing of Electronic Contents

When a new multimedia title, e.g., a music or video tape, laser disk, or video disk, is inserted into the digital library, digitization and compression are only the basic operations. In addition, parameters that need to be computed only once and are used repeatedly, e.g., the disk layout plan and the I/O and packet transport schedules, are generated off-line and stored to minimize online system overhead. We also encapsulate each multimedia stream in RTP payload format, which is important for the system to provide appropriate error control capability.

To support VCR operations, e.g., fast-forward and rewind, the scheme presented in [12] uses extra storage to eliminate the need to change the allocated network bandwidth dynamically. This extra information is also generated at the time the new title is inserted. Furthermore, the index for keyword searching is generated and inserted at the same time. Annotation information, including licensing and descriptions of the contents, is also inserted into the system at this time for future use. Content distribution through the data network and remote content management are also important problems currently under study.

2.4. Standards and Scalability

Legacy library data of the Institute of Information Science, Academia Sinica, is stored in an information system in standard MARC format. However, the current MARC standard does not fully describe the characteristics of multimedia contents, so a local extension to the MARC standard is necessary. It is also preferable for the multimedia subsystems to be compatible with existing standard communication protocols, e.g., DAVIC for video on demand and H.323 for Internet video conferencing.

Currently, the service area of ASIS MDL is within our institute. In the near future, Academia Sinica, the Ministry of Education, and the National Science Council, Taiwan, are upgrading network bandwidth and devices as part of a national project on a broadband research network. For example, layer 3 switches will be installed on the Academia Sinica campus. The external links will also be upgraded to 45 Mbps T3 lines. A T3 link dedicated to research purposes will also be used to bridge Taiwan and the US. As technology evolves, we expect this network to support QoS services. The area in which ASIS MDL can deliver broadband service will grow as these projects make progress, and the design of ASIS MDL must anticipate this growth.

2.5. System Management

ASIS MDL also provides several management functions, e.g., statistics logging and report generation, online monitoring of system usage, access control, and admission control. The access control function prevents users from abusing critical system data and functions, e.g., updating the database or monitoring user behavior. It is also used to enforce legal constraints, e.g., to restrict the maximum number of users who may simultaneously access a specific digital media title. Local network configuration information, e.g., topology and network capacity, is also stored in the system so that admission control decisions can be made, based on measured network load, to prevent system components from being overloaded. This is a primitive, preliminary form of guarantee on quality of service.

3. System Architecture

In this section, we present the functionality and architecture of ASIS MDL.

3.1. System Functions

To provide a friendly user interface and a channel for software distribution, ASIS MDL provides the whole set of user operations, both for library users and for librarians, through a WWW interface.

Keyword Search: This function allows users to search for contents containing given keywords in a specific attribute of objects, e.g., title, author, subject, or abstract. Full-text search is also provided as a supplement, though it is much slower than keyword search in the current implementation.

Browsing: This function allows users to locate a desired document by browsing the entire system database arranged in a pre-defined classification scheme. Each type of media is classified separately, e.g., the ACM Computing Classification System is used for bibliographic information.

Video Hot Line: Users of ASIS MDL can use a WWW-based software package developed at our laboratory to connect directly to the physical library and interact with a librarian for online help. The current implementation allows a librarian to serve multiple users simultaneously.

Usage Log and Report: Since HTTP is stateless, we need a special means to keep track of multimedia sessions. In addition to the usage log recorded by the web server, ASIS MDL records detailed usage information on the multimedia contents, e.g., the IP address of the user and the start, end, and duration of each session. This information is important base data for content and system management and for planning purposes. ASIS MDL also provides an interface for librarians or system managers to monitor the status of the multimedia servers online.

Back-end Content Management: ASIS MDL also provides functions for librarians to insert and delete multimedia contents. Librarians and authorized users can also modify the contents and their associated attributes. The input formats include interactive CD titles, music tapes or CDs, videotapes, laser discs, video CDs, and live TV programs. These media are first digitized and compressed if necessary. The digital media is then stored in a designated media server, and its attributes are stored in the database.

3.2. Architecture

ASIS MDL consists of four subsystems: the management subsystem, the media servers, the back-end content management subsystem, and the multimedia terminals (see figure 2). The management subsystem contains the application server and the librarian terminals. The application server is the kernel of the multimedia digital library. It stores not only the bibliographic data of library collections but also all kinds of control information that support control and management functions, e.g., object encapsulation, accounting, access control, access logging, and admission control. It guides a client in setting up its connection to an appropriate multimedia server, monitors the behavior of each active client, and can stop a user from abusing the system by issuing special commands.

In ASIS MDL, we have three types of multimedia servers: a video server, a CD server, and a video conference server. The video server currently provides real-time MPEG-1 streaming services, i.e., users at remote terminals can play back MPEG-1 streams pre-stored at the video server. VCR control commands are also supported. The CD server stores each interactive multimedia CD title on hard disk to allow several users to access the same title simultaneously. Through HTTP, a user selects and plays a remote CD title without having to borrow the physical CD-ROM from the library and install it on a local hard disk. All multimedia data services other than streaming video, e.g., music streams, slides, and still images, run on the same host machine as the CD server. Note that the media-specific players are invoked automatically by our system and are thus transparent to the users. We use a WWW browser interface and an object-oriented design approach to achieve this goal. The third server is the conference server of our video conference system. It is responsible for accepting connection requests from conference clients and directing them to the librarian's terminal. Design details of this video conference system can be found in [13].

[Figure 1. Usage Log]

[Figure 2. System Architecture. Labels recoverable from the extraction: User, Multimedia Terminal, Librarian Terminal, Management System, Application Server, INNOPAC, Multimedia Server, VOD Server, CD Server, VCON Server, Back-End System, Back-end Production (slide, music CD, videotape, VCD, CD-title, MPEG file), Delivery Network, 100-10 Switch Network, 100BaseT Switch Back-Bone Network.]

[Figure 3. The Interface of System Database. Labels recoverable from the extraction: Multimedia Database, VOD, videotape, MPEG encoder, VCD, music CD, tape, MARC, User.]

The system network is based on Ethernet switches to reduce the amount of network contention due to the CSMA/CD protocol. The management and multimedia subsystems are connected to one of the layer-2 switches via 100 Mbps Fast Ethernet links. The remote terminals can connect to the backbone directly, to a switch, or to a 10 Mbps hub. Note that the maximum utilization level of CSMA/CD Ethernet is usually no greater than 80%. Thus, our admission control strategy limits the number of users simultaneously watching MPEG-1 video streams from our video server to no more than 4.
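The admission rule described in this subsection, a cap on concurrent video sessions derived from the usable capacity of a shared link, could be sketched as follows. The 10 Mbit/s link, the 1.5 Mbit/s stream rate, and all names are illustrative assumptions; the limit of 4 used in ASIS MDL is the authors' own setting and is more conservative than the bound this simple formula gives.

```python
import math


def max_sessions(link_bps: float, stream_bps: float, max_utilization: float = 0.8) -> int:
    """Largest number of equal-rate streams that fit under the utilization cap."""
    # The small epsilon guards against floating-point error just below an integer.
    return math.floor(link_bps * max_utilization / stream_bps + 1e-9)


class AdmissionController:
    """Admits a session only while fewer than `limit` sessions are active."""

    def __init__(self, limit: int):
        self.limit = limit
        self.active = set()

    def admit(self, session_id: str) -> bool:
        if session_id in self.active:
            return True                      # already admitted
        if len(self.active) >= self.limit:
            return False                     # would exceed the configured cap
        self.active.add(session_id)
        return True

    def release(self, session_id: str) -> None:
        self.active.discard(session_id)


# Hypothetical numbers: a 10 Mbit/s shared segment and 1.5 Mbit/s streams.
upper_bound = max_sessions(10e6, 1.5e6)
controller = AdmissionController(limit=min(4, upper_bound))
print(upper_bound, [controller.admit(f"user{i}") for i in range(5)])
```

A fixed limit is the simplest policy; the measured network load mentioned in section 2.5 could replace the static `limit` in a more dynamic variant.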

3.3. System Control Flow

The control flow of ASIS MDL is briefly described in this subsection (see figure 4). A multimedia content service requires some operating-system-dependent operations, e.g., capturing the start and end events of an application program, turning off the power of a public PC, or killing a user process. Unfortunately, this is not possible in a pure WWW-based design due to the stateless property of HTTP. To achieve these design goals, an application agent is launched with each invocation of an external application by the WWW browser. The application agent acts as the front end of the application server, i.e., the control kernel of ASIS MDL. It translates ASIS MDL system commands into local operating system commands. It also monitors the behavior of the MDL applications, reports it to the application server, and manages the applications as directed by the application server.

[Figure 4. System Flow. Labels recoverable from the extraction: Client, WWW, INNOPAC, Application, Admission, System, Power, VOD, VOD client, CD-title, CD Server, VCON, MP3 Player, Multimedia, Back-end.]

[Figure 5. Back-end processing flow. Labels recoverable from the extraction: DAT to MPG, MPEG data, MPEG data (FF), Fast-forward generator, Image writer, Query Interface, INNOPAC, WWW, CDDA, WAV, MP3 encoder, MP3 data, Video, Video FF, Database.]

In back-end processing (see also figure 5), the first step is the digitization of multimedia data. For audio-visual data, such as videotapes or VCDs, we use a hardware MPEG encoder and a format converter to digitize and compress the data into MPEG streams. We then encode a separate fast-forward MPEG stream for each MPEG source stream. This extra encoding time and storage space allows the VOD server to pump data at the same network bandwidth in both the normal-speed and fast-forward states. For audio data, such as music CDs, we first convert the data to WAV format and then encode it in MPEG layer-3 format with an MP3 encoder. In addition to digitization and compression, the back-end content manager also stores the related metadata, including bibliographic information, in the database on the application server.

The INNOPAC library information system allows users to search for bibliographic information in pure text format. A hyperlink is provided in ASIS MDL for users to access the INNOPAC service. The bibliographic content in INNOPAC is mirrored in the database of ASIS MDL. For a multimedia title, its bibliographic descriptions are linked to its enhanced multimedia-specific descriptions. Thus, users can retrieve multimedia contents by searching the bibliographic database.
The bibliographic information is also organized hierarchically according to a standard classification system, i.e., the ACM Computing Classification System in the current implementation. This indexing structure gives users an additional mechanism for finding information, which is useful when searching for information on unfamiliar subjects.

Once the bibliographic data associated with a multimedia title is displayed, the user can browse or play it by clicking the highlighted hyperlink anchor in the web page. At the click, the application server invokes an application agent at the client machine to service the request. The agent then sends a query to the admission control agent on the application server for media information and the access rights of the user. If the request is legal and admitted, the application server directs the application agent to compose a command for the WWW browser to invoke an appropriate application program. The application then reads further information from the application agent and connects to the appropriate media server for the requested actions and contents (see also figure 6).

[Figure 6. A Snapshot of VOD]

Librarians and the system manager can access system status and statistics from the system monitor. The monitor gets media usage information from the system database. If the manager wants to stop a connection, the system monitor communicates with the application agent at the user site and forces it to give up the current service session. Users who have trouble using the library or who need expert advice can call and interact with librarians directly through our Video Hot Line (see figure 7).

3.4. Management

ASIS MDL also provides several management functions, e.g., statistics logging and report generation, online monitoring of system usage, access control, and admission control. The access control function prevents users from abusing critical system data and functions, e.g., updating the database or monitoring user behavior. Local network configuration information, e.g., topology and network capacity, is also stored in the system so that preliminary admission control decisions can be made based on measured network load.

4. Implementation

System flexibility is a major consideration in designing ASIS MDL. For example, new types of digital content service may become available in the future, users may prefer to use software provided by third-party vendors, and there are multiple popular operating system environments, e.g., Windows 95 and Windows NT. Because of these requirements, the architecture of ASIS MDL is designed as a general framework that allows the integration of various types of contents, services, and software provided by different vendors.

Our design abstracts each media title as an object. A service type is modeled as an object class, and each client program for manipulating an object is modeled as a method. These descriptions are stored in the database. Furthermore, an application agent is plugged into each client PC to interpret each description and translate it into appropriate local system actions. Thus, when a user requests access to a media object of ASIS MDL, the application server transmits the object descriptions to the application agent. The application agent then coordinates with the local operating system to execute the appropriate client program and guide it to connect to the corresponding media server. The client program then processes online media transported from that media server. The application agent is also responsible for monitoring and controlling the behavior of the client program. In the following, we present our ideas in designing the database and the application agent.
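The object abstraction described above can be illustrated with a small sketch. The paper states only that titles are objects, service types are classes, and client programs are methods whose descriptions are stored in the database; the class names, the command templates, the player name `vodclient`, and the host name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServiceClass:
    """A service type: maps method names to client-program command templates."""
    name: str
    methods: dict  # e.g. {"play": "vodclient {server} {media_id}"}


@dataclass
class MediaObject:
    """A media title: an instance of a service class, stored on one media server."""
    media_id: str
    service: ServiceClass
    server: str


def build_command(obj: MediaObject, method: str) -> list:
    """Roughly what an application agent might do: turn an object description
    into a local command line for the matching client program."""
    template = obj.service.methods.get(method)
    if template is None:
        raise ValueError(f"{obj.service.name} has no method {method!r}")
    return template.format(server=obj.server, media_id=obj.media_id).split()


# Hypothetical description, as it might be transmitted by the application server.
vod = ServiceClass("VOD", {"play": "vodclient {server} {media_id}"})
title = MediaObject("v0001", vod, "vod.example")
print(build_command(title, "play"))
```

Keeping the templates in data rather than in code is what lets a new service type or a third-party player be added without changing the agent itself.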

4.1. Database Schema

Figure 7. A Snapshot of the Video Hot Line.

The database of ASIS MDL is an extension of USMARC [14], which describes media such as books, films, and tapes. Since most media collected in libraries are books, we use "book" to represent any medium. USMARC defines hundreds of fields to represent a book. However, for a given book, only a few fields, perhaps ten or twenty, are frequently used. A system implementing USMARC, for example INNOPAC, stores the information of a book as a concatenated string of tuples, each containing a field identifier followed by its value. It is impractical to search for information in such a flat file structure.

To organize information in a better structure, our system uses the relational data model to represent the USMARC format. To reduce redundancy in the database, we group the fields into 12 attributes that form the table defining "Book", which contains the most general information defined in USMARC. For each record in "Book", there are several items representing individual physical objects located in libraries. Thus, the table defining "Item" holds information on physical books, such as their call numbers and locations in libraries. Consequently, many objects share the same information in "Book" and keep their specific information in "Item". Compared with conventional library automation systems, a digital library system stores extra information describing digital media. This media-specific information is contained in the table "Meta". These three tables are the essential part of the system database.

To support searching, the information stored in "Book", "Item", and "Meta" is indexed by keywords. The table "Keyword" stores each keyword or term extracted from each record in these tables. Tables "BoK", "IoK", and "MoK" keep the inverted index information, a popular approach to speeding up keyword search in information system designs. For statistics logging, the table "AccessLog" records information on each access to physical media. Each entry in the table consists of the starting time, ending time, client IP, user ID, and so on.
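As a rough illustration of this schema, the sketch below builds a cut-down version in an in-memory SQLite database and runs a keyword lookup through the inverted index. The table names come from the text; every column name is an assumption, "Book" is reduced to two of its 12 attributes, and "IoK" and "MoK" are omitted because they would mirror "BoK".

```python
import sqlite3

SCHEMA = """
CREATE TABLE Book (book_id INTEGER PRIMARY KEY, title TEXT, author TEXT);
CREATE TABLE Item (item_id INTEGER PRIMARY KEY,
                   book_id INTEGER REFERENCES Book(book_id),
                   call_number TEXT, location TEXT);
CREATE TABLE Meta (meta_id INTEGER PRIMARY KEY,
                   book_id INTEGER REFERENCES Book(book_id),
                   media_type TEXT, server TEXT);
CREATE TABLE Keyword (keyword_id INTEGER PRIMARY KEY, term TEXT UNIQUE);
CREATE TABLE BoK (keyword_id INTEGER REFERENCES Keyword(keyword_id),
                  book_id INTEGER REFERENCES Book(book_id),
                  PRIMARY KEY (keyword_id, book_id));
CREATE TABLE AccessLog (log_id INTEGER PRIMARY KEY, item_id INTEGER,
                        start_time TEXT, end_time TEXT,
                        client_ip TEXT, user_id TEXT);
"""


def add_book(conn, title, author):
    """Insert a Book row and index each word of its title in Keyword/BoK."""
    book_id = conn.execute(
        "INSERT INTO Book (title, author) VALUES (?, ?)", (title, author)
    ).lastrowid
    for term in set(title.lower().split()):
        conn.execute("INSERT OR IGNORE INTO Keyword (term) VALUES (?)", (term,))
        keyword_id = conn.execute(
            "SELECT keyword_id FROM Keyword WHERE term = ?", (term,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO BoK (keyword_id, book_id) VALUES (?, ?)", (keyword_id, book_id)
        )
    return book_id


def search_books(conn, term):
    """Look a term up in the inverted index and return the matching titles."""
    rows = conn.execute(
        "SELECT b.title FROM Keyword k "
        "JOIN BoK x ON x.keyword_id = k.keyword_id "
        "JOIN Book b ON b.book_id = x.book_id "
        "WHERE k.term = ? ORDER BY b.book_id",
        (term.lower(),),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
add_book(conn, "Digital Video Servers", "A. Author")   # made-up sample records
add_book(conn, "Digital Libraries", "B. Author")
print(search_books(conn, "digital"))
```

The lookup touches only the small "Keyword" and "BoK" tables before joining to "Book", which is the benefit of an inverted index over scanning a flat record string.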

4.2. Application Agent

The application agent (also called the wrapper) hides the type of a media object so that it is transparent to the client. With this feature, a user does not have to configure the WWW browser with the appropriate application for each MIME type. Other key features of the agent include admission control, dynamic mounting of remote media objects, launching setup and application routines, reporting client status, and receiving and interpreting system messages.

Admission control: As the multimedia objects stored in our media servers have their own copyright licenses, we need a mechanism to control the number of users simultaneously viewing or playing an object. Before mounting or connecting to the corresponding media server, the application agent sends a UDP packet containing the target media ID to the application server to request authorization. If the agent fails to acquire authorization for the desired object, i.e., it receives a NAK message from the application server, it aborts the session. The server also sends a short message informing the user of the reason for denying the retrieval request.

Dynamically mounting a media object: Our CD-Title and Music-CD services rely on the Microsoft Windows network file system to map a remote object to a specific mount point. To use the drive letters efficiently, the application agent mounts a remote media object dynamically. Another reason for dynamic mounting is security. We use a secret account and password that are known only to the application agent, so the user cannot use the network resource without authorization.

Application launching: The agent parses the object-specific information transmitted from the application server, checks whether the playback program for the desired content exists, and launches the setup program if it does not. The agent can also kill the playback process if requested by the remote application server. Management of the client machines is thus made easier.

Client status report: The agent stays alive during the whole session in which the user is playing a media object. It keeps monitoring the status of the playback process and sends a UDP packet back to the application server promptly after the user terminates the playback session. This information is then stored in the AccessLog table of the database for generating usage statistics reports. The agent also sends a periodic heartbeat signal to the application server to indicate whether the application program is still alive. A similar mechanism is used to detect and report the network load level of the client, which is the baseline data for making admission control decisions.

Message relay: The agent receives messages from the application server during its lifetime. At a librarian terminal, a librarian can send unicast, multicast, or broadcast messages to a designated group of library users, e.g., to make a system shutdown announcement.
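The authorization and status-report exchange described above might look roughly like the following server-side sketch. The paper specifies only UDP transport, a media ID in the request, a NAK reply on refusal, an end-of-session report, and a periodic heartbeat; the text wire format (`REQ`, `BEAT`, `END`, `ACK`, `NAK`), the class name, and the 30-second timeout are all assumptions. The handler works on datagram payloads as bytes, so it can be exercised without opening a socket.

```python
import time


class LicenseServer:
    """Tracks active sessions per media title and enforces concurrent-user limits."""

    def __init__(self, limits, heartbeat_timeout=30.0):
        self.limits = limits              # media_id -> maximum concurrent users
        self.timeout = heartbeat_timeout  # seconds of silence before a session expires
        self.sessions = {}                # (media_id, user_id) -> last time seen

    def _expire(self, now):
        """Drop sessions whose agent has stopped sending heartbeats."""
        stale = [key for key, seen in self.sessions.items() if now - seen > self.timeout]
        for key in stale:
            del self.sessions[key]

    def handle(self, datagram: bytes, now=None) -> bytes:
        """Process one request payload and return the reply payload."""
        now = time.monotonic() if now is None else now
        self._expire(now)
        parts = datagram.decode("ascii", errors="replace").split()
        if len(parts) != 3 or parts[0] not in ("REQ", "BEAT", "END"):
            return b"NAK malformed request"
        kind, media_id, user_id = parts
        key = (media_id, user_id)
        if kind == "REQ":
            if media_id not in self.limits:
                return b"NAK unknown media"
            in_use = sum(1 for m, _ in self.sessions if m == media_id)
            if key not in self.sessions and in_use >= self.limits[media_id]:
                return b"NAK license limit reached"   # reason is relayed to the user
            self.sessions[key] = now
            return b"ACK"
        if kind == "BEAT":
            if key not in self.sessions:
                return b"NAK no such session"
            self.sessions[key] = now
            return b"ACK"
        self.sessions.pop(key, None)                  # END: playback terminated
        return b"ACK"


server = LicenseServer({"cd42": 1})                   # hypothetical single-user title
print(server.handle(b"REQ cd42 alice", now=0.0))
print(server.handle(b"REQ cd42 bob", now=1.0))
```

Expiring silent sessions is one way to keep a crashed client from holding a license slot indefinitely; how ASIS MDL handles that case is not stated in the text.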

5. Conclusion

ASIS MDL has been providing service since November 24, 1997. In the first phase, apart from the PCs in our laboratory, there were only two service PCs in the library for field testing. Since May 27, 1998, the service has been provided to the whole institute, i.e., to any PC on our LAN. The service can be accessed at http://apserver.iis.sinica.edu.tw. Apart from the removal of some known software bugs, the major differences between these two versions are enhanced back-end digital content production and management support and the replacement of the hardware MPEG-1 decoder with a software decoder.

The Computing Center of Academia Sinica is constructing a new broadband network environment, including a backbone upgrade and a high-speed external link to the TANET backbone. TANET itself will be upgraded to T3 by the end of the year. Part of this bandwidth will be dedicated to national communication research on wireless networking and broadband Internet. This network will also connect to the vBNS network. In addition, ADSL and cable modem technologies will be used to provide broadband network links from the researchers' community to the campus of Academia Sinica. High-performance network devices, e.g., layer 3 switch routers, will also be used so that network devices are not the bottleneck for high-speed real-time data transfer. We are also studying QoS support for the Internet. These technologies are generally believed to be crucial for multimedia content services.

As the digital counterpart of a physical library, ASIS MDL speeds up information transfer and provides an integrated and friendly interface for accessing multimedia information in the library. In the future, besides QoS support, we are working with other research groups on fast digital input technology, e.g., Chinese OCR technology, information organization technology, and content-based retrieval technology.

6. References

[1] Henry M. Gladney, Edward A. Fox, Zahid Ahmed, Ron Ashany, Nicholas J. Belkin, and Maria Zemankova, "Digital Library: Gross Structure and Requirement", Digital Library '94.
[2] Virginia E. Ogle and Robert Wilensky, "Testbed Development for the Berkeley Digital Library Project", D-Lib Magazine, July 1996, ISSN 1082-9873.
[3] Andreas Paepcke, Steve B. Cousins, Hector Garcia-Molina, Scott W. Hassan, Steven P. Ketchpel, Martin Röscheisen, and Terry Winograd, "Towards Interoperability in Digital Libraries: Overview and Selected Highlights of the Stanford Digital Library Project", IEEE Computer, May 1996.
[4] Y.C. Wang, S.L. Tsao, R.Y. Chang, M.C. Chen, J.M. Ho, and M.T. Ko, "A fast data placement scheme for video server with zoned-disks", Conference on Multimedia Storage and Archiving Systems II, part of SPIE's Voice, Video, and Data Communications, Nov. 1997.
[5] Yu-Chung Wang, Shiao-Li Tsao, Meng Chang Chen, Jan-Ming Ho, and Ming-Tat Ko, "File Layout Design of VBR Video on Zoned-Disks", Second International Workshop on Real-Time Database, Burlington, VT, September 1997.
[6] Ray-I Chang, Meng Chang Chen, Jan-Ming Ho, and Ming-Tat Ko, "Designing the ON-OFF CBR Transmission Schedule for Jitter-Free VBR Media Playback in Real-Time Networks", 1997 International Workshop on Real-Time Computing Systems and Applications, RTCSA'97, Oct. 27-29, 1997.
[7] Anujan Varma, "Hardware Implementation of Fair Queueing Algorithms for Asynchronous Transfer Mode Networks", IEEE Communications Magazine, Dec. 1997.
[8] "Resource Reservation Setup Protocol (rsvp) Working Group", Transport Area, IETF, http://www.ietf.org/html.charters/rsvp-charter.html.
[9] "Integrated Services (intserv) Working Group", Transport Area, IETF, http://www.ietf.org/html.charters/intserv-charter.html.
[10] "Integrated Services over Specific Link Layers (issll) Working Group", Transport Area, IETF, http://www.ietf.org/html.charters/issll-charter.html.
[11] http://www.ietf.org/html.charters/qosr-charter.html
[12] Der-Jen Lu, Yu-Chung Wang, Jan-Ming Ho, Ming-Tat Ko, and Meng-Chang Chen, "Experience in designing a TCP/IP based VOD system over a dedicated network", IEEE International Symposium on Consumer Electronics, December 2-4, 1997, Singapore.
[13] Yu-Chun Pan, Chin-Fu Ku, Shie-Yuan Wang, Chia-Hsiang Chang, Meng Chang Chen, Jan-Ming Ho, and Chiu-Feng Wang, "A Light-Weight Protocol for Video Conferencing", 1995 Workshop on Distributed System Technologies and Applications, 1995, Taiwan, Taiwan.
[14] Format Integration and Its Effect on the USMARC Bibliographic Format, 1995 ed.
