Object-Oriented Video Database Management System

Kingdom of Saudi Arabia
King Abdul Aziz University
Faculty of Arts and Humanities
Department of Information Science




Object-Oriented Video Database Management System



Researcher:
Jamila Ali Alzahrani
Master of Information Management
[email protected]










2012


Table of Contents
Introduction
II. Object-Oriented Database System (OODBS)
II-1- Definition
II-2- Main objective
II-3- Characteristics of Object-Oriented Databases
II-4- Advantages and Disadvantages of OODBMS
II-4-A- Advantages and benefits of OODBMS
II-4-B- Disadvantages or limitations of OODBMS
III. Video Data
III-1- Video data characteristics
III-2- Video frames
III-3- Video data groups
IV. Object-Oriented Video Database Management System
IV-1- Definition
IV-2- Video database applications
IV-3- Important issues an object-oriented video database management system (OOVDBMS) needs to address
IV-3-1- Video data modeling
IV-3-2- Video data insertion
IV-3-3- Video data indexing
IV-3-3-1- Annotation-Based Indexing
IV-3-3-2- Feature-Based Indexing
IV-3-3-3- Domain-Specific Indexing
IV-4- Video data query and retrieval
IV-4-1- Query Processing
IV-4-2- Query Types
IV-4-3- Query Certainty
V. The MPEG-7 Standard, the "Multimedia Content Description Interface" / Video Metadata Standard
V-1- Video metadata requirements
V-2- What MPEG-7 was designed to standardize
V-3- MPEG-7 architecture requirements
V-4- MPEG-7 tools
V-5- MPEG-7 applications
V-6- MPEG-7 Multimedia Description Schemes
V-7- MPEG-7 Description Definition Language
VI. MPEG-7-Compatible Representation of Video
VI-1- Video decomposition
VI-1-1- Temporal decomposition of video into shots
VI-1-2- Temporal decomposition of shots
VI-1-3- Compression techniques
VII. Making video as searchable as text
Conclusion
References

Object-Oriented Video Database Management System
Introduction
An object-oriented video database management system stores, retrieves and manages video data (Oomoto & Tanaka, 1993). Relational database systems have been used successfully for text and image management because of their basic advantages, such as a clear concept and a formal basis. However, the entity-relationship model is not sufficient for managing video information: video data involves complex objects that need a composition hierarchy and object-specific operations. Object-oriented databases are therefore considered a possible alternative because of their power in behavioral modeling and behavior inheritance, their support for complex objects, type hierarchies, behavior encapsulation, and so on. (Huang, Mong Lee, Li, & Xiong)
II. Object-Oriented Database System (OODBS)
II-1- Definition: An object-oriented database system (OODBS) is a database management system that supports the modeling and creation of data as objects. This includes some kind of support for classes of objects and the inheritance of class properties and methods by subclasses and their objects. (Kumar, 2005).

II-2- Main objective: to provide consistent, data-independent, secure, controlled and extensible data management services that support the object-oriented model. Object-oriented databases were created to handle the big and complex data that relational databases could not. (Chaterjee, 2005)

II-3- Characteristics of Object-Oriented Databases
Object-oriented database technology is a marriage of object-oriented programming and database technologies. (Chaterjee, 2005). It combines object-oriented programming with database technology to provide an integrated application development system. There are many advantages to including the definition of operations with the definition of data (Object database, 2012). Figure 1 illustrates how these programming and database concepts have come together to provide what we now call object-oriented databases. (Chaterjee, 2005)


Figure 1. How object-oriented programming and database concepts come together in object-oriented databases (Chaterjee, 2005)
Inheritance: Inheritance is the most important feature of an object database: it establishes hierarchical relationships between objects at different levels and enables code reuse. It helps in factoring out shared implementations and specifications in the system. There are different types of inheritance, such as substitution inheritance, constraint inheritance, inclusion inheritance and specialization inheritance.
Encapsulation: Encapsulation has two views: the programming-language view and the database adaptation of that view. Encapsulation is the representation of a data object by its attributes and the methods specified to manipulate it. The operations performed on the data objects are visible, but the data and the implementation are hidden inside those objects.
Object Identity: Each object must be uniquely identifiable within the whole database or among similar kinds of objects. Every object has a unique identity, through which it can be accessed and edited. The identity can come from a variable name or from a physical address in memory. (Object database, 2012)
Polymorphism: Polymorphism allows one to define operations for one object and then share the specification of those operations with other objects. These objects can further extend the operations to provide behaviors that are unique to them. (Chaterjee, 2005)
Integrity: able to keep information consistent and safe from unauthorized changes.
Security: able to grant or deny access to the database.
Versioning: able to maintain successive versions of the database.
Transactions: able to group operations into atomic units of work.
Persistence: the ability of data to survive the execution of the process that created it, so that it can later be used by another process. Persistence provides reusability.
Concurrency: A good system must have concurrency-control techniques. When a number of users interact with the database at the same time, it must provide the same level of service to all of them and must avoid system failures and incomplete transactions. (Object database, 2012)
Query: able to provide ways of querying the data.
Recovery: the system should provide the same level of service and recover itself to a consistent state if it suffers hardware or software failures. (Object database, 2012)
Archive: has a secondary storage management system that allows storing and managing very large amounts of data.
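Several of these characteristics (object identity, inheritance, encapsulation and polymorphism) can be illustrated with a minimal sketch. This is not a real OODBMS: the class names and the counter-based identity scheme below are illustrative assumptions only.

```python
# A minimal sketch (not a real OODBMS) showing object identity,
# inheritance, encapsulation and polymorphism for video objects.
import itertools

_next_oid = itertools.count(1)

class MediaObject:
    """Base class: every object gets a system-assigned, immutable identity."""
    def __init__(self, title):
        self._oid = next(_next_oid)   # object identity, never reused
        self._title = title           # encapsulated state

    @property
    def oid(self):                    # identity is readable but not writable
        return self._oid

    def describe(self):               # operation shared via inheritance
        return f"{type(self).__name__} #{self._oid}: {self._title}"

class Video(MediaObject):
    """Subclass inherits identity handling and describe(), adds attributes."""
    def __init__(self, title, duration_s):
        super().__init__(title)
        self.duration_s = duration_s

    def describe(self):               # polymorphism: extend the operation
        return super().describe() + f" ({self.duration_s}s)"

clip = Video("News opening", 12)
print(clip.describe())   # e.g. "Video #1: News opening (12s)"
```

Here `describe()` is inherited and then extended by the subclass, while the object identifier is assigned by the system and never exposed for modification.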

II-4- Advantages and Disadvantages of OODBMS:
II-4-A- Advantages and benefits of OODBMS:
Object-oriented is a more natural way of thinking.
The defined operations of these types of systems are not dependent on the particular database application running at a given moment.
The data types of object-oriented databases can be extended to support complex data such as images and digital audio/video, along with other multimedia operations.
Different benefits of OODBMS are its reusability, stability, and reliability.
Relationships are represented explicitly, often supporting both navigational and associative access to information. This translates to improvement in data access performance versus the relational model.
Users are allowed to define their own methods of access to data and how it will be represented or manipulated.
The most significant benefit of the OODBMS is that these databases have extended into areas not known by the RDBMS. Medicine, multimedia, and high-energy physics are just a few of the new industries relying on object-oriented databases.
II-4-B- Disadvantages or limitations of OODBMS:
It lacks a common data model.
There is also no universally adopted standard, since the technology is still considered to be in the development stages. (Chaterjee, 2005)
III. Video Data
Traditional database systems are not suitable for managing videos, since video data has its own characteristics that differentiate it from simple textual or numerical data.
III-1- Video data characteristics:
A video consists of a sequence of frames, which are essentially images. Therefore, video data possesses all the attributes of image data, such as:
color,
texture,
object layout,
shape of objects, etc.
Moreover, there are attributes that differentiate video data from image data, such as:
huge volumes of data,
audio content
temporal structure
III-2- Video frames: a video is composed of the following structural units:
Shot: a set of frames recorded in a single camera action.
Keyframe: a single frame of a shot that can be used to identify the shot.
Scene: a sequential collection of shots unified by a common event or locale.
Sequence: a collection of semantically related scenes that need not be consecutive in time; sequences build up the video.
So video has a hierarchical structure, in which sequences, scenes, shots and frames constitute the levels of the hierarchy.
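The frame/shot/scene/sequence hierarchy described above can be sketched as nested objects; all class and attribute names here are illustrative, not taken from any standard.

```python
# A hedged sketch of the frame/shot/scene/sequence hierarchy.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Shot:
    frames: List[int]                 # frame numbers from one camera action
    keyframe: Optional[int] = None    # one frame chosen to represent the shot

    def __post_init__(self):
        if self.keyframe is None:     # default: middle frame (illustrative)
            self.keyframe = self.frames[len(self.frames) // 2]

@dataclass
class Scene:
    shots: List[Shot]                 # consecutive shots, common event/locale
    locale: str = ""

@dataclass
class Sequence:
    scenes: List[Scene]               # semantically related, not necessarily
                                      # consecutive in time

@dataclass
class Video:
    sequences: List[Sequence]

    def frame_count(self):
        return sum(len(sh.frames)
                   for sq in self.sequences
                   for sc in sq.scenes
                   for sh in sc.shots)

v = Video([Sequence([Scene([Shot([0, 1, 2]), Shot([3, 4])], locale="studio")])])
print(v.frame_count())   # 5
```

The middle frame is used as a default keyframe purely for illustration; real systems select keyframes by content analysis.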
III- 3- Video data can be divided into two groups:
Metadata about video
Data that is not extracted from the video content, such as the video name, production year, etc., can be modeled as metadata about the video.
Data extracted from the video content, which is itself categorized into two groups:
physical data
Low-level features such as color, texture, shape, and the spatial relationships of objects can be specified as physical data.
semantic data
Covers events, actions, attributes, and relations of objects. (Arslan, 2002)

IV. Object-Oriented Video Database Management System
IV. 1- Definition
An Object-oriented video database Management System can be defined as an object-oriented software system that manages a collection of video data and provides content-based access to users. (Elmagarmid, Jiang, Helal, Joshi, & Ahmed, 1997)
IV. 2- Video database applications
Video database applications call for flexible and powerful modeling and querying facilities, which require an integration of, or interaction between, database and knowledge-based technologies. It is also necessary for many real-life video database applications to incorporate uncertainty, which naturally arises from the complex and subjective semantic content of video data. (Nezihe Burcu Ozgura, 2009)
Advances in multimedia computing technologies have offered new opportunities to store, organize, manage and present video data in databases.
Object-oriented technology, on the other hand, provides novel ways to organize video items by supporting both their temporal and spatial properties.
Many researchers have come up with data models that implement object-oriented concepts in multimedia technology.
The aim of this work is to build a video data model (a video database management system) by incorporating and integrating the above-mentioned technologies. (A mini-thesis submitted for transfer of registration from M.Phil. to Ph.D., University of Southampton, U.K., 1999)
The major features provided by OODB systems are:
representation and management of complex objects;
handling object identities;
encapsulation of data and associated procedures into objects; and
inheritance of attribute structures and methods based on a class hierarchy.
These features are considered to be very suitable for data modeling and data management in database systems, office information systems, and so on. Our basic question is as follows:
Is the power offered by conventional OODB features enough for the multimedia data, especially for the video data?
It is believed to be necessary to provide comprehensive, selective constructs for video database management for the following reasons:
1) Video data itself is raw data, and it is created independently of how its contents and its database structure are described later.
2) Meaningful scenes in video data are identified and associated with their descriptional data incrementally and dynamically after the video data is stored in a database. Therefore, it is not easy to identify meaningful scene objects and define the attributes needed to describe them at the time the video database schema is defined. It is desirable for each scene object to have an arbitrary attribute structure suitable for describing its contents when it is identified as a meaningful scene. Most current OODBs, however, require predefinition of those attribute structures and do not support sufficient schema evolution facilities, such as adding and dropping attributes.
3) Meaningful scenes sometimes overlap or are included in other meaningful scenes. It therefore becomes important to provide a mechanism for sharing descriptional data among meaningful scenes linked by inclusion relationships. To realize such an inheritance mechanism, the inclusion relationships among instance objects should be considered, yet many OODBMSs support only class-based inheritance. (Oomoto & Tanaka, 1993)
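Points 2) and 3) can be made concrete with a small sketch: scene objects carry schema-less attributes that are added after creation, and attribute lookup follows interval-inclusion relationships between instances rather than a class hierarchy. The scene data and attribute names below are invented for illustration.

```python
# Illustrative sketch: scenes with attributes added incrementally, and
# attribute sharing via interval inclusion (instance-based inheritance).
class SceneObject:
    def __init__(self, start, end, **attrs):
        self.start, self.end = start, end
        self.attrs = dict(attrs)      # arbitrary, schema-less attributes
        self.parents = []             # scenes whose interval includes this one

    def add_attr(self, name, value):  # incremental, dynamic description
        self.attrs[name] = value

    def get(self, name):
        # instance-based inheritance: search the inclusion hierarchy upward
        if name in self.attrs:
            return self.attrs[name]
        for p in self.parents:
            v = p.get(name)
            if v is not None:
                return v
        return None

match = SceneObject(0, 5400, sport="soccer", venue="stadium A")
goal = SceneObject(3010, 3025, player="No. 10")
goal.parents.append(match)            # [3010, 3025] lies inside [0, 5400]

goal.add_attr("replay", True)         # attribute added after creation
print(goal.get("player"), goal.get("sport"))   # "No. 10" plus inherited "soccer"
```

The included scene answers queries about attributes it never defined by inheriting them from the enclosing scene, which is exactly the instance-level sharing that class-based inheritance alone cannot express.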
IV- 3- An object oriented video database management system (OOVDBMS) needs to address the following important issues:
Video data modeling
Video data insertion
Video data indexing
Video data query and retrieval
IV- 3-1- Video data modeling
Object-Oriented Database Management Systems are considered good for defining video models, as they can accommodate both the spatial and temporal features of any medium. Another property of a media object is that it can define its own data rate, abstraction and other attributes. Moreover, an AV (audio/video) object can be accessed concurrently.
A lot of work is going on in the field of video data modeling and its applications, owing to its high commercial value, since video itself is a very complex object with both spatial and temporal properties. The functionality of a database is measured by its data manipulation and query processing. A query for video data is very complex in nature. A database user can make a content-based query (about objects inside the video) as well as a feature-based one (about technical specifications of the video, e.g. frame rate, segment changes, etc.). (A mini-thesis submitted for transfer of registration from M.Phil. to Ph.D., University of Southampton, U.K., 1999)
Video data modeling deals with the issue of representing the video data, that is, designing the high-level abstraction of the raw video to facilitate various operations. These operations include video data insertion, editing, indexing, browsing, and querying. Thus, modeling of the video data is usually the first thing done in the design process of a video database management system (VDBMS). It has great impact on other components of the VDBMS. The video data model is, to a certain extent, user and application dependent. (Elmagarmid, Jiang, Helal, Joshi, & Ahmed, 1997)
Object-Oriented Conceptual Modeling of Video Data
Understanding the activities of objects moving in a scene through video is both a challenging scientific problem and a very fertile domain with many promising applications.
The key characteristic of video data is the spatial/temporal semantics associated with it, making video data quite different from other types of data such as text, voice and images. A user of a video database can generate queries containing both temporal and spatial concepts. However, considerable semantic heterogeneity may exist among users of such data due to differences in their preconceived interpretations or intended uses of the information given in a video clip. Semantic heterogeneity has been a difficult problem for conventional databases, and even today the problem is not clearly understood. Consequently, providing a comprehensive interpretation of video data is a much more complex problem.
An automated video database system requires effective and robust recognition of the objects present in the videos. Due to the diverse nature of video data, various currently available techniques can be used according to the requirements of the different situations that may occur in the input. (Serhan Dagtas, 1995)

IV- 3-2- Video data insertion deals with the issue of introducing new video data into a video database. This usually includes the following steps:
Key information (or features) extraction from video data for instantiating a data model. The automatic feature extraction can usually be done by using image processing and computer vision techniques for video analysis.
Break the given video stream into a set of basic units. This process is often called video scene analysis and segmentation.
Manually or semi-automatically annotate the video units. What needs to be annotated usually depends on the application domain.
Index and store video data into the video database based on the extracted information and the annotated information about video data. (Elmagarmid, Jiang, Helal, Joshi, & Ahmed, 1997)
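The insertion steps above can be sketched end to end in a toy pipeline. The histogram-difference cut detector and its threshold are simplifying assumptions for illustration, not a prescribed segmentation algorithm; frames are reduced to flat lists of gray levels.

```python
# A schematic sketch of the insertion steps: feature extraction,
# segmentation into shots, annotation, and index/store.
def histogram(frame, bins=8):
    """Step 1 helper: gray-level histogram of one frame (values 0..255)."""
    h = [0] * bins
    for px in frame:
        h[px * bins // 256] += 1
    return h

def segment_into_shots(frames, threshold=0.5):
    """Step 2: cut where adjacent frame histograms differ strongly."""
    shots, current = [], [0]
    for i in range(1, len(frames)):
        a, b = histogram(frames[i - 1]), histogram(frames[i])
        diff = sum(abs(x - y) for x, y in zip(a, b)) / max(1, len(frames[i]) * 2)
        if diff > threshold:
            shots.append(current)
            current = []
        current.append(i)
    shots.append(current)
    return shots

def insert_clip(db, frames, annotate):
    shots = segment_into_shots(frames)                 # step 2
    for shot in shots:
        features = histogram(frames[shot[0]])          # step 1 (keyframe feature)
        record = {"frames": shot,
                  "features": features,
                  "annotation": annotate(shot)}        # step 3 (manual/semi-auto)
        db.append(record)                              # step 4: index and store

db = []
dark = [10] * 100
bright = [240] * 100
insert_clip(db, [dark, dark, bright, bright], lambda s: f"shot of {len(s)} frames")
print(len(db))   # 2 shots detected
```

Real systems replace each step with far more sophisticated analysis, but the control flow (segment, extract, annotate, store) follows the listed order.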
For each input video clip, using a database of known objects, first identify the corresponding objects, their sizes and locations, their relative positions and movements, then encode this information in the proposed graphical model.
The encoded video data may be semantically very rich. Therefore, a unified framework is needed for users to express, and for the system to process, semantically heterogeneous queries over the unbiased encoded data. A hierarchical scheme provides the necessary framework for a user to compose views of the data with maximum flexibility and, at the same time, allows heterogeneous queries to be processed by evaluating the proposed graphical abstraction. For this purpose, we also define an interface between these modeling paradigms. (Serhan Dagtas, 1995)
IV- 3-3- Video data indexing is the most important step in the video data insertion process. It deals with the organization of the video data in the video database to make user access such as querying or browsing more efficient. This process involves the identification of the important features and computing the search keys based on them for ordering the video data. (Elmagarmid, Jiang, Helal, Joshi, & Ahmed, 1997)

Due to the huge data volume of a video database, accessing and retrieving a video data item becomes time-consuming. Indexing of the video data is needed to facilitate the process.

Compared to the traditional text-based database systems, video indexing is far more difficult and complex.
First, in a traditional DBMS, data are usually selected on one or more key fields (or attributes) that uniquely identify the data. In a VDBMS, however, what to index on is neither clear nor easy to determine: it can be audio-visual features, annotations, or other information contained in the video.
Second, unlike with textual data, content-based video data indexes are difficult to generate automatically. Video data indexing is closely related to how the video data is represented (video data modeling) and to the possible queries the user can ask (video data query and retrieval).

Existing work on video indexing can be classified based on how the indexes are derived into three categories:
annotation-based indexing,
feature-based indexing, and
domain-specific indexing
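A minimal annotation-based index is essentially an inverted index from keywords to video units; the shot identifiers and keywords below are invented for illustration. Even this tiny sketch exhibits the keyword limitations criticized later in this section: querying for "dog" will never retrieve a shot annotated only as "German shepherd".

```python
# A minimal inverted-index sketch for annotation-based access.
from collections import defaultdict

index = defaultdict(set)             # keyword -> set of shot ids

def annotate(shot_id, keywords):
    for kw in keywords:
        index[kw.lower()].add(shot_id)

def query(*keywords):
    """Return shots annotated with all the given keywords."""
    sets = [index.get(kw.lower(), set()) for kw in keywords]
    return set.intersection(*sets) if sets else set()

annotate("shot-1", ["dog", "park"])
annotate("shot-2", ["dog", "bite", "man"])
print(query("dog"))                  # {'shot-1', 'shot-2'}
print(query("dog", "man"))           # {'shot-2'}
```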

IV- 3-3-1- Annotation-Based Indexing
The video annotation is very important for a number of reasons:
First, it fully explores the richness of the information contained in the video data.
Second, it provides access to video data based on its semantic content rather than just by its visual content like color distribution.
Unfortunately, due to the limitations of current machine vision and image-processing techniques, full automation of the video annotation process will remain impossible for a long time. Thus, video annotation is usually a manual process requiring human intervention. The annotation is usually done by an experienced user (a film producer or librarian, for example), either as part of the production process or as a post-production process.

Manual annotation has several drawbacks:

The cost is high, as annotation is time-consuming. It may be suitable only for inserting small quantities of video data into the database, not for large collections of video data.
Annotation is usually application dependent. Thus the annotation of the video data of a certain domain may not be applicable to other applications.
Annotation is usually biased and limited by the user doing the work.
For these reasons, the design of existing annotation-based indexing techniques concentrates primarily on the selection of indexing terms or keywords, data structures, and user interfaces that ease the users' effort.
One of the earliest ideas for recording descriptive information of the film or video is the stratification model proposed by Davenport and Smith. The stratification model is a layered information model for annotating video shots. It approximates the way in which the editor builds an understanding of what happens in individual shots. To overcome the high cost of human annotation of video shots, they suggested that a data camera can be used during the video production process to record descriptive data of the video including time code, camera position and voice annotation of who-what-why information. This kind of annotation is also called source annotation by Hampapur. However, they didn't address the problem of converting this annotation information into textual description to create indexes of the video data.
Very often, a set of keywords is selected to annotate the video data. However, this may not be a good approach, as strongly criticized by Davis and shown in the following:
It is not possible to use only keywords to describe the spatial and temporal relationships, as well as other information contained in the video data.
Keywords cannot fully represent semantic information in the video data and do not support inheritance, similarity, or inference between descriptors (looking for shots of dogs will not retrieve shots indexed as German shepherds, and vice versa).
Keywords do not describe relations between descriptions (a search using the keywords man, dog, and bite may retrieve "dog bites man" videos as well as "man bites dog" videos; the relations between the descriptions determine salience and are not represented by keyword descriptions alone).
Keywords do not scale (the more keywords used to describe the video data, the smaller the chance that the video data will match the query condition).
IV- 3-3-2- Feature-based Indexing
Unlike the annotation indexing approach, feature-based indexing techniques are targeted at fully automating the indexing process. These techniques mainly depend on image-processing algorithms to segment video, to identify representing frames, and to extract key features from the video data. Indexes then can be built based on these key features. Key features can be color, texture, object motion, and so on. The advantage is that indexing processing can be done completely automatically and can be applied to various applications. Their primary limitation is the lack of semantics attached to the features, which causes problems and inconveniences to users who are attempting to specify video database queries. Hence, they are usually combined with a query graphical user interface for the user to define the query easily and domain-specific semantic annotations that enable the user to perform content-based queries.
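As a concrete (and heavily simplified) example of such a key feature, the sketch below computes a quantized RGB color histogram per frame and compares frames by histogram intersection; the bin count and the similarity measure are assumptions for illustration.

```python
# An illustrative feature-based index entry: quantized RGB histograms
# compared by histogram intersection.
def rgb_histogram(pixels, bins=4):
    """pixels: list of (r, g, b) tuples with channels in 0..255."""
    h = [0] * (bins ** 3)
    for r, g, b in pixels:
        idx = ((r * bins // 256) * bins * bins
               + (g * bins // 256) * bins
               + (b * bins // 256))
        h[idx] += 1
    total = len(pixels)
    return [c / total for c in h]     # normalize so any frame sizes compare

def similarity(h1, h2):
    """Histogram intersection: 1.0 means identical distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

red = [(250, 10, 10)] * 50
blue = [(10, 10, 250)] * 50
print(similarity(rgb_histogram(red), rgb_histogram(red)))    # 1.0
print(similarity(rgb_histogram(red), rgb_histogram(blue)))   # 0.0
```

Note that the feature carries no semantics by itself, which is exactly the limitation described above: the index can say two frames are "mostly red" without knowing what the red object is.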
IV- 3-3-3- Domain-Specific Indexing
Domain-specific indexing approaches use logical (high-level) video structure models, say, the anchorperson shot model or the CNN "Headline News" unit model, to further process the low-level video feature extraction and analysis results. After logical video data units have been identified, certain semantic information can be attached to each of them, and domain-specific indexes can be built. These techniques are effective in their intended domain of application. Their primary limitation is their narrow range of applicability and the limited semantic information obtainable by parsing video data. Most current research uses collections of well-structured logical video units, such as news broadcast videos, as input. (Elmagarmid, Jiang, Helal, Joshi, & Ahmed, 1997)

IV- 4- Video data query and retrieval
The efficiency of a database is evaluated by the nature and complexity of the queries that will be made about the data. For video, the query and retrieval process is further complicated by the numerous demands placed on the system. (A mini-thesis submitted for transfer of registration from M.Phil. to Ph.D., University of Southampton, U.K., 1999)
Video data query and retrieval deals with the extraction of video data from the database that satisfies certain user-specified query conditions. Due to the nature of video data, those query conditions are usually ambiguous in that the video data satisfying the query condition are not unique. This difficulty can be partially overcome by providing a graphic user interface (GUI) and video database browsing capability to the users. The GUI of a video database can help the user to improve query formulation, result viewing and manipulation, and navigation of the video database. (Elmagarmid, Jiang, Helal, Joshi, & Ahmed, 1997)
IV- 4- 1- Query Processing
While a browser enables a user to access all the information related to a specific piece of video, querying makes it possible to formulate conditions and then retrieve only the video material that has the desired properties. In a VDBMS, a query can produce output defined as an exact match, an inexact match or a similar match, where the major data indexing is based on annotations, technical video data (such as frame rate and colour histograms) and audio data.
This process usually involves query parsing, query evaluation, database index search and the returning of results. In the query-parsing phase, the query condition or assertion is usually decomposed into basic units and then evaluated. Along with text-based search over annotations, feature-based search is applied to match spatial contents such as the colour, motion and texture of the video data; thematic indexing also aids retrieval. The index structure of the database is then searched and checked, video data are retrieved if the assertion is satisfied or the measured similarity is maximal, and the resulting video data are finally displayed to the user through a GUI.
Elmagarmid describes the video data retrieval process in the following simple steps.
The user specifies a query using a facility provided by the user interface.
The query is then processed and evaluated.
The value or feature obtained is used to match and retrieve the video data stored in the database.
The resulting video data is displayed on the user interface in a suitable form. (A mini-thesis submitted for transfer of registration from M.Phil. to Ph.D., University of Southampton, U.K., 1999)
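The retrieval steps above can be sketched over a toy annotation index; the query syntax (terms joined by AND) and the field names are invented for illustration.

```python
# The four retrieval steps sketched over a toy shot table.
shots = [
    {"id": 1, "annotation": {"goal", "soccer"}},
    {"id": 2, "annotation": {"interview", "soccer"}},
]

def parse(query):                     # decompose the query into basic units
    return [t.strip().lower() for t in query.split("AND")]

def evaluate(terms):                  # match terms against the stored data
    return [s["id"] for s in shots if all(t in s["annotation"] for t in terms)]

def retrieve(query):                  # steps 1-4 end to end
    results = evaluate(parse(query))
    return results or "no match"      # hand results to the user interface

print(retrieve("soccer AND goal"))    # [1]
print(retrieve("tennis"))             # no match
```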
IV- 4- 2- Query types
Since video data is spatial and temporal, queries depend heavily on the data content. Besides the architecture of the video data model and the intended applications, many other factors shape a query. A query can be divided into:
Query by content: these queries are further categorized as semantic information queries (content information in the scene), meta-information queries (scene description information) and audio-visual queries (audio and visual features of a scene).
Query by nature: these queries depend on the nature of the video content and can be further categorized along the spatial or temporal aspects of the video.



IV- 4- 3- Query Certainty
The certainty of a query can be specified in terms of the type of matching operator used to satisfy it. A query can be an exact-match, inexact-match or similarity-match query.
Hjelsvold, in his VideoSTAR data model, defined a video query algebra that allows the user to define complex queries based on temporal relationships between video stream intervals. These operations include normal Boolean set operations (AND, OR); temporal set operations (i.e. stream A equals stream B, A is before B, A meets B, A overlaps B, A contains B, A starts B, and A finishes B); annotation operations, which are used to retrieve all annotations of a given type that have non-empty intersections with a given input set; and mapping operations, which map the elements of a given set onto different contexts that can be basic, primary or video streams. (A mini-thesis submitted for transfer of registration from M.Phil. to Ph.D., University of Southampton, U.K., 1999)
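The temporal set operations in Hjelsvold's algebra correspond to interval relations that can be stated directly over stream intervals. The sketch below assumes half-open [start, end) intervals on a common timeline; this representation is an assumption, not part of the VideoSTAR model.

```python
# Temporal relations between stream intervals, as in the algebra above,
# over half-open [start, end) intervals represented as (start, end) tuples.
def equals(a, b):    return a == b
def before(a, b):    return a[1] < b[0]
def meets(a, b):     return a[1] == b[0]
def overlaps(a, b):  return a[0] < b[0] < a[1] < b[1]
def contains(a, b):  return a[0] < b[0] and b[1] < a[1]
def starts(a, b):    return a[0] == b[0] and a[1] < b[1]
def finishes(a, b):  return a[1] == b[1] and a[0] > b[0]

A, B = (0, 10), (10, 20)
assert meets(A, B) and before((0, 5), B)
assert contains((0, 30), B) and overlaps((5, 15), B)
assert starts((10, 15), B) and finishes((15, 20), B)
print("all interval relations hold")
```

Combined with ordinary Boolean set operations over the matching intervals, these predicates are enough to express queries such as "scenes that overlap the anthem segment and finish before the first goal".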
V. The MPEG-7 Standard, the "Multimedia Content Description Interface" / Video Metadata Standard
To enable the resource discovery of audiovisual documents over the World Wide Web, it will be necessary to define content description standards, or metadata standards, for complex, multi-layered, time-dependent, information-rich audiovisual data streams (Hunter & Armstrong). In the past, much effort has gone into generating descriptors and description schemes for video indexing, but comparatively little research has been done on schemas capable of defining the structure, content and semantics of video documents and enabling validation and higher levels of automated content checking. This is precisely the primary goal of the MPEG-7 standard, the "Multimedia Content Description Interface" developed by the MPEG group. (Hunter & Armstrong)
MPEG-7 is an ISO/IEC standard developed by MPEG (Moving Picture Experts Group). Formally named "Multimedia Content Description Interface", it is a standard for describing multimedia content data that supports some degree of interpretation of the information's meaning, which can be passed onto, or accessed by, a device or computer code. MPEG-7 is not aimed at any one application in particular; rather, the elements it standardizes support as broad a range of applications as possible. (MPEG-7 Overview, 2004)
V. 1- Video metadata requirements
MPEG-7 is compatible with the following video metadata requirements:
Hierarchical structure definitions. The schema must be able to constrain the structure to a precise hierarchy in which complete video documents sit at the top level. These in turn contain sequences, which contain scenes, which contain shots, which contain frames, which contain objects or actors.
Each level (or class) within the hierarchy must be constrained to possess only specific attributes.
Element and attribute inheritance. It should be possible to specify sub-classing with inheritance of attributes and elements from the upper to lower classes. In addition, sub-classes should be able to have their own additional attributes and elements. This allows efficient reuse and customization of document schemas.
Data typing. It must be possible to constrain the values of attributes to certain data types. Data types supported should include primitive data types, enumerated data types, controlled vocabularies, file types (images), URIs and complex data types (e.g. colour histograms, 3D vectors, graphs, RGB values, etc.). It should also be possible to specify multiple alternative schemes or data types for a particular attribute.
Cardinality within attributes should be representable. It must be possible to specify that an attribute can have zero, one or multiple values. Ideally the minimum and maximum number of attributes should also be specifiable, e.g. a scene must contain between 2 and 5 shots.
Spatio temporal specifications. The schema must be able to support the specification of temporal characteristics, e.g. begin and end time of segments and their duration. Similarly, it should be able to support spatial representation, e.g. regions within an image or motion along a line.
Spatial, temporal and conceptual relations. Spatial relations such as neighbouring objects and temporal relations such as sequential or parallel segments should be supported. Given such a relationship between two classes, it should also be possible to constrain specific attribute values of these classes. For example, the start and end times of scenes contained within a sequence, must lie within the start and end time of that sequence.
Human-readability. It is desirable rather than mandatory that both the schema and the description output from the schema should be human-readable.
Availability of supporting technologies such as parsers (capable of validating input descriptions), databases and query languages.
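Two of these requirements, the spatio-temporal containment constraint and cardinality ("a scene must contain between 2 and 5 shots"), can be checked mechanically. The cardinality limits come from the example in the text; the data layout below is invented for illustration and is not MPEG-7 syntax.

```python
# A sketch of two schema constraints from the requirements above:
# temporal containment of scenes in their sequence, and shot cardinality.
def validate(sequence):
    errors = []
    for scene in sequence["scenes"]:
        # spatio-temporal relation: scene interval inside sequence interval
        if not (sequence["start"] <= scene["start"]
                and scene["end"] <= sequence["end"]):
            errors.append(f"scene {scene['id']} outside sequence interval")
        # cardinality: between 2 and 5 shots per scene
        if not 2 <= len(scene["shots"]) <= 5:
            errors.append(f"scene {scene['id']} has {len(scene['shots'])} shots")
    return errors

seq = {"start": 0, "end": 100,
       "scenes": [{"id": "s1", "start": 0, "end": 40, "shots": ["a", "b"]},
                  {"id": "s2", "start": 90, "end": 120, "shots": ["c"]}]}
for e in validate(seq):
    print(e)
# scene s2 outside sequence interval
# scene s2 has 1 shots
```

In MPEG-7 itself, such constraints are expressed declaratively in the DDL (an XML Schema-based language) and enforced by a validating parser rather than by hand-written checks.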
V. 2- It was designed to standardize:
a set of Description Schemes ("DS") and Descriptors ("D")
a language to specify these schemes, called the Description Definition Language ("DDL")
a scheme for coding the description
Its functionality is the standardization of multimedia content descriptions. MPEG-7 can be used independently of the other MPEG standards; the description might even be attached to an analog movie.
V. 3- MPEG-7 architecture requirement
Description must be separate from the audiovisual content.
There must be a relation between the content and its description, so that the description can be located with, or even multiplexed into, the content itself.
V. 4- MPEG-7 tools
MPEG-7 uses the following tools:
Descriptor (D): a representation of a feature, defined syntactically and semantically. A single object may be described by several descriptors.
Description Schemes (DS): specify the structure and semantics of the relations between their components; these components can be descriptors (D) or other description schemes (DS).
Description Definition Language (DDL): an XML-based language used to define the structural relations between descriptors. It allows the creation and modification of description schemes, as well as the creation of new descriptors (D).
System tools: these tools deal with the binarization, synchronization, transport and storage of descriptors. They also deal with Intellectual Property protection. Figure 2 shows the relation between the MPEG-7 tools.

Figure 2. Relation between different tools and elaboration process of MPEG-7
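The nesting relationship between the tools, Descriptors carried inside Description Schemes, which may in turn contain other Description Schemes, can be sketched in a few lines of Python. The class and feature names below are illustrative assumptions, not taken from the standard:

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Descriptor:          # "D": represents a single feature
    name: str
    value: object

@dataclass
class DescriptionScheme:   # "DS": structures Ds and other DSs
    name: str
    components: List[Union["DescriptionScheme", Descriptor]] = field(default_factory=list)

# A shot described by two descriptors, nested inside a video-level scheme.
shot_ds = DescriptionScheme("Shot", [
    Descriptor("MediaTime", ("00:01:05", "00:01:12")),
    Descriptor("DominantColor", (120, 80, 40)),
])
video_ds = DescriptionScheme("Video", [shot_ds])
print(video_ds.components[0].components[0].name)  # MediaTime
```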
V. 5- MPEG-7 applications
There are many applications and application domains which will benefit from the MPEG-7 standard. A few application examples are:
Digital library: Image/video catalogue, musical dictionary.
Multimedia directory services: e.g. yellow pages.
Broadcast media selection: Radio channel, TV channel.
Multimedia editing: Personalized electronic news service, media authoring. (MPEG-7, 2012 )

V. 6- MPEG-7 Multimedia Description Schemes
MPEG-7 Multimedia Description Schemes (also called MDS) comprises the set of Description Tools (Descriptors and Description Schemes) dealing with generic as well as multimedia entities.
Generic entities are features used in both audio and visual descriptions, and therefore "generic" to all media. These are, for instance, "vector", "time", textual description tools, controlled vocabularies, etc.
Apart from this set of generic Description Tools, more complex Description Tools are standardized. They are used whenever more than one medium needs to be described (e.g. audio and video). These Description Tools can be grouped into five different classes according to their functionality:
Content description: representation of perceivable information
Content management: information about the media features, the creation and the usage of the AV content;
Content organization: representation of the analysis and classification of several AV content items;
Navigation and access: specification of summaries and variations of the AV content;
User interaction: description of user preferences and usage history pertaining to the consumption of the multimedia material. (MPEG-7 Overview , 2004 )
V. 7- MPEG-7 Description Definition Language
According to the definition in the MPEG-7 Requirements Document the Description Definition Language (DDL) is:
"... a language that allows the creation of new Description Schemes and, possibly, Descriptors. It also allows the extension and modification of existing Description Schemes."
The DDL is based on the XML Schema Language. However, because the XML Schema Language was not designed specifically for audiovisual content description, certain MPEG-7 extensions have been added. As a consequence, the DDL can be broken down into the following logical normative components:
The XML Schema structural language components;
The XML Schema datatype language components;
The MPEG-7 specific extensions.
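As a rough illustration of what a DDL-defined description looks like in practice, the fragment below builds and parses a much-simplified MPEG-7-style instance. The element names follow the flavour of the standard's description schemes, but this is a toy fragment and is not validated against the real MPEG-7 schemas:

```python
import xml.etree.ElementTree as ET

# Simplified, unvalidated MPEG-7-style description of one video shot
# carrying a free-text annotation.
desc = """
<Mpeg7>
  <Description>
    <MultimediaContent>
      <Video>
        <TemporalDecomposition>
          <VideoSegment id="shot1">
            <TextAnnotation>
              <FreeTextAnnotation>Goal scored</FreeTextAnnotation>
            </TextAnnotation>
          </VideoSegment>
        </TemporalDecomposition>
      </Video>
    </MultimediaContent>
  </Description>
</Mpeg7>
"""

root = ET.fromstring(desc)
print(root.find(".//FreeTextAnnotation").text)  # Goal scored
```

A retrieval system would query such trees (here via simple path expressions) to answer annotation-based queries.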

VI. MPEG-7 Compatible Representation of Video
The representation of video is crucial since it directly affects the system's performance. There is a trade-off between the accuracy of representation and the speed of access: a more detailed representation will enable more detailed queries but will also result in longer response times during retrieval. Keeping these factors in mind, the MPEG-7 profile adopted here is shown in Figure 3. (MPEG-7 Overview, 2004)


Figure 3: MPEG-7 Profile (Muhammet Baştan, 2009)

This profile enables the system to support the wide range of queries it is designed for:
First, audio and visual data are separated using Media Source Decomposition.
Then, visual content is hierarchically decomposed into smaller structural and semantic units.
An example of video decomposition according to this profile is shown in Figure 4. (Muhammet Baştan, 2009)

VI. 1- video decomposition


Figure 4: MPEG-7 decomposition of a video according to the MPEG-7 profile in Figure 3. Low-level color, texture and shape descriptors of the Still and Moving Regions are extracted from the selected arbitrarily shaped regions, but the locations of the regions are represented by their MBRs.
VI-1-1- Temporal Decomposition of video into shots.
Video is partitioned into non-overlapping video segments called shots, each having:
a temporal location (start time and duration),
a semantic annotation describing the objects and/or events in the shot with free text, keywords and structured annotation,
a visual descriptor (e.g., motion, GoF/GoP descriptors).
A shot is a sequence of frames captured by a single camera in a single continuous action.
Shot boundaries are the transitions between shots. They can be abrupt (cut) or gradual (fade, dissolve, wipe, morph).
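A minimal sketch of abrupt (cut) boundary detection, under the common assumption (not stated in the source) that a cut appears where the colour-histogram difference between consecutive frames exceeds a threshold:

```python
# Illustrative cut detector: declare a shot boundary before frame i when
# the L1 histogram distance to frame i-1 exceeds the threshold.
def cut_boundaries(histograms, threshold):
    cuts = []
    for i in range(1, len(histograms)):
        d = sum(abs(a - b) for a, b in zip(histograms[i - 1], histograms[i]))
        if d > threshold:
            cuts.append(i)   # boundary between frames i-1 and i
    return cuts

# Toy 3-bin histograms: three frames of shot 1, then an abrupt change.
frames = [[10, 0, 0], [9, 1, 0], [10, 0, 0], [0, 0, 10], [0, 1, 9]]
print(cut_boundaries(frames, 5))  # [3]
```

Gradual transitions (fade, dissolve, wipe) need more elaborate detectors, e.g. comparing cumulative differences over a window rather than adjacent frames.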
VI-1-2- Temporal Decomposition of shots.
The background content of the shots does not change much, especially if the camera is not moving. This static content can be represented with a single keyframe or a few keyframes if there is a considerable amount of change in the visual appearance (e.g., in case of camera motion).
Therefore, each shot is decomposed into smaller, more homogeneous video segments (keysegments), which are represented by keyframes.
Each keyframe is described by:
a temporal location,
a semantic annotation,
a set of visual descriptors, which are extracted from the frame as a whole.
Each keyframe is also decomposed into a set of non-overlapping Still Regions (Spatio-temporal Decomposition).
This makes it possible to keep more detailed region-based information: the spatial location of each region, given by its Minimum Bounding Rectangle (MBR), a semantic annotation, and region-based visual descriptors.
Spatio-temporal Decomposition of shots into Moving Regions.
Each shot is also decomposed into a set of Moving Regions to represent the dynamic and more important content of the shots corresponding to the salient objects. Hence, more information can be stored for Moving Regions to enable more detailed queries about salient objects.
The term "Moving Regions", as used in MPEG-7, is somewhat confusing in this context: the objects do not need to be moving to qualify as Moving Regions; they only need to be salient. Hence, a salient stationary object in a shot is represented with a Moving Region.
Faces are also represented with Moving Regions, having an additional visual descriptor: Face Recognition Descriptor.
Since the position, shape, motion and visual appearance of the salient objects may change throughout the shot, descriptors sampled at appropriate time points should be stored. The trajectory of an object is represented by the Motion Trajectory descriptor. The MBRs and visual descriptors of the region throughout the shot are stored by temporally decomposing the object into Still Regions. A new sample is taken at any time point (key time point) at which there is a certain amount of change in the descriptor values compared to the previous time point. (Muhammet Baştan, 2009)
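The key-time-point sampling just described can be sketched as follows; the descriptor distance function and the threshold are assumptions chosen for illustration:

```python
# Keep a new sample whenever the descriptor has changed by more than
# `threshold` (under distance `dist`) since the last kept sample.
def key_time_points(descriptors, dist, threshold):
    keys = [0]                       # always keep the first time point
    for t in range(1, len(descriptors)):
        if dist(descriptors[keys[-1]], descriptors[t]) > threshold:
            keys.append(t)
    return keys

# Toy 1-D "descriptor values", one per frame of a shot.
vals = [0.0, 0.1, 0.15, 0.9, 0.95, 2.0]
print(key_time_points(vals, lambda a, b: abs(a - b), 0.5))  # [0, 3, 5]
```

In a real system `descriptors` would be MBRs or colour/texture vectors and `dist` a suitable vector distance, but the sampling logic is the same.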
VI-1-3- Compression Techniques
Compression is the key to a low data rate for digital video, and a number of studies have been undertaken to devise suitable compression algorithms and techniques. These methods can be categorised into lossy and loss-less compression methods. (A Mini thesis submitted for transfer of registration from M-Phil to PhD. University of Southampton U.K., 1999)


VII. Making video as searchable as text
Video data must allow some degree of interpretation, which can be passed onto, or accessed by, a device or computer code. MPEG-7 aims to create a standard for describing these operational requirements.
MPEG-7 began as a scheme for making audiovisual material as searchable as text is today. Indeed, it is conceivable that the structure and discipline required even to minimally describe multimedia may exceed the current state of textual information retrieval. Although the proposed multimedia content descriptions now serve much more than search, retrieval applications remain the primary applications for MPEG-7. These retrieval applications involve databases, audio-visual archives, and the Web-based Internet paradigm (a client requests material from a server).
TV and film archives represent a typical application in this domain. They store vast amounts of multimedia material in several different formats (digital or analog tapes, film, CD-ROM, and so on) along with precise descriptive information (metadata) that may or may not be precisely timecoded. This metadata is stored in databases with proprietary formats. There is enormous potential interest in an international standard format for the storage and exchange of descriptions, which could ensure
interoperability between video archive operators,
perennial relevance of the metadata, and
a wider diffusion of the data to the professional and general public. (Nack & Lindsay, 1999)

Conclusion:
The researcher of this paper sees object-oriented video database management systems as a field that needs more research, especially by information specialists and librarians. Despite the existing standards and schemas for indexing, annotating, describing and coding video, many limitations remain that need enhancement. Here lies the role of library and information science: to produce metadata tools, indexing schemas, annotations, ontologies, etc. capable of controlling, describing, indexing and coding videos, so that video database systems can possess accurate retrieval functionality.

References
Arslan, U. (2002, January). A Semantic Data Model and Query Language for Video Data. Retrieved 5 14, 2012, from bilkent.edu.: http://www.cs.bilkent.edu.tr/index.php?p=msctheses&l=tr
Chaterjee, J. (2005, January 03). Introduction to RDBMS, OODBMS and ORDBMS. Retrieved 5 25, 2012, from aspfree: http://www.aspfree.com/c/a/Database/Introduction-to-RDBMS-OODBMS-and-ORDBMS/
Elmagarmid, A. K., Jiang, H., Helal, A. A., Joshi, A., & Ahmed, M. (1997). VIDEO DATABASE SYSTEMS Issues, Products and Applications.
Huang, L., Mong Lee, J. C., Li, Q., & Xiong, W. (n.d.). An Experimental Video Database Mangement System Based on Advanced Object-Oriented Techniques. Retrieved 5 15, 2012, from Hong Kong University of Science and Technology.
Hunter, J., & Armstrong, L. (n.d.). A comparison of schemas for video metadata representation. Retrieved 5 14, 2012, from ra.ethz.ch: http://www.ra.ethz.ch/CDstore/www8/data/2179/html/bindex.html
Kumar, K. (2005, September). object-oriented database management system (OODBMS or ODBMS). Retrieved 5 25, 2012, from searchoracle.techtarget.com: http://searchoracle.techtarget.com/definition/object-oriented-database-management-system
MPEG-7. (2012 , April 12 ). Retrieved 5 15, 2012, from wikipedia: http://en.wikipedia.org/wiki/MPEG-7
MPEG-7 Overview . (2004 , October ). Retrieved 5 16, 2012, from INTERNATIONAL ORGANISATION FOR STANDARDISATION: http://mpeg.chiariglione.org/standards/mpeg-7/mpeg-7.htm
Muhammet Baştan, H. C. (2009, April 28). A MPEG-7 Compatible Video Retrieval System with Integrated Support for Complex Multimodal Queries. Retrieved 5 14, 2012, from www.cs.bilkent.edu: http://www.cs.bilkent.edu.tr/tech-reports/2009/BU-CE-0905.pdf
Nack, F., & Lindsay, A. T. (1999, July–September). Everything You Wanted to Know About MPEG-7: Part 1. Retrieved 5 16, 2012, from lass.cs.umass.edu: http://lass.cs.umass.edu/~shenoy/courses/spring01/papers/ieeemm-mpeg7.pdf
Nezihe Burcu Ozgur, M. A. (2009, February 26). An intelligent fuzzy object-oriented database framework for video database applications. ScienceDirect, pp. 2253–2274.
Nicolas Moënne-Loccoz, B. J.-M. (2004, June 13 ). Managing Video Collections At Large. Retrieved 5 17, 2012, from cvdb04.irisa: http://cvdb04.irisa.fr/Talks/8_p25_Moenne.pdf
Object database. (2012, May 31 ). Retrieved 5 25, 2012, from wikipedia.org: http://en.wikipedia.org/wiki/Object_database
Oomoto, E., & Tanaka, K. (1993, August). OVID: Design and Implementation of a Video-Object Database System. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 4, pp. 629-643.
Serhan DagtaS, M. I. (1995). Object-Oriented Conceptual Modeling of Video Data. Retrieved 5 15, 2012, from ieeexplore.ieee.org: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=380357
Taschwer, M. (2004, Sep 27 ). Muvino - An MPEG-7 Video Annotation Tool. Retrieved 5 16, 2012, from vitooki: http://vitooki.sourceforge.net/components/muvino/code/
Video Database Management System. (1999, May). Retrieved 5 16, 2012, from eprints.soton.ac.uk: http://eprints.soton.ac.uk/266934/1.hasCoversheetVersion/1999_transfer_thesis.pdf
Walid G. Aref, A. C. (n.d.). A Video Database Management System for. Retrieved 5 14, 2012, from .cs.purdue.edu: http://www.cs.purdue.edu/vdbms/papers/MISvdbms.pdf

