Temporal Color Correlograms for Video Retrieval
Mika Rautiainen
David Doermann
MediaTeam Oulu Infotech Oulu, University of Oulu P.O.Box 4500, FIN-90014, Oulu, Finland
Laboratory for Language and Media Processing Institute for Advanced Computer Studies University of Maryland, College Park, MD 20742
Abstract

This paper presents a novel method to retrieve segmented video shots based on their color content. The Temporal Color Correlogram captures the spatio-temporal relationship of colors in a video shot using co-occurrence statistics. The Temporal Color Correlogram extends the HSV Color Correlogram, which has been found to be very effective in content-based image retrieval. Temporal Color Correlograms compute the autocorrelation of quantized HSV color values from a set of frame samples taken from a video shot. In this paper, the efficiency of the Temporal Color Correlogram and HSV Color Correlograms is evaluated against other retrieval systems participating in the TREC Video track evaluation and against color histograms used commonly in content-based retrieval. We used queries and relevance judgments on the hours of segmented MPEG video provided to track participants. Tests are executed using our Content-Based Multimedia Retrieval System, which was specifically developed for multimedia information retrieval applications.

Introduction

Content-based retrieval became an area of active research in the early 1990s [3], and a large number of experimental image retrieval systems have since been introduced [5,6,8,9,11]. Color-based retrieval first evolved from simple statistical measures such as average color to color histograms and spatial color descriptors [9,5,8,11,12,13]. Unfortunately, color histograms have limited discriminative power with respect to the spatial organization of colors. The color correlogram and the autocorrelogram, on the other hand, describe the probability of finding color pairs at fixed pixel distances and provide an efficient spatial retrieval feature. Huang et al. [12] have demonstrated that the autocorrelogram provides very good retrieval performance in comparison to histograms and color coherence vectors and have developed their method further in [14,15]. Ma and Zhang [16] benchmarked color correlograms against color histograms, color moments and color coherence vectors. In [1], Huang's correlogram was incorporated into the more perceptual HSV color space, which gave an improvement over the original method that used the color correlogram in conjunction with the RGB [12,14,15] and L*u*v* [16] color spaces. Typical image retrieval systems use a Munsell color system (MTM, HSV) [11,17]. Using HSV for correlograms is of interest because the HSV color space provides a better distinction between real color properties (hue, saturation and illumination). We have examined different quantizations of the HSV color space in [1], improving the correlogram's sensitivity to changes in color content and its resilience to varying illumination conditions. The efficiency of properly quantized HSV Color Correlograms was demonstrated against RGB autocorrelograms and HSV histograms [1]. The testing was done in the Content-Based Multimedia Retrieval System (CMRS), a Java-based software platform that has built-in extensibility to cope with large databases and multiple features [4].

This paper extends the HSV Color Correlogram to create a temporal video feature descriptor for retrieval. The proposed Temporal Color Correlogram (TCC) is evaluated against static keyframe representations of shots with the HSV Color Histograms and HSV Color Correlograms that were presented in [1]. CMRS is used for the evaluation process: first storing the video segments and calculating features, then retrieving the segments by giving the system example segments.

The paper is organized as follows. Section 2 describes the methodology and defines the features used and the (dis)similarity measures. Section 3 presents the data used in this research and gives experimental results. Section 4 provides a discussion of the results and summarizes the work.
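As an illustration of the autocorrelogram idea described in the introduction (the probability of finding the same color again at a fixed pixel distance), the following is a minimal Python sketch, not the implementation used in CMRS. The function names, the bin counts (8 hue, 2 saturation, 2 value bins) and the restriction to the four axis-aligned neighbors are assumptions made for brevity; they are not the quantization or the neighborhood examined in [1].

```python
import colorsys
from collections import defaultdict


def quantize_hsv(rgb, h_bins=8, s_bins=2, v_bins=2):
    """Map an (r, g, b) pixel with 0-255 channels to a single HSV bin index.

    The bin counts are illustrative assumptions, not the quantization of [1].
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # min(...) keeps a component equal to 1.0 inside the last bin.
    hi = min(int(h * h_bins), h_bins - 1)
    si = min(int(s * s_bins), s_bins - 1)
    vi = min(int(v * v_bins), v_bins - 1)
    return (hi * s_bins + si) * v_bins + vi


def autocorrelogram(image, distances=(1, 3)):
    """Estimate P(neighbor at distance d has color c | pixel has color c).

    `image` is a list of rows of quantized color indices. Only the four
    axis-aligned neighbors at each distance are checked (a simplification).
    Returns a dict keyed by (color, distance).
    """
    height, width = len(image), len(image[0])
    same = defaultdict(int)   # neighbor pairs that share the color
    total = defaultdict(int)  # all neighbor pairs that fall inside the image
    for y in range(height):
        for x in range(width):
            c = image[y][x]
            for d in distances:
                for dy, dx in ((0, d), (0, -d), (d, 0), (-d, 0)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total[(c, d)] += 1
                        if image[ny][nx] == c:
                            same[(c, d)] += 1
    return {key: same[key] / total[key] for key in total}


# Usage: a 2x3 image with a red block next to a blue column.
rgb_image = [
    [(255, 0, 0), (255, 0, 0), (0, 0, 255)],
    [(255, 0, 0), (255, 0, 0), (0, 0, 255)],
]
quantized = [[quantize_hsv(pixel) for pixel in row] for row in rgb_image]
features = autocorrelogram(quantized, distances=(1,))
```

Unlike a histogram, which would only record that four pixels are red and two are blue, the resulting feature also reflects how tightly each color is clustered: the compact red block yields a higher same-color probability at distance 1 than the narrow blue column.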
The Temporal Color Correlogram

Video structure
The primitive element in digital video is the frame (denoted here as I). In digital video analysis, the longest sequence of frames that creates continuous, consistent movement within the mechanical limitations of the recording device is called a shot (denoted here as S). Shot properties, such as consistent motion flow, contain 'mechanical' information about the shot and will be utilized by analysis algorithms for video indexing purposes. A more detailed description of video structure can be found in [18].
1051-4651/02 $17.00 (c) 2002 IEEE
Histograms, correlograms and autocorrelograms
The mathematical basis for calculating correlograms from a single frame is as follows: Let I be an X ×