Arabo-Moresque decor image retrieval system based on mosaic representations


Journal of Cultural Heritage 2 (2001) 149−154 © 2001 Éditions scientifiques et médicales Elsevier SAS. All rights reserved S1296207401011165/FLA

Arsalane Zarghili (a,*), Najib Gadi (a), Rachid Benslimane (a), Kadi Bouatouch (b)

(a) LTTI, Ecole supérieure de technologie, route d'Imouzzer BP 2427, 30000 Fès, Morocco
(b) IRISA, INRIA of Rennes, Campus Universitaire de Beaulieu, 35042 Rennes, France

Received 5 July 2000; accepted 1 March 2001

Abstract – This paper describes a new method for indexing an Arabo-Moresque decor database. The method requires that each image in the database contain the complete principal geometric structure, called the 'spine' of the decor. In practice this requirement is difficult to satisfy: when the distance between the camera and the real scene (a Zellij fresco) is large, the whole spine is captured but the image resolution is poor; when this distance is small, the resolution is high but the spine is only partially captured. This motivates the use of a 'mosaicing' technique. The contribution of this work is the combination of mosaicing and indexing techniques in the development of an Arabo-Moresque decor image retrieval system. © 2001 Éditions scientifiques et médicales Elsevier SAS

Keywords: image indexing / mosaicing / Fourier shape descriptors / spine / resolution

1. Research aims

The aim of this paper is to aid both the artist and the restorer of Arabo-Moresque decors by developing an automatic retrieval system over a large image database. The system is based on the extraction of the spine of the decor, defined as a connected set of polygonal shapes. It gives the restorer access to all images in the database that are similar to a query image; in practice, the query image is the original image of the degraded decor that is to be restored.

2. Introduction

In the database community, there has been considerable effort to develop languages that aid pictorial queries. Much of the focus has been on defining and constructing queries that include object–feature relationships, and on formulating imprecise queries. In integrating image analysis and vision tools, the emphasis has been on image attributes such as color, texture, and shape [1]. The shape of objects or of regions of interest is another important image feature for content-based image retrieval. Over the past few decades, many approaches to the characterization of shape similarity have been proposed in the literature. An important class of shape analysis algorithms is based on the representation of the outer boundaries of objects [1, 2]. Our aim is to index a database of Arabo-Moresque decor images, where each image is characterized by its principal geometric structure, called the 'spine' of the decor. Accordingly, a Fourier-based shape description is adapted and used in our image retrieval system. Moreover, each image in the database is generated from a sequence of small parts of the decor using a planar mosaicing technique: pictures taken of the same plane are joined together into a single higher-resolution picture of that plane. The simplest sets of images to mosaic are views of a planar scene, such as documents [3], whiteboards [4] or walls.

*Correspondence and reprints. E-mail addresses: [email protected] (A. Zarghili), [email protected] (N. Gadi), [email protected] (R. Benslimane), [email protected] (K. Bouatouch).


A. Zarghili et al. / J. Cult. Heritage 2 (2001) 149–154

Figure 1. Spine: alternation between the Safts and the Salomon Stars. In the right image, the spine is shown in black.

The rest of the paper is organized as follows. Section 3 presents one of the most typical Arabo-Moresque forms of art, known as Zellij. Section 4 describes the method used to stitch together a set of pictures taken of the same planar scene; the result is one large, high-resolution picture. Section 5 presents an image indexing technique based on Fourier shape descriptors and the results of the retrieval system.

3. Spine definition

Zellij is an artisanal product: a tile of fired earthenware of 10 cm², covered with enamel. These square tiles are cut manually with specific heavy hammers, whose weight contrasts with the delicacy of the pieces obtained. Always geometric, the monochromatic and polychromatic pieces number in the hundreds and come in many forms and sizes. Assembled, they form a geometric puzzle in which each piece, according to its size and color, finds the place assigned to it. Most Zellij decors are constructed from polygonal structures called the 'spine' of the decor (see figure 1), which alternates between two pieces (see figure 2):
– Salomon star: an octagon of width r, from which all pieces are derived (see figure 3);
– Saft: a hexagon of width r, which can be deduced from two Salomon stars (see figure 2).
In our context, and owing to its periodicity, the spine can be considered key information in the decor: some regions of the decor can be repeated many times in order to cover a large area.

Figure 2. (a) Salomon star. (b) Saft. (c) The Saft disposition between two Salomon stars.

Figure 3. Some Zellij pieces derived from the Salomon star.

4. Image mosaicing

Image mosaics are a collection of overlapping images together with coordinate transformations that relate the different image coordinate systems. By applying the appropriate transformations via a warping operation and merging the overlapping regions of the warped images, it is possible to construct a single image covering the entire visible area of the scene. This merged single image explains the term 'mosaic'.

4.1. The 2D projective matrix

Many problems require finding the coordinate transformation between two images of the same scene or object; image mosaicing is one of them, and it requires a precise description of the coordinate transformation between each pair of images. The basic idea is to find the best mapping from one image I′ to the other image I, and then to warp I′ into the reference frame of I using that mapping. Before proceeding, we need to consider the geometric transformations that relate the images to the mosaic. To do this, we use homogeneous coordinates to represent points: a 2D point in the image plane is denoted (x, y, w), and the corresponding Cartesian coordinates are (x/w, y/w) [5]. A coordinate transformation maps the image coordinates x = [x, y]^T to a new set of coordinates x′ = [x′, y′]^T. In general, the transformation is found by assuming one of the forms shown in table I and estimating the two to twelve parameters of the chosen form. Using homogeneous coordinates, we can describe the class of 2D planar projective transformations, which can be computed without any knowledge of the


Table I. Image coordinate transformations.

Model              Coordinate transformation from x to x′
Translation        x′ = x + b
Affine             x′ = Ax + b
Bilinear           x′ = p1xy + p2x + p3y + p4;  y′ = p5xy + p6x + p7y + p8
Projective         x′ = (Ax + b) / (Cx + 1)
Pseudoperspective  x′ = p1x + p2y + p3 + p4x² + p5xy;  y′ = p6x + p7y + p8 + p9xy + p10y²
Biquadratic        x′ = p1x + p2y + p3 + p4x² + p5y² + p6xy;  y′ = p7x + p8y + p9 + p10x² + p11y² + p12xy

internal camera calibration parameters or of the relative camera motion between frames. These transformations are written as the following matrix multiplication:

  | x′ |   | m0  m1  m2 | | x |
  | y′ | = | m3  m4  m5 | | y |    (1)
  | w  |   | m6  m7  1  | | 1 |

We know that two images of a planar scene are related by a projective transformation. A coordinate transformation maps the image coordinates x in image I to the image coordinates x′ in image I′ by the following equations:

  x′ = (m0x + m1y + m2) / (m6x + m7y + 1);  y′ = (m3x + m4y + m5) / (m6x + m7y + 1)    (2)
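As an illustrative sketch (the function name and NumPy usage are ours, not part of the paper), equation (2) amounts to multiplying homogeneous coordinates by the 3×3 matrix of equation (1) and dividing by the third component:

```python
import numpy as np

def apply_homography(M, pts):
    """Map an Nx2 array of points through a 3x3 projective matrix (eq. 2)."""
    pts = np.asarray(pts, dtype=float)
    ones = np.ones((pts.shape[0], 1))
    homo = np.hstack([pts, ones])          # homogeneous coordinates (x, y, 1)
    mapped = homo @ M.T                    # (x', y', w)
    return mapped[:, :2] / mapped[:, 2:3]  # back to Cartesian: (x'/w, y'/w)

# Identity plus a translation (m2 = 3, m5 = -1) maps (2, 5) to (5, 4)
M = np.array([[1.0, 0.0, 3.0],
              [0.0, 1.0, -1.0],
              [0.0, 0.0, 1.0]])
print(apply_homography(M, [[2.0, 5.0]]))
```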

The problem thus reduces to determining the transformation parameters mi between every two adjacent images, in order to merge the images into a single complete image. There are two basic options to achieve this goal: a manual method and a fully automatic one. In the manual method, the user is required to establish four corresponding points in the two images. These four points give eight linear equations, enabling us to solve for the coefficients of the projective matrix. This method can yield good results, but it relies on the proficiency of the user; in addition, selecting the corresponding points may become very tedious when dealing with a large number of images. The fully automatic method of finding the projective parameters consists of minimizing the sum of squared intensity errors E over all corresponding pairs of pixels i inside both images I(xi, yi) and I′(x′i, y′i):

  E = Σi | I′(x′i, y′i) − I(xi, yi) |² = Σi ei²    (3)


To perform the minimization, we use Levenberg-Marquardt iterative nonlinear minimization. This algorithm requires the partial derivatives of ei with respect to the unknown parameters {m0, m1, ..., m7}:

  ∂ei/∂mk = (∂I′/∂x′)(∂x′/∂mk) + (∂I′/∂y′)(∂y′/∂mk)    (4)

From these partial derivatives, the Levenberg-Marquardt algorithm computes an approximate Hessian matrix A and the weighted gradient vector b, with components:

  akl = Σi (∂ei/∂mk)(∂ei/∂ml),   bk = −2 Σi ei (∂ei/∂mk)    (5)
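The resulting update ∆m = (A + λI)⁻¹ b can be sketched generically. The code below is our toy illustration, not the paper's image-alignment implementation: `residual` and `jacobian` are user-supplied stand-ins, and the factor 2 in bk of equation (5) is absorbed into the step length, as is conventional:

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, m0, lam=1e-3, iters=25):
    """Generic Levenberg-Marquardt loop: dm = (A + lam*I)^-1 b,
    with A = J^T J and b = -J^T e (cf. eq. 5, factor 2 absorbed)."""
    m = np.asarray(m0, dtype=float)
    e = residual(m)
    best = e @ e                              # current sum of squared errors E
    for _ in range(iters):
        J = jacobian(m)                       # partial derivatives de_i/dm_k
        A = J.T @ J                           # approximate Hessian
        b = -J.T @ e                          # weighted gradient
        dm = np.linalg.solve(A + lam * np.eye(len(m)), b)
        e_new = residual(m + dm)
        cost = e_new @ e_new
        if cost < best:                       # E decreased: accept, relax damping
            m, e, best, lam = m + dm, e_new, cost, lam / 10
        else:                                 # E grew: reject step, raise damping
            lam *= 10
        if lam < 1e-14:
            break
    return m

# Toy one-parameter fit: y = m*x on noiseless data, so the minimizer is m = 2
x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x
m_hat = levenberg_marquardt(lambda m: m[0] * x - y,
                            lambda m: x.reshape(-1, 1),
                            [0.5])
```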

It then updates the parameter estimate m by an amount ∆m = (A + λI)⁻¹ b, where λ is a tuning parameter that is adjusted according to the change in the sum of squared differences at each stage. The complete algorithm consists of the following steps:
1. Choose a small starting value for λ (this work used λ = 0.001).
2. For each pixel (xi, yi):
   – compute the transformed coordinates (x′i, y′i) using the current estimate of the parameters mk;
   – compute the intensity error ei at that pixel location;
   – compute the intensity gradients ∂I′/∂x′ and ∂I′/∂y′ at (x′i, y′i);
   – compute the partial derivatives ∂ei/∂mk;
   – add the pixel's contribution to the Hessian matrix A and the gradient vector b.
3. Solve the system of equations (A + λI) ∆m = b, i.e. obtain the increment ∆m ← (A + λI)⁻¹ b.
4. Set m ← m + ∆m.
5. If E has decreased, set λ ← λ / 10; otherwise set λ ← λ × 10.
6. If λ < 10⁻¹⁴, stop; otherwise go to step 2.
The above algorithm is quite robust, with λ falling to the minimum in about 25 iterations. Note that such algorithms tend to find a local minimum; it is therefore necessary to provide a 'good' first guess of the matrix coefficients for the algorithm to work [6].

4.2. Stitching images together

Now that we have the projective matrix between each pair of adjacent images of a planar scene, we can



Figure 4. Some images of the sequence (images 1–4 in the first row and 14–17 in the second).

Figure 6. (a) Original image. (b) Spine form extraction.

proceed and stitch them together into one image. For each image in the scene except the first, we produce the projective matrix between it and the first image [7]. Stitching of all adjacent images is achieved by multiplying, in consecutive order, the matrices between the given image and the first image:

  U0 = M01 M12 ... M(n−1)n Un

Each image is then projected onto the first image using the resulting matrix.

4.3. Result

To demonstrate the performance of the method developed above, we digitized an image sequence with a camera panning over a wall (see figure 4). Figure 5a shows the final mosaic of the wall with the constituent images outlined in black. Figure 5b shows the final mosaic with the location of a single image shown as a black outline. This mosaic is 800 × 1 000 pixels, built by compositing 38 images.
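The matrix chaining used for stitching in section 4.2 can be sketched as follows (a minimal illustration with names of our choosing, not the paper's code):

```python
import numpy as np

def chain_to_first(pairwise):
    """Given the pairwise projective matrices M_{k,k+1}, return for each
    image k the composed matrix M_{0,1} M_{1,2} ... M_{k-1,k} mapping its
    coordinates into the first image's frame."""
    to_first = [np.eye(3)]
    for M in pairwise:
        to_first.append(to_first[-1] @ M)
    return to_first

# Two pure translations: image 1 is shifted by (3, 0) relative to image 0,
# and image 2 by a further (0, 2) relative to image 1
M01 = np.array([[1.0, 0.0, 3.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
M12 = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]])
mats = chain_to_first([M01, M12])  # mats[2] translates by (3, 2)
```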

5. Geometric-based image retrieval

Searching for similar image patterns in a large database can be viewed as a two-step process. In the first step, a set of image processing operations is performed on the query pattern to compute a feature representation. The second step is to search the database and retrieve the patterns that are similar in the feature space. The features can be color, shape, texture, or any other image attribute of interest. The goal of feature extraction is to transform the raw image data into a much lower-dimensional space, thus providing a compact representation of the image. It is desirable that the computed feature vectors preserve perceptual similarity: if two image patterns are visually similar, the corresponding feature vectors should also be close to each other. Image retrieval is then the problem of identifying the query image as (a part of) target images in the image database. A simple and effective retrieval scheme is to represent and match images on the basis of the spine shape of the decor; retrieved images then have spine shapes similar to that of the query image.

5.1. Contour representation

Figure 5. Image mosaic example: (a) mosaic with component locations shown as black outlines. (b) Complete color mosaic (the black square in the top left shows the size of one input tile).

Figure 7. The object boundary represented as a discrete coordinate chain. (a) Original form. (b) A small portion of the outer boundary.

The shape of objects or of regions of interest is another important image feature for content-based image retrieval. Over the past few decades, many approaches to the characterization of shape similarity have been proposed in the literature. An important class of shape analysis algorithms is based on the representation of the outer boundaries of an object [8]. In the context of our application, the shape feature is based on the spine extraction, which is achieved by clicking on each vertex of the spine (see figure 6). The spine is therefore the contour of a 2D object; it is considered a closed sequence of successive boundary pixel coordinates (xs, ys), where 0 ≤ s ≤ N − 1 and N is the total number of pixels on the boundary (see figure 7). In our experiment, a contour representation is derived from these boundary coordinates [5]. Let γ denote the curve, of length L, and O its start (reference) point. S(u) is the curvilinear coordinate of point u (u ∈ [0, L]), i.e. the length of the arc traced by u from the start point O. S(u) can be expressed as:

  S(u) = (1/L) ∫₀ᵘ || dc(l)/dl || dl,  where c(l) = x(l) + iy(l)    (6)
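A minimal sketch of the arc-length parameterization of equation (6), used here to resample a closed contour to a fixed number of points uniformly spaced along its boundary (function and variable names are ours, assumed for illustration):

```python
import numpy as np

def resample_contour(xs, ys, M=64):
    """Resample a closed boundary to M points uniformly spaced in arc
    length, following the curvilinear parameterization S(u) of eq. (6)."""
    c = np.asarray(xs, dtype=float) + 1j * np.asarray(ys, dtype=float)
    closed = np.concatenate([c, c[:1]])             # close the contour
    d = np.abs(np.diff(closed))                     # segment lengths
    s = np.concatenate([[0.0], np.cumsum(d)])       # arc length at each vertex
    L = s[-1]                                       # total contour length
    targets = np.linspace(0.0, L, M, endpoint=False)
    re = np.interp(targets, s, closed.real)         # linear interpolation
    im = np.interp(targets, s, closed.imag)         # along the boundary
    return re + 1j * im

# Unit square traced by its four corners, resampled to 8 equidistant points
square = resample_contour([0, 1, 1, 0], [0, 0, 1, 1], M=8)
```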

In order to ensure that the resulting shape features of all objects in the database have the same length, the curvilinear function S of each object is resampled to M samples before the shape descriptors are computed.

5.2. Fourier-based shape descriptors

In the area of shape analysis and classification, several shape feature representation schemes based on autoregressive (AR) models [9] and on Fourier descriptors of contours [10–12] have been proposed. An experimental comparison indicates that Fourier-based methods provide better performance than AR-based approaches, especially for noisy images [13]. For this reason we have chosen the Fourier-based shape description for our image retrieval system. The Fourier transform of a contour representation generates a set of complex coefficients. These coefficients represent the shape of the object in the frequency domain, where the lower frequencies describe the general shape and the higher frequencies the shape details. The shape feature can be extracted from these coefficients. In order to achieve rotation invariance, we use only the amplitude information of the coefficients and discard the phase component. This also allows the encoding of the contour to begin at any point along it. Note that translation invariance is obtained directly from the contour representation. The first non-zero frequency component is used to normalize the other coefficients. The shape feature vector is defined by:

  SFV = ( |C−N/2| / |C1|, ..., |C−1| / |C1|, |C2| / |C1|, ..., |CN/2| / |C1| )    (7)

where the Ck are the Fourier transform coefficients, defined by:

  Ck = (1/N) Σ_{n = −N/2}^{N/2 − 1} c(n) e^{−j 2π k n / N}

Figure 8. Some results of our image retrieval system.
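Equation (7) can be sketched with the FFT (a hedged illustration under our naming; `fftshift` is used so that coefficients are ordered C_{−N/2} .. C_{N/2−1} as in the definition of Ck):

```python
import numpy as np

def shape_feature_vector(contour):
    """Fourier shape descriptors per eq. (7): coefficient magnitudes
    normalized by |C_1|; phases are discarded for rotation and start-point
    invariance, and C_0 is dropped for translation invariance."""
    N = len(contour)
    C = np.fft.fftshift(np.fft.fft(contour) / N)  # order: C_{-N/2} .. C_{N/2-1}
    k0 = N // 2                                   # index of the DC term C_0
    C1 = np.abs(C[k0 + 1])                        # first non-zero frequency
    mags = np.abs(C) / C1
    # drop C_0 (translation) and C_1 (the normalizer itself)
    return np.concatenate([mags[:k0], mags[k0 + 2:]])

# A circle traced as c(n) = e^{2*pi*i*n/N} puts all its energy in C_1,
# so its feature vector is numerically zero
z = np.exp(2j * np.pi * np.arange(64) / 64)
fv = shape_feature_vector(z)
```

Scaling or translating the contour leaves the feature vector unchanged, consistent with the invariances claimed above.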



5.3. Experimental results

Figure 8 shows some image retrieval examples based on spine shape similarity. A Euclidean metric is used to compute the distance between two shape feature vectors. As shown, the method is invariant to scale, translation and rotation.
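The Euclidean ranking step can be sketched as follows (function name and toy vectors are ours, for illustration only):

```python
import numpy as np

def retrieve(query_fv, database_fvs, k=3):
    """Rank database entries by Euclidean distance between shape feature
    vectors; returns the indices of the k most similar images."""
    diffs = np.asarray(database_fvs) - np.asarray(query_fv)
    d = np.linalg.norm(diffs, axis=1)       # Euclidean distance per entry
    return np.argsort(d)[:k]                # nearest first

# Tiny example: entry 0 matches exactly, entry 2 is close, entry 1 is far
ranking = retrieve([0.0, 0.0], [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0]])
```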

6. Conclusion

This paper has presented a promising method for content-based retrieval of Arabo-Moresque decor images. The search is carried out from a pictorial specification defined by a spine shape extraction given by the user. The shape feature is based on Fourier descriptors, which provide good retrieval results with invariance to scale, translation and rotation. Moreover, when capturing the scene image to be indexed, our application requires both a good image resolution and the capture of the whole spine. This motivates the use of a mosaicing technique, which allows multiple overlapping images of the decor to be stitched together.

References

[1] Ma W.Y., Netra: A Toolbox for Navigating Large Image Databases, Ph.D. thesis, University of California, Santa Barbara, June 1997.
[2] Mokadem A., Distances entre Formes : Application au Codage Orienté Objet, Thèse de Doctorat, Université de Compiègne, June 1996.
[3] Zappala A., Gee A., Taylor M., Document Mosaicing, Image and Vision Computing 17 (1999) 589–595.
[4] Szeliski R., Video Mosaics for Virtual Environments, IEEE Computer Graphics and Applications, March 1996, pp. 22–30.
[5] Foley J.D., et al., Computer Graphics: Principles and Practice, 2nd edition, Addison-Wesley, Reading, MA, 1990.
[6] Szeliski R., Coughlan J., Hierarchical Spline-Based Image Registration, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'94), IEEE CS Press, Los Alamitos, CA, 1994, pp. 194–201.
[7] Gadi N., Benslimane R., Maisel E., Bouatouch K., Nouvelle approche pour les corrections d'illumination dans les mosaïques d'images, 13ièmes Journées Annuelles de l'Association Française de l'Informatique Graphique (AFIG), Grenoble, 29/11–1/12/2000.
[8] Van Otterloo P.J., A Contour-Oriented Approach to Shape Analysis, Prentice Hall, 1991.
[9] Dubois S.R., Glanz F.H., An Autoregressive Model Approach to Two-Dimensional Shape Classification, IEEE Trans. Pattern Anal. Machine Intell. 8 (1986) 55–66.
[10] Zarghili A., Gadi N., Benslimane R., Indexation des Images de Décors basée sur les Descripteurs de Fourier, 6èmes Journées d'Études et d'Échanges: COmpression et REprésentation des Signaux Audiovisuels (CORESA'2000), Futuroscope Poitiers, France, October 2000.
[11] Persoon E., Fu K., Shape Discrimination Using Fourier Descriptors, IEEE Trans. Syst. Man Cybern. 7 (1977) 170–179.
[12] Zahn C.T., Roskies R.Z., Fourier Descriptors for Plane Closed Curves, IEEE Trans. Computers 21 (1972) 269–281.
[13] Kauppinen H., Seppänen T., Pietikäinen M., An Experimental Comparison of Autoregressive and Fourier-Based Descriptors in 2D Shape Classification, IEEE Trans. Pattern Anal. Machine Intell. 17 (1995) 201–207.
