Personal News Retrieval System is a client-server application that delivers news segments on demand over a variety of information networks. At the server side, the news stories are segmented out from the digitized TV broadcast, then classified and filtered based on consumers' preferences. At the client side, the user can access the preferred video news through the Web and watch stored video news in the preferred order. Browsing preferences can be set based on anchorperson, broadcaster, category, location, top stories and keywords. This system can be used to set up a news service run by content providers or independent media distribution companies. However, in the new era of enhanced PC/TV appliances, it is foreseeable that the whole system could run in the living room on a personal device. This paper describes the chosen server architecture, the limitations of the system, and solutions that can be implemented in the future.
There has been significant progress in the area of content-based still image retrieval systems. However, most existing visual information systems use static feature analysis models decided by database implementers based on heuristics, and adopt indexing-oriented data modeling. In other words, such systems fall short in a number of areas, including scalability, extensibility and adaptability. In this paper, we attempt to resolve the problems that surface in content modeling, description and sharing of distributed heterogeneous multimedia information. A language named UCDL for heterogeneous multimedia content description is presented to resolve these problems. The resulting UCDL facilitates a formal modeling method for complex multimedia content, a unified content description scheme, and the exchange of heterogeneous content information. The proposed language has several advantages. For instance, an individual user can easily create audiovisual descriptions by using a library of automated tools. Users can perform automated testing of a content description for correctness and completeness before populating the database and using it. Note that with UCDL, content description becomes implementation independent, thus offering portability across a number of applications from authoring tools to database management systems. Users can have personalized retrieval views through content filtering, and can easily share the heterogeneous content descriptions of various information sources.
The metaphor of film and TV permeates the design of software to support video on the PC. Simply transplanting the non-interactive, sequential experience of film to the PC fails to exploit the virtues of the new context. Video on the PC should be interactive and non-sequential. This paper experiments with a variety of tools for using video on the PC that exploit the new context of the PC. Some features are more successful than others. Applications that use these tools are explored, primarily the home video archive but also streaming video servers on the Internet. The ability to browse, edit, abstract and index large volumes of video content such as home video and corporate video is a problem without an appropriate solution in today's market. The tools currently available are complex, unfriendly video editors that require hours of work to prepare a short home video, far more work than a typical home user can be expected to provide. Our proposed solution treats video like a text document, providing functionality similar to a text editor. Users can browse, interact, edit and compose one or more video sequences with the same ease and convenience as handling text documents. With this level of text-like composition, we call what is normally a sequential medium a 'video document'. An important component of the proposed solution is shot detection, the ability to detect when a shot started or stopped. When combined with a spreadsheet of key frames, the video becomes a grid of pictures that can be manipulated and viewed in the same way that a spreadsheet can be edited. Multiple video documents may be viewed, joined, manipulated, and seamlessly played back. Abstracts of unedited video content can be produced automatically to create novel video content for export to other venues. Edited and raw video content can be published to the net or burned to a CD-ROM with a self-installing viewer for Windows 98 and Windows NT 4.0.
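The abstract does not spell out the shot detection step; a minimal sketch of one common approach, comparing normalized gray-level histograms of consecutive frames against a fixed threshold (the bin count and threshold value are assumptions, not the authors' parameters), is:

```python
import numpy as np

def detect_shot_boundaries(frames, threshold=0.4):
    """Flag a shot boundary where consecutive frame histograms differ.

    frames: iterable of 2D grayscale arrays (uint8).
    threshold: L1 histogram distance cutoff in [0, 2] (assumed value).
    """
    boundaries = []
    prev_hist = None
    for i, frame in enumerate(frames):
        hist, _ = np.histogram(frame, bins=64, range=(0, 256))
        hist = hist / hist.sum()  # normalize so frame size does not matter
        if prev_hist is not None and np.abs(hist - prev_hist).sum() > threshold:
            boundaries.append(i)  # frame i starts a new shot
        prev_hist = hist
    return boundaries
```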
Video clips are the dominant component of multimedia systems. However, video data are voluminous, so an effective and efficient visual data management system is highly desired. Recent technology in digital video processing has moved to 'content-based' storage and retrieval. To detect meaningful areas/regions, using only production and camera operation-based detection is not enough; the contents of a video also have to be considered. The basic idea of this scheme is that if we can distinguish individual objects in the whole video sequence, we will be able to capture the changes in content throughout the sequence. Among many object features, motion content has been widely used as an important key in video storage and retrieval systems. Therefore, through motion-based representation, this paper investigates an algorithm for sub-shot extraction and key-frame selection. From a given video sequence, we first segment the sequence into shots by using some of the production and camera operation-based detection techniques. Then, from the beginning of each shot, we calculate optical flow vectors by using a complex wavelet phase-matching-based method on pairs of successive frames. Next, we segment each moving object based on these vectors using clustering in a competitive agglomeration scheme and represent the objects as a number of layers. After separating the moving objects from each other for every frame in the shot, we extract sub-shots and select key-frames by using information about the presence and absence of moving objects in each layer. Finally, these key-frames and sub-shots are used to represent the whole video in a panoramic mosaic-based representation. Experimental results showing the significance of the proposed method are also provided.
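The authors compute optical flow with a complex wavelet phase-matching method, which is not reproduced here; as a stand-in for illustration, a dense flow field between successive grayscale frames can be obtained with Farneback's algorithm in OpenCV:

```python
import cv2
import numpy as np

def dense_flow(prev_gray, next_gray):
    """Per-pixel (dx, dy) motion field between two consecutive frames."""
    return cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

# The flow vectors can then be clustered into moving-object layers;
# a crude first cut is to threshold the flow magnitude:
# flow = dense_flow(f0, f1)
# moving_mask = np.linalg.norm(flow, axis=2) > 1.0  # threshold assumed
```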
Scene segmentation within a video is an important issue for easy and fast content-based access to on-line video databases. In this paper, we introduce the classification of shots into 'exterior shots' as a new clue with the aim of performing automatic scene segmentation within a video. Indeed, whether a shot takes place outside or inside is crucial information in film grammar. Based on the luminance intensity variation caused by natural and artificial lights, our method detects and distinguishes exterior and interior shots. Our technique follows two steps. First, the luminance intensity of every image pixel is calculated by applying a linear image transformation from the CIE 3D color solid space RGB to the NTSC 3D perceptual chromaticity coordinates YIQ; we then analyze the maximum and minimum luminance intensity values from a mosaic version of the image, leading to a classification into interior and exterior shots. The experiments we have carried out so far show that our technique achieves a successful classification rate of up to 95 percent. The further work we have been undertaking shows that our method can also distinguish day and night lighting within images. These techniques are being used for automatic scene segmentation within a video.
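The RGB-to-YIQ step is a standard linear transform; only its luminance (Y) row matters for this method. A minimal sketch of the Y computation plus the mosaic extrema analysis (the block size and any decision rule on the extrema are assumptions, since the abstract does not give them):

```python
import numpy as np

def luminance_yiq(rgb):
    """Y channel of the NTSC YIQ transform (rgb: HxWx3 floats in [0, 1])."""
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def mosaic_extrema(y, block=16):
    """Max and min of block-mean luminance -- a mosaic version of the image."""
    h, w = y.shape
    m = y[:h - h % block, :w - w % block]
    m = m.reshape(h // block, block, w // block, block).mean(axis=(1, 3))
    return m.max(), m.min()
```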
The video coding scheme defined by the MPEG-4 standard offers several content-based functionalities, demanding a description of the scene in terms of so-called video objects. The separate coding of video objects may enrich user interaction in several multimedia services due to flexible access to the bit-stream and easy manipulation of the video information. In this framework, the coder may perform a locally defined pre-processing aimed at the automatic identification of the objects appearing in the sequence. Hence, video segmentation is a key issue in efficiently applying the MPEG-4 coding scheme. This paper presents a segmentation algorithm based on the watershed algorithm and optical flow motion estimation. Our simulation results show that this method is able to solve complex segmentation tasks according to luminance homogeneity and motion coherence criteria.
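A minimal sketch of marker-based watershed segmentation (the standard OpenCV recipe; the Otsu thresholding and distance-transform marker selection are common defaults, not necessarily the authors' choices), which yields luminance-homogeneous regions that motion estimation can then merge into video objects:

```python
import cv2
import numpy as np

def watershed_regions(bgr):
    """Label luminance-homogeneous regions in a frame via watershed."""
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Sure foreground: peaks of the distance transform
    dist = cv2.distanceTransform(binary, cv2.DIST_L2, 5)
    _, fg = cv2.threshold(dist, 0.5 * dist.max(), 255, 0)
    fg = fg.astype(np.uint8)
    # Sure background: dilated binary mask; in between is 'unknown'
    bg = cv2.dilate(binary, np.ones((3, 3), np.uint8), iterations=3)
    unknown = cv2.subtract(bg, fg)
    _, markers = cv2.connectedComponents(fg)
    markers = markers + 1          # shift so background label is 1, not 0
    markers[unknown == 255] = 0    # zero marks pixels to be flooded
    return cv2.watershed(bgr, markers)  # region labels; boundaries are -1
```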
Video database research is commonly concerned with the storage and retrieval of visual information, involving sequence segmentation, shot representation and video clip retrieval. In multimedia applications, video sequences are usually accompanied by a sound track. The sound track contains potential cues to aid shot segmentation, such as different speakers, background music, singing and distinctive sounds. These different acoustic categories can be modeled to allow for effective database retrieval. In this paper, we address the problem of automatic segmentation of the audio track of multimedia material. This audio-based segmentation can be combined with video scene shot detection in order to partition the multimedia material into semantically significant segments.
While previous research on audiovisual data segmentation and indexing primarily focuses on the pictorial part, significant clues contained in the accompanying audio flow are often ignored. A fully functional system for video content parsing can be achieved more successfully through a proper combination of audio and visual information. By investigating the data structure of different video types, we present tools for both audio and visual content analysis and a scheme for video segmentation and annotation in this research. In the proposed system, video data are segmented into audio scenes and visual shots by detecting abrupt changes in audio and visual features, respectively. Then, each audio scene is categorized and indexed as one of the basic audio types, while a visual shot is represented by keyframes and associated image features. An index table is then generated automatically for each video clip based on the integration of outputs from audio and visual analysis. It is shown that the proposed system provides satisfactory video indexing results.
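The abstract leaves the integration step abstract; one simple way to combine the two boundary streams (the tolerance and the confirmation rule are assumptions for illustration) is to keep the visual shot boundaries that are confirmed by a nearby audio scene change:

```python
def confirmed_boundaries(audio_breaks, shot_breaks, tolerance=0.5):
    """Shot boundaries (seconds) lying within `tolerance` seconds
    of some audio scene change -- candidates for true scene breaks."""
    return [t for t in shot_breaks
            if any(abs(t - a) <= tolerance for a in audio_breaks)]

# Example: audio changes at 12.1s and 64.0s; shots cut at 12.3s, 30.0s, 63.8s
print(confirmed_boundaries([12.1, 64.0], [12.3, 30.0, 63.8]))  # [12.3, 63.8]
```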
In this paper we describe music retrieval in ICOR, a project of Darmstadt TU. The goal of ICOR is to find new interfaces to support applications of music videos and music CDs. Although the project covers both audio and video analysis, we concentrate on a description of the audio algorithms in this paper. We describe our MPEG-7-like data structure for storing meta-information for music pieces and explain which algorithms we use to analyze the content of music pieces automatically. We currently use applause detection to distinguish live music from studio recordings, a genre classifier to distinguish pieces with beats from classical music, and singer recognition.
This paper presents a study of the correlation between features automatically extracted from the audio stream and the video stream of audiovisual documents. In particular, we were interested in finding out whether speech analysis tools could be combined with face detection methods, and to what extent they should be combined. A generic audio signal partitioning algorithm was first used to detect Silence/Noise/Music/Speech segments in a full-length movie. A generic object detection method was applied to the keyframes extracted from the movie in order to detect the presence or absence of faces. The correlation between the presence of a face in the keyframes and of the corresponding voice in the audio stream was studied. A third stream, the script of the movie, is warped onto the speech channel in order to automatically label faces appearing in the keyframes with the name of the corresponding character. We found that the extracted audio and video features were related in many cases, and that significant benefits can be obtained from the joint use of audio and video analysis methods.
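The paper does not state which correlation statistic it uses; for two binary series (face present in a keyframe, speech present in the matching audio segment) the phi coefficient, which is simply Pearson correlation applied to 0/1 data, is a natural choice:

```python
import numpy as np

def phi_coefficient(face_present, speech_present):
    """Correlation of two equal-length 0/1 series, one entry per keyframe."""
    f = np.asarray(face_present, dtype=float)
    s = np.asarray(speech_present, dtype=float)
    return np.corrcoef(f, s)[0, 1]

# Example: faces in keyframes 0, 1, 3; speech under keyframes 0, 1, 2
print(phi_coefficient([1, 1, 0, 1], [1, 1, 1, 0]))
```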
Modeling human shape similarity judgments involves identifying perceptually significant image elements, selecting appropriate features to represent their shape, and computing suitable similarity measures. This paper is concerned with the first of these: identification of the way in which humans segment abstract trademark images. A sample of 63 trademark images was shown to several groups of students from different subject backgrounds in two experiments. Students were first presented with printed versions of a number of abstract trademark images and invited to sketch their preferred segmentation of each image. A second group of students was then shown each image, plus its set of alternative segmentations, and invited to rank each alternative in order of preference. The degree of agreement over how images should be segmented varied substantially from one image to another. Qualitative analysis of our results suggested that participants used a relatively small number of segmentation strategies, reflecting well-known psychological principles. Agreement between human image segmentations and those generated by our ARTISAN trademark retrieval system was quite limited, indicating that ARTISAN is currently capable of modeling only a small subset of the mechanisms used by human participants. The implications of these experiments for the future development of ARTISAN are discussed.
The ability to characterize the important features of images is vital when responding to queries directed to an image database. Ideally, to produce results that are satisfactory to a human user, the system should employ methods for characterizing the image that are similar to those used by the human visual system. However, global mathematical techniques such as histogram or frequency-based transforms bear little resemblance to the early processing steps performed on the visual stream by the human visual system as it is passed from the eye into the brain. This paper presents a model that employs a sequence of spatial convolutions and thresholding operations to mimic the neural behavior of the visual pathway, including the bipolar cells, the horizontal cells and the retinal ganglion cells of the retina, the Lateral Geniculate Nucleus, and the simple cells of the primary visual cortex. When this model is applied to an image, the result is a 2D pattern of excitation similar to that observed by neural scientists in the primary visual cortex of primates. Given that the excitation pattern in the primary visual cortex is the basis for virtually all higher-level visual processing, this pattern represents the visual system's selection of the most important features in the image. By processing this 2D pattern in ways similar to the later stages of the human visual system, an image archiving system could characterize an image in ways that are similar to the human visual system. To demonstrate one use for the model, this paper uses the patterns generated from images of several complex 3D textures to determine the direction of the light falling on each of them.
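The paper's full multi-stage model is not given in the abstract; a single stage of it, the center-surround response of retinal ganglion cells, is commonly approximated by a difference-of-Gaussians convolution followed by thresholding. A minimal sketch (the sigmas and threshold are assumed values):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def center_surround(image, sigma_center=1.0, sigma_surround=3.0, thresh=0.05):
    """Binary 'excitation pattern' from a difference-of-Gaussians filter,
    a crude stand-in for one retinal processing stage."""
    img = image.astype(float) / 255.0
    response = (gaussian_filter(img, sigma_center)
                - gaussian_filter(img, sigma_surround))
    return response > thresh
```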
In this paper we present a method for retrieval of color images via query-by-example which takes into consideration human perception regarding the number of colors present in the images being compared. First, we perform HSV-space segmentation to extract regions of perceptually relevant color and build a low-level color representation. This segmented result is then used to build image indices by taking the representative color vector of each of the extracted color regions. For retrieval, we implement a vector-angular-based measure and a perceptually tuned membership function, instead of color histograms, which provides results consistent with experimentally obtained human judgments. Through human testing, the collected perceptual data governs how many colors two images can have in relation to each other for the images to be considered in similarity calculations. The initial results show that there is a well-defined range for this color cardinality, which increases as the number of colors increases.
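The exact vector-angular measure and membership function are not given in the abstract; a common angular similarity for two representative color vectors, mapping identical directions to 1 and orthogonal ones to 0 (this specific normalization is an assumption), is:

```python
import numpy as np

def angular_similarity(c1, c2):
    """Similarity of two representative color vectors from their angle."""
    cos = np.dot(c1, c2) / (np.linalg.norm(c1) * np.linalg.norm(c2))
    theta = np.arccos(np.clip(cos, -1.0, 1.0))  # angle between the vectors
    return 1.0 - theta / (np.pi / 2)            # 1 = parallel, 0 = orthogonal

print(angular_similarity(np.array([255.0, 0, 0]), np.array([200.0, 30, 30])))
```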
In this paper, we describe a unique new paradigm for video database management known as ViBE. ViBE is a browseable/searchable paradigm for organizing video data containing a large number of sequences. We describe how ViBE performs on a database of MPEG sequences.
In this paper we propose a tool for performing queries on image databases based on image content. The image content is saved as a feature set belonging to each image. The feature set is composed of color and location features, but can easily be extended to include features like texture, shape, etc. The features are extracted by a new method which processes the histogram of each color component at different scales and merges the results to find the dominant colors in the image. The database access and user interface are written using VBScript and ASP, and the gateway to the database is implemented with ADO. One of the main advantages of this tool is the use of ASP rather than CGI or Java.
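A minimal sketch of the multi-scale histogram idea: smooth a color-component histogram at several scales, find peaks at each scale, and keep only the peaks that survive across scales (the scales, prominence cutoff, and matching tolerance are assumptions, not the paper's parameters):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import find_peaks

def dominant_colors(channel, scales=(2, 4, 8)):
    """Histogram peaks (bin indices) that persist across smoothing scales."""
    hist, _ = np.histogram(channel, bins=256, range=(0, 256))
    surviving = None
    for sigma in scales:
        smoothed = gaussian_filter1d(hist.astype(float), sigma)
        peaks, _ = find_peaks(smoothed, prominence=smoothed.max() * 0.05)
        peaks = [int(p) for p in peaks]
        if surviving is None:
            surviving = peaks
        else:  # a peak survives if some peak at this scale is within 4 bins
            surviving = [p for p in surviving
                         if any(abs(p - q) <= 4 for q in peaks)]
    return sorted(surviving)
```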
Content-based image search and retrieval using high-level features is very important, but many difficulties remain to be solved. One of them is automatic feature extraction, which is closely related to finding the desired objects in images. As one such high-level content-based retrieval system, we propose a system using high-level features, especially human-being information, and give some results from the partially implemented system. Since a human-being information retrieval system has various applications, this attempt at developing content-based retrieval using human-being information can be a contribution to the area.
This paper addresses visual similarity measurement and the hierarchical grouping of the most representative frames for the purpose of video abstraction. Our approach concentrates on measuring the similarity of image regions. To produce a visual similarity measure, we use as primary information the color histograms in the YUV color space. The difference from previous histogram-based approaches is that we divide the input images into rectangles whose sizes depend on the local 'structure' of the image. We assume that similar regions in two different images will have approximately the same rectangle structure. Therefore, it should be enough to compare the color histograms of the pixels within these rectangles in order to determine the similarity of two regions in two different images. We measure similarity between regions by a similarity score that is asymmetric. Such a measure cannot be used in classical clustering techniques for grouping representative frames. Our approach is therefore based on graph-theoretic techniques. First, we construct an oriented weighted graph having as vertices the original set of key-frames. Next, we construct the set of weighted edges according to the similarity values computed for each ordered pair of key-frames. Finally, we transform this graph into a collection of two-level trees, whose root key-frames form an abstract of the original ones. For graph construction and transformation, we present two algorithms. The experiments we performed with the proposed technique showed improvements in the way the visual content is represented. This conclusion is based on subjective assessment of the resulting groupings and the selection of the most representative key-frames.
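A minimal sketch of the graph construction step, with the paper's asymmetric region-based score abstracted behind a `similarity(a, b)` callback and an assumed edge-creation threshold:

```python
def build_similarity_digraph(keyframes, similarity, threshold=0.6):
    """Oriented weighted graph over keyframes.

    similarity(a, b) is an asymmetric score, so edge (i -> j) may get a
    different weight than (j -> i); edges below `threshold` are dropped.
    Returns {(i, j): weight}.
    """
    edges = {}
    for i, a in enumerate(keyframes):
        for j, b in enumerate(keyframes):
            if i == j:
                continue
            s = similarity(a, b)
            if s >= threshold:
                edges[(i, j)] = s
    return edges
```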
An effective analysis of Visual Objects appearing in still images and video frames is required in order to offer fine-grained access to multimedia and audiovisual contents. In previous papers, we showed how our method for segmenting still images into visual objects could improve content-based image retrieval and video analysis methods. Visual Objects are used in particular for extracting semantic knowledge about the contents. However, low-level segmentation methods for still images are not likely to extract a complex object as a whole, but instead as a set of several sub-objects. For example, a person would be segmented into three visual objects: a face, hair, and a body. In this paper, we introduce the concept of the Composite Visual Object. Such an object is hierarchically composed of sub-objects called Component Objects.
In this paper, a fast and robust automatic moving object segmentation system is introduced. The system integrates spatial and temporal segmentation results to separate moving objects from the still background in video frames. For the spatial segmentation, the intensity histogram of each frame is smoothed using a numerical diffusion algorithm, producing a multimodal probability density function consisting of Gaussian kernels. Then, the Laplacian operator is applied to the smoothed histogram in order to identify the dominant intensities, and the frame is segmented into separate regions according to these dominant intensities. This new method produces stable spatial region segmentation results at relatively low computational cost. Based on the segmented regions, moving objects can be detected by a statistical scene model that segments the object boundaries accurately throughout the entire video sequence.
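A minimal sketch of the spatial step, with Gaussian smoothing standing in for the paper's numerical diffusion and the discrete second difference standing in for the Laplacian (the smoothing width is an assumed value):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def dominant_intensities(gray_frame, sigma=5.0):
    """Modes of the smoothed intensity histogram (bin indices 0..255)."""
    hist, _ = np.histogram(gray_frame, bins=256, range=(0, 256))
    pdf = gaussian_filter1d(hist.astype(float), sigma)
    lap = np.diff(pdf, 2)  # discrete Laplacian (second difference)
    # A mode: negative curvature and a local maximum of the smoothed pdf
    return [i + 1 for i in range(len(lap))
            if lap[i] < 0 and pdf[i + 1] >= pdf[i] and pdf[i + 1] >= pdf[i + 2]]
```

Each pixel can then be assigned to the region of its nearest dominant intensity to form the spatial segmentation.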
Object and camera movements are important clues that can be used for better video and image understanding in content-based image and video indexing. In this paper, we propose a new general technique for object and 3D camera movement detection based on 2D hints extracted from 2D images. Our approach consists of first extracting 2D hints based on object contours and calculating their different-order derivatives. We then apply our pattern matching method to obtain object movement vectors and, with the help of 3D projection theory, derive the camera movement description in 3D space. The further work we have been undertaking shows that 2D and 3D hints combined with movement vectors can lead to a 3D scene description. Some experimental evidence is also provided.
In this paper, we present FaceTrack, a system that detects, tracks, and groups faces from compressed video data. We introduce a face tracking framework based on the Kalman filter and multiple-hypothesis techniques. We compare and discuss the effects of various motion models on tracking performance. Specifically, we investigate constant-velocity, constant-acceleration, correlated-acceleration, and variable-dimension-filter models. We find that constant-velocity and correlated-acceleration models work more effectively for commercial videos sampled at high frame rates. We also develop novel approaches based on multiple-hypothesis techniques to resolve ambiguity issues. Simulation results show the effectiveness of the proposed algorithms in tracking faces in real applications.
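A minimal sketch of the constant-velocity model the paper favors: a Kalman filter over the state [x, y, vx, vy] tracking a face centroid (the noise magnitudes are assumptions):

```python
import numpy as np

class ConstantVelocityKalman:
    """Kalman filter with a constant-velocity motion model."""

    def __init__(self, dt=1.0, q=1e-2, r=1.0):
        self.F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                           [0, 0, 1, 0], [0, 0, 0, 1]], float)  # dynamics
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)  # observe (x, y)
        self.Q = q * np.eye(4)  # process noise covariance (assumed)
        self.R = r * np.eye(2)  # measurement noise covariance (assumed)
        self.x = np.zeros(4)
        self.P = np.eye(4)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]  # predicted face position

    def update(self, z):
        y = np.asarray(z, float) - self.H @ self.x   # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)     # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
```

The predicted position can be matched against faces detected in the next frame; unmatched predictions feed the multiple-hypothesis bookkeeping.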
An image retrieval method based on a shape similarity measure is presented for multimedia and imaging database systems. In the proposed algorithm, the spatial and spectral properties of images are combined using the Radon transform, bispectra, and principal component analysis. For each model image in the database, the original 2D image data are reduced to a set of 1D projections via the Radon transform, and then a feature vector is calculated from the bispectra of the resulting 1D functions. Principal component analysis is applied to further reduce the dimension of the feature vector so that it can be stored along with the original image in the database at a small memory cost. The derived feature vector is considered the index or key of the corresponding image, which uniquely identifies the image independent of rotation, translation, and scaling. For image retrieval, the feature vector is computed for a query image and matched against the feature vectors of all the model images in the database using the Tanimoto similarity measure. The closely matching images are returned as the search results. The proposed technique has been tested on a large image database. The experimental results show that the retrieval accuracy is very high even for query images with a low signal-to-noise ratio.
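The Tanimoto measure for real-valued feature vectors is standard and compact; it returns 1 for identical vectors:

```python
import numpy as np

def tanimoto(x, y):
    """Tanimoto similarity of two real-valued feature vectors."""
    dot = np.dot(x, y)
    return dot / (np.dot(x, x) + np.dot(y, y) - dot)

# Ranking a query against a dict of stored feature vectors:
# results = sorted(db.items(), key=lambda kv: -tanimoto(query_vec, kv[1]))
```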
The use of image content analysis and image clustering techniques to organize an image database with a great variety of collections is investigated in this work. The objective is to bridge the gap between low-level features and their high-level semantic meanings. We approach this goal by using both coarse and fine classifications in image database organization. Image content analysis serves as the major tool in coarse classification. A set of typical image collections is studied by training on their low-level feature vectors. Clusters of representative low-level features are further provided in the form of semantic templates to give fine-level classification clues, achieving good query performance and serving as a supporting tool for browsing. With these multiple-feature semantic templates, an interactive retrieval process can be conveniently implemented that incorporates the user's feedback to achieve the desired query.
In this paper, to improve the retrieval effectiveness of a content-based image retrieval system, a shape-based object matching method is presented. A new skeleton structure is proposed as the shape representation. The skeleton structure represents an object in a hierarchical manner such that high-level nodes describe the coarse trunk of the object and low-level nodes describe fine details. Each low-level node refines the shape of its parent node. Most noise disturbances are limited to the bottom levels, and the effect of boundary noise is reduced by decreasing the weights on the bottom levels. To compute the similarity of two skeleton structures, we consider the best match of spine nodes, the nodes in level one of the structure. Both moment invariants and Fourier descriptors are used to compute the similarities of sub-regions. We evaluated the retrieval accuracy and compared the results to those of other shape similarity measures. Experimental results showed that our system gives high accuracy in retrieval.
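The moment invariants mentioned are typically Hu's seven; a minimal sketch comparing two binary sub-region masks by log-scaled Hu moments (the log scaling and Euclidean distance are common conventions, not necessarily the paper's; the Fourier-descriptor part is omitted):

```python
import cv2
import numpy as np

def hu_vector(mask):
    """Log-scaled Hu moment invariants of a binary region mask."""
    m = cv2.HuMoments(cv2.moments(mask.astype(np.uint8))).ravel()
    return -np.sign(m) * np.log10(np.abs(m) + 1e-30)  # compress dynamic range

def region_distance(region_a, region_b):
    """Smaller = more similar sub-region shapes."""
    return np.linalg.norm(hu_vector(region_a) - hu_vector(region_b))
```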
This paper introduces a new shape-based image retrieval scheme. The scheme employs image edge directional similarity in generating the feature vector. Previous comparable schemes consider the edge point angles without any regard for the neighboring edge points. Those schemes offer advantages such as translation, scaling and rotation invariance, but they suffer from high sensitivity to false edges or a high computational barrier, and from a lack of spatial-domain correlation among the feature vector components. The new scheme retains the advantages of the previous schemes, because it employs the edge point angles, but overcomes their drawbacks by considering edge point directional similarity. The scheme uses two factors in producing the feature vector, which improves retrieval effectiveness: the edge point angle and amplitude, and their relation to neighboring edge points. The scheme is robust against noise, since noise has little effect on similar directional edges; consequently there is no need for extra computation to identify false edges. This results in a low computational cost compared with other similar schemes.
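A minimal sketch of an amplitude-weighted edge direction histogram, the kind of feature vector such schemes start from (the bin count and the magnitude threshold used to discard weak, likely false edges are assumptions):

```python
import cv2
import numpy as np

def edge_direction_histogram(gray, bins=36, mag_thresh=50.0):
    """Normalized histogram of edge directions, weighted by edge amplitude."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    keep = mag > mag_thresh  # weak edges are likely noise; drop them
    hist, _ = np.histogram(ang[keep], bins=bins, range=(0, 360),
                           weights=mag[keep])
    return hist / (hist.sum() + 1e-9)  # normalization gives scale invariance
```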
As more and more images are used in HTML documents, effective WWW image retrieval systems are required to locate relevant images. The aims in designing and developing such a retrieval system are to improve its retrieval performance, so that it has a high ability to retrieve relevant images and reject irrelevant ones, and to make it easy to use. We describe an approach integrating text-based and content-based techniques to take advantage of their complementary strengths and meet these two aims. Our experimental results show that the integrated approach has higher retrieval performance than either the text-based or the content-based techniques alone, and that it makes it easy to start a search process.
Internet piracy has been one of the major concerns for Web publishing. In this study we present RIME, a system we have prototyped for detecting unauthorized image copying on the WWW. To speed up copy detection, RIME uses a new clustering/hashing approach that first clusters similar images on adjacent disk cylinders and then builds indexes to access the clusters. Searching for the replicas of an image often takes just one I/O to look up the location of the cluster containing similar objects and one sequential file I/O to read in that cluster. Our experimental results show that RIME can detect image copies both more efficiently and more effectively than traditional content-based image retrieval systems that use tree-like structures to index images. In addition, RIME copes well with image format conversion, resampling, requantization and geometric transformation.
An image retrieval system based on an information embedding scheme is proposed. Using relevance feedback, the system gradually embeds correlations between images from a high-level semantic perspective. The system starts with low-level image features and acquires knowledge from users to correlate different images in the database. Through the selection of positive and negative examples for a given query, the semantic relationships between images are captured and embedded into the system by splitting/merging image clusters and updating the correlation matrix. Image retrieval is then based on the resulting image clusters and the correlation matrix obtained through relevance feedback.
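The abstract does not give the update rule; a minimal sketch of one plausible correlation-matrix update after a feedback round (the additive rule and the learning rate are assumptions, not the authors' scheme):

```python
import numpy as np

def update_correlations(C, query, positives, negatives, lr=0.1):
    """Nudge pairwise image correlations after one relevance-feedback round.

    C: NxN symmetric matrix with entries in [-1, 1]; query, positives,
    and negatives are image indices chosen by the user.
    """
    for p in positives:   # user said: p is semantically like the query
        C[query, p] = C[p, query] = min(1.0, C[query, p] + lr)
    for n in negatives:   # user said: n is not what was meant
        C[query, n] = C[n, query] = max(-1.0, C[query, n] - lr)
    return C
```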
In this paper, we present an application designed to permit specification of, and search for, spatio-temporal phenomena in image sequences of the solar surface acquired via satellite. The application is designed to permit space scientists to search archives of imagery for well-defined solar phenomena, including solar flares; such search tasks are not practical if performed manually due to the large data volumes.
We propose a robust non-parametric algorithm for the segmentation of 1D histograms. This algorithm is based on the empirical distribution function of the data and the resulting partitioning of the histogram can be used to identify salient regions in images. Furthermore, we report on experiments where these extracted salient regions are used for region-based similarity matching.
It has been widely recognized that the difference between the level of abstraction of the formulation of a query and that of the desired result calls for the use of learning methods that try to bridge this gap. Cox et al. have proposed a Bayesian method to learn the user's preferences during each query.
There have been tremendous technological advances in the areas of processors, mass storage devices, gigabit networks, and information capturing instruments over the past several years. These advances have made it feasible for a much broader community to access, through the Internet, digital libraries and multimedia databases that contain large quantities of high-quality video, images, audio, and textual content. The sheer volume of multimedia content, unlike Web pages, which are almost exclusively indexed by text, prevents any single company or party from having full knowledge of all the content available. Given that multiple content repositories are emerging, and that it is safe to predict no two will be the same in the ways analysis, query and retrieval are performed, it is not hard to foresee interoperability challenges ahead in such a heterogeneous environment.
The management of visual object databases is an essential function of visual information systems, and the efficient retrieval of the associated data objects is vital to the operation of these systems. In this paper, we use the number of visual objects retrieved per second as a measure of the throughput of visual object databases. An efficient storage organization technique for executing visual queries is studied: we group similar visual objects together as visual object groups and store each visual object group at consecutive physical locations. We propose a structured high-level indexing system that can cater for the similarity criteria employed in the application domain. Our system incorporates classification hierarchies into an indexing superstructure of metadata, context and content, using high-level content descriptions. Database performance is quantified using queuing analyses, and we show that our technique is able to significantly increase throughput and database performance.
Linear optimization queries appear in many application domains in the form of ranked lists subject to a linear criterion. Surveys such as the top 50 colleges, the best 20 towns to live in, and the ten most costly cities are often based on linearly weighted factors. The importance of linear modeling to information analysis and retrieval thus cannot be overemphasized. Limiting returned results to the extreme cases is an effective way to filter the overwhelmingly large amount of unprocessed data. This paper discusses the construction, maintenance and utilization of a multidimensional indexing structure for processing linear optimization queries. The proposed indexing structure enables fast query processing and has minimal storage overhead. Experimental results demonstrate that the proposed indexing achieves significant performance gains, such as retrieving the top 100 records out of a million about 100 times faster than a linear scan. In this structure, a data record is indexed by its depth in a layered convex hull. The convex hull is the boundary of the smallest convex region containing a given set of points in a metric space. It has long been known from linear programming theory that the maximum and minimum of a linear function always occur at some vertex of the convex hull. We apply this simple fact to build a multi-layered convex structure, which enables highly efficient retrieval for any dynamically issued linear optimization criterion.
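A minimal sketch of the layered ('onion peeling') construction using SciPy's convex hull. Because the i-th best record under any linear criterion lies within the first i layers, a top-k query only needs to scan the first k layers (a tight implementation prunes even further with per-layer bounds):

```python
import numpy as np
from scipy.spatial import ConvexHull

def onion_layers(points):
    """Partition point indices into convex hull layers, outermost first."""
    pts = np.asarray(points, float)
    remaining = np.arange(len(pts))
    layers = []
    while len(remaining) > pts.shape[1]:  # a d-dim hull needs > d points
        hull = ConvexHull(pts[remaining])
        layers.append(remaining[hull.vertices])
        remaining = np.delete(remaining, hull.vertices)
    if len(remaining):
        layers.append(remaining)          # leftover interior points
    return layers

def top_k(points, w, layers, k):
    """Top-k record indices under the linear criterion w."""
    pts = np.asarray(points, float)
    candidates = [i for layer in layers[:k] for i in layer]
    scores = pts[candidates] @ np.asarray(w, float)
    order = np.argsort(-scores)[:k]
    return [candidates[i] for i in order]
```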
This paper presents a novel video database system, which caters for complex and long videos, such as documentaries, educational videos, etc. As compared to relatively structured format videos like CNN news or commercial advertisements, this database system has the capacity to work with long and unstructured videos.
Media Server Issues: Compression, Storage, and Retrieval
We have previously proposed an algorithm to smooth the transmission of pre-recorded VBR media streams. It takes O(n) time; since n is large, that algorithm is not suitable for online resource management and admission control in media servers. To resolve this drawback, we have explored the optimal tradeoff among resources with an O(n log n) algorithm. Based on the pre-computed resource tradeoff function, the resource management and admission control procedure becomes as simple as table hashing. However, this approach requires O(n) space to store and maintain the resource tradeoff function. In this paper, at the cost of some extra resources, a linear-time algorithm is proposed to approximate the resource tradeoff function by piecewise line segments. We prove that the number of line segments in the obtained approximation function is minimized for the given extra resources. The proposed algorithm has been applied to approximate the bandwidth-buffer tradeoff function of the real-world Star Wars movie. When an extra 0.1 Mbps of bandwidth is given, the storage space required for the approximation function is over 2000 times smaller than that required for the original function. When an extra 10 KB buffer is given, the storage space for the approximation function is over 2200 times smaller than that required for the original function. The proposed algorithm is practical for resource management and admission control in real-world media servers.
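To illustrate the space saving, here is a greedy piecewise-linear approximation of a sampled tradeoff function within a vertical tolerance eps (the extra resource). This greedy sketch is not the paper's minimal-segment algorithm, only an illustration of the idea:

```python
def approximate_piecewise(samples, eps):
    """Breakpoints of a piecewise-linear curve within eps of the samples.

    samples: list of (x, y) points of the tradeoff function, sorted by x.
    Greedily extends each segment as far as the tolerance allows.
    """
    breakpoints = [samples[0]]
    i = 0
    while i < len(samples) - 1:
        j = len(samples) - 1
        while j > i + 1:
            x0, y0 = samples[i]
            x1, y1 = samples[j]
            # does the chord (i, j) stay within eps of all samples between?
            if all(abs(y0 + (y1 - y0) * (x - x0) / (x1 - x0) - y) <= eps
                   for x, y in samples[i + 1:j]):
                break
            j -= 1
        breakpoints.append(samples[j])
        i = j
    return breakpoints  # store these instead of all n samples
```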
Hypermedia applications use multimedia objects that are linked to each other via some application logic. Hypermedia applications can also become very storage intensive because of the nature of their data. Hierarchical storage systems are an excellent way to store large hypermedia applications. In this paper we look at the problem of placing multimedia objects in a hierarchical storage system so that their retrieval meets the needs of the application while at the same time keeping storage-intensive data on cheaper storage. We use an abstract hierarchy approach, along with an enhancement of the relationship management methodology, to propose a framework for the design of hypermedia applications. The abstract hierarchy consists of a hierarchy of abstracts growing in size and linked together. The intrinsic nature of the abstract hierarchy makes it possible to map it easily to both the hierarchical storage system and the hypermedia application paradigm. In this framework, the designer can determine where each abstract of the application should be placed on the storage system.
The use of digital techniques for the production, manipulation and storage of images has resulted in the creation of digital image libraries. These libraries often store many thousands of images. While provision of storage media for such large amounts of data has been straightforward, provision of effective searching and retrieval tools has not. Medicine relies heavily on images as a diagnostic tool. The most obvious example is the x-ray, but many other image forms are in everyday use. Advances in technology are affecting the ways medical images are generated, stored and retrieved. This paper describes the work of the Image Coding and Segmentation to Support Variable Rate Transmission Channels and Variable Resolution Platforms (ICoS) research project currently under way in Bristol, UK. ICoS is a joint project of the University of the West of England and Hewlett-Packard Research Laboratories Europe. Funding is provided by the Engineering and Physical Sciences Research Council. The aim of the ICoS project is to demonstrate the practical application of computer networking to medical image libraries. Work at the University of the West of England concentrates on user interface and indexing issues. Metadata is used to organize the images, coded using the WWW Consortium standard Resource Description Framework. We are investigating the application of such standards to medical images, one outcome being the implementation of a metadata-based image library. This paper describes the ICoS project in detail and discusses both the metadata system and user interfaces in the context of medical applications.
Compression and caching are two important issues for a large on-line image server. In this paper, we propose a new approach to compression by exploring similarity in large image archives. An adaptive vector quantization approach using content categorizations, including both the semantic level and the feature level, is developed to provide a differential compression scheme. We show that this scheme is able to support flexible and optimal caching strategies. The experimental results demonstrate that the proposed technique can improve the compression rate by about 20 percent compared to JPEG compression, and can improve the retrieval response by 5 percent to 20 percent under different typical access scenarios.
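The categorized adaptive VQ is only outlined in the abstract; a minimal sketch of the underlying mechanism, a per-category vector quantization codebook trained with k-means over image blocks (the block size, codebook size, and training setup are assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

def image_blocks(img, block=4):
    """Flatten a grayscale image into non-overlapping block vectors."""
    h, w = img.shape
    v = img[:h - h % block, :w - w % block]
    v = v.reshape(h // block, block, w // block, block)
    return v.transpose(0, 2, 1, 3).reshape(-1, block * block).astype(float)

def train_codebook(category_images, block=4, codewords=256):
    """One VQ codebook per content category, trained on that category."""
    data = np.vstack([image_blocks(img, block) for img in category_images])
    return KMeans(n_clusters=codewords, n_init=4).fit(data).cluster_centers_

def encode(img, codebook, block=4):
    """Nearest-codeword index per block: one byte when codewords <= 256."""
    v = image_blocks(img, block)
    d = ((v[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)
```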
Lectures and similar presentations are increasingly available in digital format as tools for presenting and recording become feature-rich and readily available. We discuss extensions to a representative of such tools that facilitate better retrieval functionality while keeping the system simple and the instructor focused on the presented material, rather than on adding the necessary meta-information by hand.
Considerations on effective lossless coding of non-smooth images are presented in this paper. Rather than introducing a completely new concept, we focus on selecting the best coding algorithms, among those that are not time consuming, for a class of medical images. As references we consider the highly efficient CALIC method, the new lossless standard JPEG-LS, and the BTPC algorithm. Different methods of image scanning and 1D encoding are tested. Simple raster-scan data ordering followed by n-order arithmetic coding gives significant encoding efficiency for ultrasound images, considered here as representatives of the non-smooth image class. Lower bit rates can be achieved by additional statistical modeling in the arithmetic coder, based on a 12th-order context quantized to a first-order context. The number of states in the conditional probability model is thereby reduced to overcome the dilution problem. As a result, improved compression efficiency for non-smooth images in comparison to the state-of-the-art CALIC algorithm is achieved: the average bit rate is reduced by over 30 percent. To compress smooth images, a linear prediction scheme is incorporated for overall data redundancy reduction. The same model, based on a linear combination of adjacent pixels, is used in both the prediction and entropy encoding steps. For smooth images our method's performance is comparable to JPEG-LS and slightly worse than CALIC.
Multimedia database systems are becoming increasingly important as organizations accumulate more multimedia data. There are few solutions that allow this information to be stored and managed efficiently. Relational systems provide features that organizations rely on for their alphanumeric data. Unfortunately, these systems lack facilities necessary for handling multimedia data: media integration, composition and presentation, multimedia interfaces and interactivity, imprecise query support, and multimedia indexing. One solution suggested for the storage of multimedia data is the use of an object-oriented database management system as a layer on top of the relational system. The layer adds the required multimedia functionality to the capabilities provided by the relational system. A prototype solution implemented in Java uses the facilities offered by JDBC to provide connections to a large number of databases. The Java Media Framework is used to present the video and audio data. Among the facilities provided are image/video/audio display and playback, and an extension of SQL to include multimedia operators and functions.
In this paper we present a novel content-based search application for petroleum exploration and production. The target application is the specification of, and search for, geologically significant features extracted from 2D imagery acquired from oil well bores, in conjunction with 1D parameter traces. The PetroSPIRE system permits a user to define rock strata using image examples in conjunction with parameter constraints. Similarity retrieval is based on multimodal search, and relies on texture-matching techniques using pre-extracted texture features, employing high-dimensional indexing and nearest-neighbor search. Special-purpose visualization techniques allow a user to evaluate object definitions, which can then be iteratively refined by supplying multiple positive and negative image examples as well as multiple parameter constraints. Higher-level semantic constructs can be created from simpler entities by specifying sets of inter-object constraints. A delta-lobe riverbed, for example, might be specified as a layer of siltstone which is above and within 10 feet of a layer of sandstone, with an intervening layer of shale. These 'compound objects', along with simple objects, form a library of searchable entities that can be used in an operational setting. Both object definition and search are accomplished using a web-based Java client, supporting image and parameter browsing, drag-and-drop query specification, and thumbnail viewing of query results. Initial results from this search engine have been deemed encouraging by oil-industry E&P researchers. A more ambitious pilot is under way to evaluate the efficacy of this approach on a large database from a North Sea drilling site.
As human factors studies over the last thirty years have shown, response time is a very important factor in the usability of an interactive system, especially on the World Wide Web. In particular, response times of under one second are often specified as a usability requirement. This paper compares several methods for improving the evaluation time in a content-based image retrieval system which uses inverted file technology. The use of inverted file technology facilitates search pruning in a variety of ways, as is shown in this paper. For large databases and a high number of possible features, efficient and fast access is necessary to allow interactive querying and browsing. Parallel access to the inverted file can reduce the response time. This parallel access is very easy to implement with little communication overhead, and thus scales well. Other search pruning methods, similar to those used in information retrieval, can also reduce the response time significantly without reducing the performance of the system. The performance of the system is evaluated using precision vs. recall graphs, an established evaluation method in information retrieval. A user survey was carried out in order to obtain relevance judgments for the queries reported in this work.
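A minimal sketch of query evaluation through an inverted file, with one simple pruning strategy (evaluating only the strongest query features); the data layout and the pruning rule are assumptions for illustration:

```python
from collections import defaultdict

def score_images(query_features, inverted_file, prune_after=None):
    """Rank images by accumulating weights from an inverted file.

    inverted_file: {feature_id: [(image_id, weight), ...]}
    query_features: [(feature_id, query_weight), ...], sorted by
        decreasing query_weight so pruning keeps the strongest features.
    prune_after: evaluate only the first k query features, or all if None.
    """
    scores = defaultdict(float)
    for k, (fid, qw) in enumerate(query_features):
        if prune_after is not None and k >= prune_after:
            break  # weak features change the final ranking little
        for image_id, w in inverted_file.get(fid, ()):
            scores[image_id] += qw * w
    return sorted(scores.items(), key=lambda kv: -kv[1])
```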
This paper presents a new scheme to speed up fractal image compression based on HV partitioning. In this scheme, we propose a new approach based on block characteristics that speeds up the search for appropriate domain blocks. In this technique we first extract the range block's characteristics; then, when searching the domain block pool, only domain blocks whose characteristics fit those of the range block undergo the matching process. The experimental results show that with this technique we can improve the encoding speed by more than one hundred times compared with the original algorithm.