Semantic video object identification and extraction is an important component of content-based multimedia applications such as editing, coding, and retrieval. A smart interactive video object generation (SIVOG) system based on adaptive processing and semantic user interaction was developed in our previous work. In this work, SIVOG is further improved to process video content efficiently based on the spatial and temporal characteristics of the semantic object. The enhanced SIVOG system adaptively selects processing regions based on the object shape, and temporal skipping and interpolation procedures are applied to objects with slow motion activity. The system can extract simple semantic objects in real time with pixel-wise accuracy. Fast, accurate, and consistent results are obtained when the system is evaluated on several MPEG-4 test sequences.
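The temporal skipping idea described above can be illustrated with a minimal Python sketch: an expensive per-frame segmentation routine is invoked only when a cheap frame-difference activity test suggests the object has moved; otherwise the previous mask is reused. The `segment` callable, the intensity threshold, and `activity_thresh` are all illustrative stand-ins, not the paper's actual procedure.

```python
import numpy as np

def segment_with_skipping(frames, segment, activity_thresh=0.01):
    """Run the (expensive) `segment` routine only when motion activity is
    detected; for slow-motion frames, reuse the previous object mask.
    A simplified sketch of temporal skipping, not the SIVOG implementation."""
    masks = [segment(frames[0])]
    for t in range(1, len(frames)):
        # Cheap activity measure: fraction of pixels whose intensity changed.
        changed = (np.abs(frames[t].astype(int)
                          - frames[t - 1].astype(int)) > 10).mean()
        if changed < activity_thresh:
            masks.append(masks[-1])           # slow motion: skip, reuse mask
        else:
            masks.append(segment(frames[t]))  # activity detected: re-segment
    return masks
```

On a static scene, only the first frame triggers full segmentation; every subsequent frame inherits its mask at negligible cost.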
A new smart interactive video object generation (SIVOG) system targeting general semantic video object segmentation with pixel-wise accuracy is proposed. SIVOG identifies and extracts semantic video objects from an image sequence with user interaction. The system consists of several basic components: semantic-level user interaction, a smart processing kernel, object tracking, boundary update, and error correction. It allows the user to enter semantic information with a minimal amount of mouse clicking and movement. The user-input semantic information is then analyzed and interpreted in terms of low-level features. Finally, the user can correct erroneous regions during the segmentation process. The proposed system is evaluated on several typical MPEG-4 test sequences.
A fast and robust video segmentation technique is proposed in this work to generate a coding-optimized binary object mask. The algorithm exploits color information in the L*u*v* space and combines it with motion information to separate moving objects from the background. A non-parametric gradient-based iterative color clustering algorithm, called the mean shift algorithm, is first employed to provide robust homogeneous color regions according to dominant colors. Next, moving regions are identified by a motion detection method, which is developed based on the frame intensity difference to circumvent the complexity of motion estimation over the whole frame. Only moving regions are analyzed with a region-based affine motion model and tracked to increase the temporal and spatial consistency of extracted objects. The final shape is optimized for MPEG-4 coding efficiency by using a variable-bandwidth region boundary. The shape coding efficiency can be improved by up to 30% with negligible loss of perceptual quality. The proposed system is evaluated on several typical MPEG-4 test sequences and provides consistent and accurate object boundaries throughout each test sequence.
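The frame-difference motion detection step mentioned above can be sketched as follows: threshold the absolute intensity difference between consecutive frames, then apply a 3x3 majority vote to suppress isolated noise pixels. The threshold values and the noise-suppression scheme are illustrative assumptions, not the parameters of the paper's method.

```python
import numpy as np

def moving_region_mask(prev_frame, curr_frame, threshold=15):
    """Flag moving pixels from the intensity difference of two consecutive
    frames, avoiding full-frame motion estimation. A simplified sketch."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    raw = diff > threshold
    # 3x3 majority vote: keep a pixel only if at least 5 of the 9 pixels
    # in its neighborhood (including itself) changed as well.
    padded = np.pad(raw, 1)
    h, w = raw.shape
    votes = sum(padded[i:i + h, j:j + w] for i in range(3) for j in range(3))
    return votes >= 5
```

Only the regions flagged by such a mask would then be passed to the more expensive affine motion analysis.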
Video object segmentation is an important component of object-based video coding schemes such as MPEG-4. A fast and robust video segmentation technique, which aims at efficient foreground and background separation via an effective combination of motion and color information, is proposed in this work. First, a non-parametric gradient-based iterative color clustering algorithm, called the mean shift algorithm, is employed to provide robust dominant color regions according to color similarity. By using the dominant color information from previous frames as the initial guess for the next frame, the computation time can be reduced by 50%. Next, moving regions are identified by a motion detection method, which is developed based on the frame intensity difference to circumvent the complexity of motion estimation over the whole frame. Only moving regions are further merged or split according to a region-based affine motion model. Furthermore, the sizes, colors, and motion information of homogeneous regions are tracked to increase the temporal and spatial consistency of extracted objects. The proposed system is evaluated on several typical MPEG-4 test sequences and provides very consistent and accurate object boundaries throughout each test sequence.
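The mean shift clustering step used in both works above can be illustrated with a minimal flat-kernel sketch: each color sample is iteratively moved to the mean of all samples within a fixed bandwidth, so samples converge to the modes (dominant colors) of the color distribution. The bandwidth, kernel choice, and convergence criterion below are illustrative; the paper's actual implementation and its L*u*v* conversion are not shown.

```python
import numpy as np

def mean_shift_modes(pixels, bandwidth=8.0, n_iter=20, tol=1e-3):
    """Gradient-ascent mean shift with a flat kernel: shift every sample
    toward the local mean until convergence. `pixels` is an (N, 3) array
    of color samples (e.g. in L*u*v*). A simplified sketch."""
    modes = pixels.astype(float).copy()
    for _ in range(n_iter):
        shifted = np.empty_like(modes)
        for i, m in enumerate(modes):
            # Flat-kernel window: average all samples within `bandwidth` of m.
            dist = np.linalg.norm(pixels - m, axis=1)
            shifted[i] = pixels[dist < bandwidth].mean(axis=0)
        if np.linalg.norm(shifted - modes) < tol:
            modes = shifted
            break
        modes = shifted
    return modes
```

Seeding `modes` with the dominant colors found in the previous frame, rather than with the raw pixels, is the kind of warm start the abstract credits with the roughly 50% reduction in computation time.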
A digital color image quality metric is proposed in this work based on the characteristics of the human visual system. Chromatic coordinates are transformed from spectral cone absorption responses to the opponent-color space. The sensitivity thresholds along each axis of the color space are measured, and visual masking models are provided and parameterized. Multiple contrasts are computed from the wavelet coefficients at their corresponding resolutions. The new objective error measure is defined as the aggregate contrast mismatch between the original and compressed images. Experimental results are given to show its consistency with human observation experience and subjective rankings.
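The aggregate contrast mismatch idea can be sketched in a heavily simplified form: decompose both images with a few levels of the Haar wavelet and sum the mean absolute mismatch of the detail bands at each resolution. This sketch omits the opponent-color transform, the per-band sensitivity thresholds, and the masking models that the actual metric includes; all function names and the choice of the Haar wavelet are illustrative.

```python
import numpy as np

def haar_level(img):
    """One level of the 2-D Haar transform: returns the approximation LL
    and the detail bands (LH, HL, HH)."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    ll = (a + b + c + d) / 4
    lh = (a + b - c - d) / 4
    hl = (a - b + c - d) / 4
    hh = (a - b - c + d) / 4
    return ll, (lh, hl, hh)

def contrast_mismatch(original, compressed, levels=3):
    """Aggregate mismatch of band-wise detail coefficients across
    resolutions: 0 for identical images, growing with visible error.
    A simplified stand-in for the perceptual metric described above."""
    o, c = original.astype(float), compressed.astype(float)
    total = 0.0
    for _ in range(levels):
        o, o_bands = haar_level(o)
        c, c_bands = haar_level(c)
        for ob, cb in zip(o_bands, c_bands):
            total += np.abs(ob - cb).mean()
    return total
```

In the full metric each band's mismatch would first be converted to a perceptual contrast and weighted by the measured sensitivity thresholds before aggregation.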