KEYWORDS: Wavelets, Transform theory, Wavelet transforms, Video coding, Video, 3D video compression, Image processing, Video compression, 3D image processing, Wavelet packet decomposition
Three-dimensional (t+2D) wavelet coding schemes have been demonstrated to be efficient techniques for video
compression applications. However, the separable wavelet transform used to remove spatial redundancy allows only a limited representation of 2D textures because of the spatial isotropy of the wavelet basis functions. In this case, anisotropic transforms, such as the fully separable wavelet transform (FSWT), offer a solution for spatial decorrelation. FSWT inherits the separability, the computational simplicity and the filter-bank structure of the standard 2D wavelet transform, while improving the representation of directional textures, such as those found in the temporal detail frames of t+2D decompositions. Extending both the classical wavelet and wavelet-packet transforms to fully separable decompositions preserves their low complexity as well as their best-basis selection algorithms. We apply these transforms in t+2D video coding schemes and compare them with classical decompositions.
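As an illustration of the idea (not the authors' implementation), the sketch below applies a fully separable 2D transform by running a complete multilevel 1D DWT along the rows and then again along the columns, using PyWavelets. The wavelet name, decomposition level and the side-by-side coefficient layout are illustrative assumptions.

```python
import numpy as np
import pywt

def fswt2(frame, wavelet="db2", level=2):
    """Fully separable 2D wavelet transform sketch (illustrative, not the paper's code).

    A complete multilevel 1D DWT is applied along every row, and the resulting
    coefficient plane is then decomposed again along every column. This differs
    from the standard 2D DWT, which alternates one row step and one column step
    per level, and it yields the anisotropic (rectangular) subbands of the FSWT.
    """
    # Full 1D decomposition along the rows (axis=1); subbands laid side by side.
    row_coeffs = pywt.wavedec(frame, wavelet, level=level, axis=1)
    row_plane = np.concatenate(row_coeffs, axis=1)
    # Full 1D decomposition of that plane along the columns (axis=0).
    col_coeffs = pywt.wavedec(row_plane, wavelet, level=level, axis=0)
    return np.concatenate(col_coeffs, axis=0)

# Example: transform one temporal detail frame of a t+2D decomposition.
detail_frame = np.random.rand(64, 64)
coeff_plane = fswt2(detail_frame)
```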
KEYWORDS: Wavelets, Video, Video coding, Wavelet packet decomposition, Image compression, Databases, Video processing, 3D video compression, Video compression, Nonlinear filtering
Wavelet packets provide a flexible representation of data, which has proved very useful in many signal, image and video processing applications. In particular, in image and video coding, their ability to capture the features of the input content can be exploited by designing appropriate optimization criteria. In this paper, we introduce joint wavelet packets for groups of frames, which provide a single best-basis representation for several frames rather than one basis per frame, as is classically done. Two main advantages are expected from this joint representation. On the one hand, bitrate is spared, since a single tree description is sent instead of 31 per group of frames (GOP), when a GOP contains, for example, 32 frames. On the other hand, this common description characterizes the spatio-temporal features of the given video GOP and can therefore be exploited as a valuable feature for video classification and video database searching. A second contribution of the paper is to provide insight into the modifications required in the best basis algorithm (BBA) to cope with biorthogonal decompositions. A computationally efficient algorithm is deduced for an entropy-based criterion.
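A minimal sketch of joint best-basis selection over a GOP is given below, assuming PyWavelets, a Coifman-Wickerhauser-style additive entropy cost, and the summation of per-frame node costs as the joint criterion. The function names, the filter choice and the cost definition are illustrative assumptions; the paper's specific corrections of the BBA for biorthogonal decompositions are not reproduced here.

```python
import numpy as np
import pywt

def entropy_cost(coeffs, eps=1e-12):
    # One standard additive entropy cost over a coefficient block (illustrative choice).
    e = coeffs.ravel().astype(float) ** 2
    return float(-np.sum(e * np.log(e + eps)))

def joint_best_basis(frames, wavelet="bior4.4", maxlevel=3):
    """Select a single wavelet-packet basis shared by all frames of a GOP.

    One wavelet-packet tree is built per frame; the costs of corresponding
    nodes are summed across the GOP, and the usual bottom-up pruning keeps a
    split only if the children's total cost is lower than the parent's.
    Returns the list of leaf paths describing the common best basis.
    """
    wps = [pywt.WaveletPacket2D(f, wavelet, maxlevel=maxlevel) for f in frames]

    def node_data(wp, path):
        # The root node ('' path) holds the original frame.
        return wp.data if path == "" else wp[path].data

    def joint_cost(path):
        return sum(entropy_cost(node_data(wp, path)) for wp in wps)

    def best(path, level):
        node_cost = joint_cost(path)
        if level == maxlevel:
            return node_cost, [path]
        child_cost, child_leaves = 0.0, []
        for q in ("a", "h", "v", "d"):  # 2D packet children: approx, horiz, vert, diag
            c, leaves = best(path + q, level + 1)
            child_cost += c
            child_leaves += leaves
        if child_cost < node_cost:
            return child_cost, child_leaves
        return node_cost, [path]

    _, leaves = best("", 0)
    return leaves  # single tree description sent once for the whole GOP

# Example: joint basis for a hypothetical GOP of 8 frames.
gop = [np.random.rand(64, 64) for _ in range(8)]
shared_basis = joint_best_basis(gop)
```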