Image and Signal Processing Methods

Lossy compression of noisy remote sensing images with prediction of optimal operation point existence and parameters

Author Affiliations
Alexander N. Zemliachenko, Sergey K. Abramov, Vladimir V. Lukin

National Aerospace University, Kharkov 61070, Ukraine

Benoit Vozel, Kacem Chehdi

University of Rennes 1, IETR UMR CNRS 6164, Lannion Cedex 22305, France

J. Appl. Remote Sens. 9(1), 095066 (Aug 11, 2015). doi:10.1117/1.JRS.9.095066
History: Received November 23, 2014; Accepted July 10, 2015

Open Access

Abstract.  We address lossy compression of noisy remote sensing images, where the noise is supposed to be spatially uncorrelated (white) and additive, either originally or after a proper variance-stabilizing transformation (VST). In such situations, the so-called optimal operation point (OOP) might exist. The OOP is associated with the value of the parameter that controls compression (e.g., the quantization step) for which the compressed image is the closest to the noise-free image according to a certain criterion and is closer than the original noisy image. Lossy compression in the neighborhood of OOP, if it exists, relates to an essential noise filtering effect and some distortions. Then such lossy compression (in the neighborhood of OOP) becomes expedient. However, it may happen that OOP does not exist for a given image and the observed noise intensity. In such a situation, it can be reasonable to carry out a more "careful" image compression (with a lower compression ratio). Also, it is expedient to predict the existence of OOP and the compression parameters at this point in advance in order to perform adaptive and automated compression. It is shown that OOP existence can be predicted for some coders based on the discrete cosine transform (DCT). The proposed prediction procedure is simple and fast. It presumes the calculation of DCT coefficient statistics in nonoverlapping 8×8-pixel blocks for a given image and uses an approximating curve obtained in advance. It is shown that it is possible to predict values of both conventional metrics, such as mean square error or peak signal-to-noise ratio, and some visual quality metrics for the coder parameters that correspond to a possible OOP. The designed prediction procedure is tested on Hyperion and AVIRIS hyperspectral remote sensing data.

Today, Earth monitoring systems find more and more applications.1,2 Most modern remote sensing (RS) systems are multichannel (e.g., full-polarization, multispectral or hyperspectral). On one hand, the multichannel character of the provided RS data increases their usefulness and information content. On the other hand, the amount of acquired data for multichannel sensors is large or even huge3 due to several factors. The first is the aforementioned number of sub-bands (channels). The second is the good spatial resolution of existing RS systems and/or a rather large area of imaged terrains.

For these reasons, such RS data often have to be compressed before downlink transfer and archiving in on-land centers of data processing or dissemination.3–6 Note that there are many data compression techniques belonging to two major groups: lossless and lossy compression. The best lossless compression techniques provide a compression ratio (CR) of up to 4.5, and this happens only for hyperspectral images under the condition that spatial and spectral data redundancies are exploited to a full extent.3–6 However, a desired CR might be larger,3,7 and then one has to apply lossy compression methods.

For lossy compression of hyperspectral data, strategies and requirement priorities can be different.3 According to one strategy, the first-order task is to provide a CR not less than the desired one; compressed image quality is then the second-order task. This paper concentrates on a different strategy associated with another priority order of requirements. It is assumed that the first-order task is to have such "acceptable" losses that they do not result in hyperspectral data degradation,7–9 while it is also desired to provide a higher CR (the second-order task).

A common assumption in the design and testing of lossy compression techniques is that the images subject to compression are noise free.10,11 Quantitative criteria such as the mean square error (MSE) or peak signal-to-noise ratio (PSNR) that characterize compression are then calculated accordingly, i.e., for compressed and original images. However, RS and other images are often noisy; in fact, all real-life images are noisy, but the noise can be invisible or visible under visual inspection of a considered image.12–14 For hyperspectral data, noise is visible or, at least, can be noticed in about 20% of the sub-band images.12,13 The noisiest sub-bands are sometimes removed from consideration, but some of these sub-bands contain useful information which is worth preserving and retrieving.14

The noise presence in conventional and RS images is often not taken into account in lossy compression. Al-Shaykh and Mersereau15 were the first to demonstrate the peculiarities of noisy image lossy compression and to stress the necessity of taking noise and its properties into account.

First, Al-Shaykh and Mersereau15 outlined the phenomenon of image denoising (filtering effect) due to lossy compression. Certainly, this denoising is specific and is less efficient compared to the best image denoising techniques,16 but the partial effect of noise filtering can be regarded as a favorable phenomenon. In particular, it can lead to improving hyperspectral RS data classification accuracy.8,9

Second, it has been stressed in Ref. 15 that it is worth carrying out an analysis of noisy image lossy compression using untraditional quantitative criteria (or criteria determined in an unconventional way). While performing simulations [when one has true (noise-free), noisy, and compressed noisy images], it is worth analyzing metrics calculated for compressed and true images. Such metrics may have values that are better than the corresponding metric calculated for noisy (uncompressed or losslessly compressed) and true images. In other words, a compressed image under certain conditions15–19 can be better (closer to the noise-free image according to a considered metric) than the noisy image. This can take place for different techniques of lossy image compression, e.g., based on the discrete cosine transform (DCT)17,18 and wavelets,19 for different noise types,15,17,19 and for different metrics (criteria).9,20

If a compressed image can be closer to the true image than the noisy one, then the dependence of a used metric on a parameter that controls compression (PCC) can have an extreme [a minimum in the case of, e.g., MSE, or a maximum in the case of PSNR]. This extreme in a wide sense is called the optimal operation point (OOP). In a narrow sense, OOP is associated with the PCC value that provides the extreme; this can be, for example, the quantization step (QS), scaling factor (SF), or bits per pixel (bpp), depending on the applied coder.15,17,19

Lossy compression of noisy images in OOP has certain advantages. On one hand, it is possible to expect that the provided CR is considerably larger than for lossless compression. On the other hand, RS data compressed in OOP can be classified better than the original data or data compressed in a lossless manner.8,9 There are certain problems in demonstrating that OOP exists when compressing noisy images in general and hyperspectral images in particular. First, true images are not available in practice. This does not allow considering the dependences that determine OOP. Fortunately, there are methods that allow rough estimation of PCC at OOP and provide compression in the OOP neighborhood.17,18,20 For example, for DCT coders, this can be done by setting the QS or SF proportional to the noise standard deviation with a properly chosen proportionality factor.18,20 Second, noise in hyperspectral images is rather complicated.21–23 One often has to take into account the signal-dependent noise component22–24 and its varying intensity. Fortunately, there are several ways to cope with this, including blind estimation of noise characteristics,23,24 data normalization in sub-bands before compression,25 and application of VSTs.22,26 Third, OOP does not always exist;15,20 in fact, it exists if the noise is rather intensive and the image is not textural (complex). This fact has been ignored earlier in the methods described in the literature,21,25–27 for which the OOP existence has been supposed for any sub-band image. Such compression can introduce losses of an undesired level in compressed data.

Our idea is the following.28,29 If OOP exists for a given (sub-band) image, it is worth carrying out compression in an OOP neighborhood. If OOP does not exist, it is worth performing a more careful compression (introducing fewer losses). To realize this idea, it is necessary to predict the OOP existence. Such a prediction can be helpful in practice if it is fast and accurate (reliable).

Our recent paper29 demonstrated that there is an approach that works rather well in predicting OOP existence and the corresponding compression parameters for lossy noisy image compression using four coders. There are several aspects not considered in this paper29 and the obtained results have some shortcomings. First, a limited number of test images (12) and noise standard deviation values (3) have been used to obtain prediction dependences by curve fitting. Second, fitting has not been considered in detail. Third, the approach has not been tested on real-life RS data. Thus, the present paper can be regarded as an extended version of Ref. 29, where the aforementioned aspects are paid special attention. Its main novelty consists in a specific adaptation of PCC to image/noise properties in hyperspectral data sub-bands. Another contribution is in method verification for real hyperspectral data.

The paper is organized as follows: Sec. 2 revisits the used image-noise model and quantitative criteria (metrics) of lossy compression applied componentwise (or to a single-channel image). OOP and filtering effects at this point are discussed. This section also briefly describes the considered DCT-based coders. Section 3 shows the obtained numerical simulation results for lossy compression of test noisy images. Aspects of curve fitting are discussed and improvements (compared to Ref. 29) are shown. Comparison of coder performance is done. Section 4 contains practical details for componentwise noisy image compression. Results for hyperspectral images after VST are presented. Then the conclusions and directions of future research are given in Sec. 5.

Our further research is based on several assumptions. First, we concentrate on considering lossy compression of single-channel images or componentwise compression of multichannel (hyperspectral) data. Clearly, joint [three-dimensional (3-D)] compression of hyperspectral data is able to provide a better CR,5,7,21,27 but it is necessary to understand from the beginning how to set the PCC for each (sub-band) image. Second, a simple case of additive white Gaussian noise is considered. Here, one has to keep in mind that if noise is signal dependent (Poissonian; additive and Poissonian; additive and multiplicative; etc.), there is a VST able to convert such noise to additive noise.22,27,30 This is possible if the parameters of signal-dependent noise are estimated with appropriate accuracy by modern methods.23,24 Third, major attention is paid to DCT-based coders, for which it is simpler to provide compression in an OOP neighborhood17,18,20,21 and in whose design and performance analysis we are more experienced. Finally, we will focus on analyzing two metrics: the conventional PSNR and the visual quality metric PSNR-HVS-M [PSNR adapted to the human visual system (HVS) with masking (M)].31 It might seem that these metrics (especially the latter one) do not have a direct relation to RS data processing. However, one has to take into account the following. On one hand, the criteria used in RS data compression are application oriented.8,32 On the other hand, metrics exploited in compression of conventional images are also used in lossy compression of hyperspectral images, and this relates to HVS metrics as well.33 One reason is that HVS metrics highly correlate with such important properties as edge, detail, and texture feature preservation by lossy compression techniques.

Then the noisy image model in the n'th sub-band image is

$$I_{ij}^{\text{noisy}}(n) = I_{ij}^{\text{true}}(n) + \eta_{ij}(n), \tag{1}$$

where ij denotes image indices in the spatial domain, ηij(n) is zero-mean additive white Gaussian noise (AWGN) with variance equal to σ0²(n), and Iijtrue(n) denotes the true image value in the ij'th pixel of the n'th sub-band image; i = 1, …, IIm; j = 1, …, JIm, where IIm and JIm determine the image size. As has been noted above, even if Eq. (1) is not valid for original images, it becomes adequate after the corresponding VSTs. Besides, let us start by considering single-channel images, supposing the componentwise processing of multichannel images. Hence, let us omit the index n in this and the next sections.
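As a minimal sketch of the observation model in Eq. (1), the following snippet simulates a noisy sub-band image; the image size, constant intensity, and seed are arbitrary choices for illustration:

```python
import numpy as np

def add_awgn(true_img, sigma0, seed=0):
    """Model (1): I_noisy = I_true + eta, where eta is zero-mean white
    Gaussian noise with variance sigma0**2 (spatially uncorrelated)."""
    rng = np.random.default_rng(seed)
    return true_img + rng.normal(0.0, sigma0, size=true_img.shape)

# toy flat "true" sub-band image of size I_Im x J_Im = 64 x 64
true_img = np.full((64, 64), 128.0)
noisy = add_awgn(true_img, sigma0=10.0)

# the empirical variance of the residual should be close to sigma0**2 = 100
print(float(np.var(noisy - true_img)))
```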

In fact, only the noisy image Iijnoisy, i = 1, …, IIm; j = 1, …, JIm, is available in practice. Usually, for analysis purposes, one has a true image Iijtrue, i = 1, …, IIm; j = 1, …, JIm, and noise is generated and added artificially. After applying a considered lossy compression technique to the noisy image, a compressed image is obtained, which will be further denoted as Iijc, i = 1, …, IIm; j = 1, …, JIm.

Then, having all three images, it is possible to calculate any metric Metr for any pair of them. Recall that only a metric Metrnc, for the pair of images {Iijnoisy} and {Iijc}, can be calculated in practice. But in simulations, it is also possible to calculate a metric Metrtc for the images {Iijtrue} and {Iijc}. For any metric Metrnc, its dependence on CR is clear: it becomes worse for any coder if CR increases (bpp reduces; QS or SF increases). The reason is that the introduced distortions characterized by Metrnc increase as CR becomes larger.

The behavior of a metric Metrtc can be specific.15,20 It might happen that the metric Metrtc has an extreme (optimum) for a certain PCC of a used coder. Then one might have bppOOP, QSOOP, or SFOOP. To get some idea of the dependences and OOP that might be observed, let us present several typical dependences. For this purpose, consider three test images, namely, Airfield, Aerial, and Frisco [see Figs. 1(a)–1(c), respectively]. These images differ substantially from each other in complexity, where we associate complexity with the percentage of pixels that belong to heterogeneous image regions.

Fig. 1: The noise-free test images (a) Airfield, (b) Aerial, and (c) Frisco; (d) noisy and (e) compressed image Frisco.

The AWGN has been added to these three images with two variance values, 100 and 200, and they have been compressed by the coder AGU.34 This coder uses DCT in 32×32-pixel blocks and deblocking after decompression. Recall that QS serves as the PCC for this coder. The behavior is characterized by two quality metrics: the conventional PSNRtc and PSNR-HVS-Mtc, both determined between the true and compressed images. Both quality metrics are expressed in dB, and larger values relate to better quality. The obtained dependences are presented in Fig. 2.

Fig. 2: Dependences (a) PSNRtc(QS) and (b) PSNR-HVS-Mtc(QS) for three test images and two noise variances.

As is seen [Fig. 2(b)], four out of six curves PSNR-HVS-Mtc(QS) are monotonically decreasing. According to this metric, the quality of compressed images steadily reduces as QS increases and, respectively, CR increases as well. This means that for these test images and noise intensities, OOP does not exist, and a trade-off between the reduction of compressed image quality and the reached CR should be found. Meanwhile, there are also two dependences in Fig. 2(b) that have maxima (both are observed for the test image Frisco, which has a simple structure).

The curves PSNRtc(QS) in Fig. 2(a) are somewhat specific. Five out of six curves have OOPs according to the definition, i.e., the extreme metric value is better than for the uncompressed (original, noisy) image, whose value is practically the same as that for QS = 1. One dependence has a local maximum in the neighborhood of QS about 40 (the image Airfield, σ0² = 100, QS ≈ 4.0σ0). Formally, this is not an OOP since PSNR(QS = 1) is larger than PSNR for QS ≈ 4.0σ0.

Analysis of the dependences PSNRtc(QS) and PSNR-HVS-Mtc(QS) presented in Fig. 2 shows the following. If OOP is observed, it happens for QSOOP ≈ (3.5–4.0)σ0. The same observations can be found for the coder AGU applied to other images and noise variance values in Ref. 20. The positions (arguments) of OOP slightly differ depending upon the image and noise variance, but QSOOP ≈ (3.5–4.0)σ0 can be considered the recommended value to attain OOP (if it exists) for the AGU coder.20 This means that compression in OOP or its close neighborhood can be provided without having a noise-free image if σ0 is known in advance or accurately pre-estimated.

It has already been mentioned that OOP appears due to the filtering effect provided by lossy compression applied to noisy images. Let us explain why this happens and give one illustration. To begin, recall how DCT-based filtering is carried out.35,36 The basic operations are the direct DCT in blocks, DCT coefficient thresholding (assigning them zero values if their magnitudes are smaller than a set hard threshold), and the inverse DCT. An additional operation is the averaging of filtered values for overlapping blocks. Noise removal is achieved due to the thresholding of DCT coefficients with small magnitudes, which are supposed to relate to noise.

DCT-based lossy compression14,34,37 has similar operations at the initial processing stages. Direct DCT in blocks is carried out. Then DCT coefficient quantization is applied. For DCT coefficients with small magnitudes, this quantization (which leads to zero values after performing it) is very similar to hard thresholding and this explains the filtering effect for lossy compression. The quantization of larger magnitude DCT coefficients introduces distortions; besides, blocks used in lossy compression are usually nonoverlapping. These two factors explain why the filtering effect caused by lossy compression is less efficient than the corresponding orthogonal transform based denoising. The explanation for wavelet-based compression and denoising is almost the same. OOP exists if the positive effect of noise removal is more pronounced than the negative effect of losses introduced by quantization.
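The analogy between thresholding and quantization can be illustrated on a single 8×8 block. This is an illustrative sketch only, not a real coder: AGU works on larger blocks and additionally entropy-codes the quantized indices, and the block contents here are synthetic noise:

```python
import numpy as np

def dct_matrix(n=8):
    # orthonormal DCT-II basis matrix (C @ block @ C.T gives the 2-D DCT)
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

C = dct_matrix(8)
rng = np.random.default_rng(1)
sigma0 = 10.0
block = np.full((8, 8), 128.0) + rng.normal(0.0, sigma0, (8, 8))
coeffs = C @ block @ C.T               # direct DCT in the block

# denoising: hard thresholding of small-magnitude coefficients
thresholded = np.where(np.abs(coeffs) < 2.7 * sigma0, 0.0, coeffs)
# compression: uniform quantization with step QS = 3.5*sigma0
qs = 3.5 * sigma0
quantized = qs * np.round(coeffs / qs)

# both operations set small-magnitude (noise-related) coefficients to zero,
# while the large DC coefficient survives in both cases
print(np.count_nonzero(thresholded), np.count_nonzero(quantized))
```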

An example of the filtering effect is shown in Figs. 1(d) and 1(e). The noisy image Frisco (σ0 = 10) is presented in Fig. 1(d); the noise is clearly observed, especially in homogeneous image regions. The compressed image (AGU coder with QS = 3.5σ0 = 35, CR = 11.34) is presented in Fig. 1(e). The noise suppression effect is obvious, and the information content is largely preserved.

The dependences presented in Fig. 2 and the example given in Fig. 1 indicate that OOP might or might not exist. Thus, it is expedient to predict whether OOP exists for a given coder, metric, image, and noise variance (assumed known or pre-estimated in advance). This is quite a general task that is addressed here. Since there is a wide variety of image compression techniques and coders, our "narrower" task is to demonstrate that the prediction task can be solved for two metrics (conventional and visual quality) and several DCT-based coders with QS or SF serving as the PCC. The proposed approach, although based on DCT, could be useful for wavelet-based coders as well (this has not been checked yet).

It has been shown earlier15,17,20 that OOP might exist for different coders, in particular, for the standard JPEG, for the DCT-based AGU34 and advanced DCT (ADCT)37 coders with QS serving as the PCC, and for the wavelet-based JPEG2000 or SPIHT coders with CR controlled by bpp.38–40 OOP might also exist for coders with SF used as the PCC, and the examples are the recently proposed modifications of AGU and ADCT called AGU-M and ADCT-M.41 Below, the following four DCT-based coders are considered: AGU,34 ADCT,37 AGU-M, and ADCT-M.41 The AGU coder has been briefly described above. The ADCT coder uses DCT in blocks of different sizes obtained as the result of partition scheme optimization. The ADCT coder outperforms AGU as well as JPEG2000 and SPIHT37 (see also Ref. 42). Both AGU and ADCT employ uniform quantization of DCT coefficients. Similar to AGU, ADCT provides compression in the OOP neighborhood for QS ≈ 3.5σ0.20

The modifications of these coders, called AGU-M and ADCT-M, have only one difference with respect to their counterparts. They employ different quantization for different DCT coefficients (similar to the standard JPEG). The quantization factors are calculated as SF multiplied by the quantization table (matrix). For AGU-M, there is a fixed 32×32-pixel matrix. For ADCT-M, there are matrices for all possible sizes of the blocks obtained by the partition scheme. The coders AGU-M and ADCT-M were designed to improve compressed image visual quality. Our studies have shown that to get into an OOP neighborhood, one has to apply a fixed SFOOP equal to 2.0σ0.

Thus, recommendations on how to set QSOOP or SFOOP if OOP exists can be given. If OOP does not exist, QS or SF should be smaller. This will be analyzed in more detail later.
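These recommendations can be gathered into a small helper function. This is a sketch: the coder names and the factors 3.5 and 2.0 come from the text above, while the function itself is hypothetical:

```python
def recommended_pcc(coder, sigma0):
    """Recommended parameter that controls compression (PCC) for getting
    into an OOP neighborhood: QS ~ 3.5*sigma0 for AGU/ADCT, and
    SF ~ 2.0*sigma0 for AGU-M/ADCT-M (values taken from the text)."""
    if coder in ("AGU", "ADCT"):
        return "QS", 3.5 * sigma0
    if coder in ("AGU-M", "ADCT-M"):
        return "SF", 2.0 * sigma0
    raise ValueError("unknown coder: " + coder)

print(recommended_pcc("AGU", 10.0))     # ('QS', 35.0)
print(recommended_pcc("ADCT-M", 10.0))  # ('SF', 20.0)
```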

Let us discuss what can serve as indicators of the existence of OOP. One standard metric frequently used in lossy compression analysis is the MSE, defined as

$$\text{MSE}_{tc} = \sum_{i=1}^{I_{\text{Im}}} \sum_{j=1}^{J_{\text{Im}}} \left( I_{ij}^{\text{true}} - I_{ij}^{c} \right)^2 / \left( I_{\text{Im}} J_{\text{Im}} \right). \tag{2}$$

If OOP (according to this metric) exists, then MSEtc has to be smaller than σ0². This can also be stated in another manner: the ratio MSEtc/σ0² has to be less than unity in OOP and its neighborhood.

Another standard metric is the PSNR, which is strictly connected with MSEtc as

$$\text{PSNR}_{tc} = 10 \log_{10}\!\left[ (I_{\max} - I_{\min})^2 / \text{MSE}_{tc} \right], \tag{3}$$

where Imax and Imin are the maximal and minimal image values (the difference Imax − Imin is assumed to be fixed and equal to 255 if one deals with 8-bit images). In a similar manner, for the original (noisy) image, one has

$$\text{PSNR}_{or} = 10 \log_{10}\!\left[ (I_{\max} - I_{\min})^2 / \sigma_0^2 \right]. \tag{4}$$

Additionally, we propose using a parameter that can be called the improvement, or the difference of PSNR, defined as

$$\Delta\text{PSNR} = \text{PSNR}_{tc} - \text{PSNR}_{or} = 10 \log_{10}\!\left( \sigma_0^2 / \text{MSE}_{tc} \right). \tag{5}$$

Positive ΔPSNR (expressed in dB) in the neighborhood of a (possible) OOP indicates that OOP exists and it is expedient to carry out compression with the recommended PCC. Otherwise, if MSEtc/σ0² is larger than unity or ΔPSNR is smaller than zero, then, in our opinion, compression with a smaller CR is desired.
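In simulations, where the true image is available, Eq. (5) can be computed directly. A minimal sketch follows; the image arrays here are synthetic placeholders constructed so that the error energy is known exactly:

```python
import numpy as np

def delta_psnr(true_img, compressed_img, sigma0):
    """Eq. (5): Delta PSNR = 10*log10(sigma0**2 / MSE_tc).
    A positive value indicates that OOP exists (the compressed image is
    closer to the true one than the noisy original)."""
    mse_tc = np.mean((true_img - compressed_img) ** 2)
    return 10.0 * np.log10(sigma0 ** 2 / mse_tc)

# e.g., if compression removed half the noise energy (MSE_tc = sigma0**2 / 2),
# the improvement is 10*log10(2) ~ 3 dB
true_img = np.zeros((4, 4))
compressed = true_img + np.sqrt(50.0)   # constant error, so MSE_tc = 50
print(round(delta_psnr(true_img, compressed, sigma0=10.0), 2))  # 3.01
```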

The decision on the existence of OOP can also be based on an analysis of metrics other than MSE or PSNR. HVS-metrics have become popular.32,43,44 Therefore, let us consider the visual quality metric PSNR-HVS-M.31 This metric takes into consideration two important peculiarities of the HVS: different sensitivity to distortions in low and high spatial frequencies as well as masking effects. For characterizing the visual quality of single-channel images, PSNR-HVS-M is among the best.45 A larger PSNR-HVS-M relates to a better visual quality where the values of PSNR-HVS-M over 40 dB indicate that distortions are practically invisible46 (this property will be exploited below).

The metric PSNR-HVS-M can be determined for the pair of true and compressed images (then one obtains PSNR-HVS-Mtc) as well as for the pair of original (noisy) and compressed images (then PSNR-HVS-Mnc is obtained). The general equation is

$$\text{PSNR-HVS-M} = 10 \log_{10}\!\left[ (I_{\max} - I_{\min})^2 / \text{MSE}_{\text{HVS}} \right], \tag{6}$$

where MSEHVS denotes the MSE calculated in the DCT domain, taking into account the aforementioned peculiarities of the HVS. Similar to ΔPSNR, let us consider

$$\Delta P_{\text{HVS}} = \text{PSNR-HVS-M}_{tc} - \text{PSNR-HVS-M}_{nc}. \tag{7}$$

This parameter is expressed in dB; OOP is supposed to exist if ΔPHVS is positive.

Before starting to consider the prediction of OOP existence, we would like to note the following. Recently, approaches to predict the denoising efficiency for two DCT-based filters and the metrics ΔPSNR and ΔPHVS have been proposed and tested.47–49 For carrying out such predictions, one needs two very simple operations. First, a simple statistical parameter has to be calculated for a given image. Calculations are made in nonoverlapping 8×8-pixel blocks of a source noisy image to obtain one of two probabilities: either the probability P2σ that absolute values of DCT coefficients in blocks are less than 2σ0, or the probability P2.7σ that absolute values of DCT coefficients exceed 2.7σ0. The second operation is to calculate a predicted metric value using the available analytical dependence of this metric on P2σ or P2.7σ. Such analytical dependences can be obtained for a considered metric in advance: using a set of test images and different values of noise variance, the dependences of the considered metrics on P2σ (and on P2.7σ) are obtained by fitting curves to the corresponding scatter plots.

Keeping in mind the similarities between denoising and the lossy compression of noisy images, a similar approach has been proposed to predict the OOP existence and its compression parameters in our paper.29 The advantage of this approach is its high computational efficiency. It uses DCT in 8×8-pixel blocks, which is a standard operation in image processing.10 Moreover, the blocks are nonoverlapping, and for large images, it is enough to have 500 nonoverlapping blocks chosen randomly to estimate P2σ or P2.7σ with an appropriate accuracy.48,49 All other operations needed to calculate P2σ or P2.7σ are very simple (logical and arithmetic).

Recall the basic aspects of the prediction approach. We assume that some statistical parameter is calculated for a given image. The requirements for this parameter are the following. First, it should be simple enough and computed quickly (desirably faster than the compression itself; see the discussion at the end of Sec. 2). Second, it should be able to "jointly describe" image complexity and noise intensity. This requirement assumes that the probability of OOP existence increases for a larger noise intensity and a lower complexity of the compressed image. Third, the prediction has to be carried out with appropriate accuracy. We mean here that the STD of the prediction error should be small, about 0.2 dB, for the metrics ΔPSNR and ΔPHVS.

We do not know at the moment which statistical parameter is best for our purpose and try not to invent new statistical parameters that could be used in the prediction (this is outside the scope of this paper). Therefore, let us analyze the applicability of statistical parameters that have already been used with a certain success.47 Let us use the aforementioned probabilities P2σ and P2.7σ. More formally, the probability P2σ is defined as the mean of the local probabilities estimated for the considered image blocks as

$$P_{2\sigma}(p,q) = \left\{ \left[ \sum_{k=0}^{7} \sum_{l=0}^{7} \delta_{2\sigma}(p,q,k,l) \right] - \delta_{2\sigma}(p,q,0,0) \right\} / 63, \qquad \delta_{2\sigma}(p,q,k,l) = \begin{cases} 1, & \text{if } |B(p,q,k,l)| \le 2\sigma_0 \\ 0, & \text{otherwise,} \end{cases} \tag{8}$$

where $B(p,q,k,l)$ denotes the $kl$'th DCT coefficient calculated for the 8×8-pixel block whose upper-left corner corresponds to the image's $pq$'th pixel.

Similarly, the probability P2.7σ is estimated as the mean of local probabilities in image blocks:

$$P_{2.7\sigma}(p,q) = \left\{ \left[ \sum_{k=0}^{7} \sum_{l=0}^{7} \delta_{2.7\sigma}(p,q,k,l) \right] - \delta_{2.7\sigma}(p,q,0,0) \right\} / 63, \qquad \delta_{2.7\sigma}(p,q,k,l) = \begin{cases} 1, & \text{if } |B(p,q,k,l)| > 2.7\sigma_0 \\ 0, & \text{otherwise.} \end{cases} \tag{9}$$
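Both probabilities can be estimated in one pass over nonoverlapping 8×8 blocks, following Eqs. (8) and (9). The sketch below constructs the orthonormal DCT matrix explicitly; the sanity check relies on the fact that for pure AWGN the AC coefficients of an orthonormal DCT are themselves Gaussian with variance σ0², so P2σ should be near 0.95:

```python
import numpy as np

def dct_matrix(n=8):
    # orthonormal DCT-II basis matrix
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def block_probabilities(img, sigma0):
    """Mean of the local probabilities of Eqs. (8) and (9) over
    nonoverlapping 8x8 blocks; the DC coefficient B(p,q,0,0) is
    excluded, hence the division by 63."""
    C = dct_matrix(8)
    p2, p27 = [], []
    for p in range(0, img.shape[0] - 7, 8):
        for q in range(0, img.shape[1] - 7, 8):
            B = C @ img[p:p + 8, q:q + 8] @ C.T
            mags = np.abs(B).ravel()[1:]        # 63 AC magnitudes, DC dropped
            p2.append(np.mean(mags <= 2.0 * sigma0))
            p27.append(np.mean(mags > 2.7 * sigma0))
    return float(np.mean(p2)), float(np.mean(p27))

# sanity check on pure noise: P_2sigma ~ 0.95, P_2.7sigma ~ 0.007
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 10.0, (128, 128))
print(block_probabilities(noise, 10.0))
```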

Consider now the methodology of obtaining and processing the simulated data. In Ref. 29, 12 grayscale test images have been used. Eight of them can be considered standard optical test images: Baboon, Stream&Bridge, Man, Lenna, Tiffany, F16, Peppers, and Boats. These images have different complexity, starting from rather simple ones such as Lenna and ending with the complex-structure (textural) test image Baboon. The four other test images are more related to RS. They are the images Frisco, Aerial, and Airfield, shown in Fig. 1, and the quite popular image San Diego. The results presented in Ref. 29 have already shown that the observed tendencies are the same for both groups of images. The same tendencies for the two groups of test images have earlier been found in Ref. 48. Thus, the origin of the image content does not essentially influence the results. The main requirement is to have a rather large number of test images with properties varying within wide limits.48

Another aspect is what levels of noise to study. In Ref. 29, we considered three noise levels characterized by variances equal to 25, 100, and 225. This makes the probabilities P2σ and P2.7σ vary in the limits from 0.53 to 0.93 and from 0.03 to 0.35, respectively. Meanwhile, both probabilities can potentially vary from 0 to 1. Thus, since different values of these probabilities can be met in practice, we had to consider wider limits of probability variation in the simulated noisy images. For this purpose, we have also added noisy images corrupted by AWGN with σ0 (STD) equal to 0.5, 1, 2, and 20.

Let us present the results obtained for rather small noise STD values (0.5, 1.0, and 2.0) that are at the same level as the "internal noise" STD of the test images. The simulation data are given for the AGU coder in Table 1. The values of P2σ vary in the limits from 0.07 (very close to zero) to over 0.6 (for simple-structure test images if σ0 = 2.0). In turn, the values of P2.7σ vary from 0.23 to about 0.9 (i.e., almost unity for σ0 = 0.5). The values of MSEtc/σ0² are larger than unity (close to 2) and ΔPSNR is about −3 dB, respectively, for almost noise-free images (if the STD equals 0.5 or 1.0). Only if the noise STD is equal to 2 might MSEtc/σ0² differ from 2 and be substantially smaller (this happens for the simplest-structure images).

Table 1: Simulation results for compression in the optimal operation point (OOP) neighborhood for the AGU coder for small STD values.

Analysis of ΔPHVS values shows that they are even smaller than ΔPSNR, ranging from −2.5 to −4.6 dB. Obviously, the visual quality of images compressed in the considered way (with QS equal to 3.5σ0) does not improve. Moreover, according to the analyzed metrics, the visual quality becomes worse. However, an interesting fact is that for all cases considered in Table 1, the introduced distortions are not visually seen46 since PSNR-HVS-Mnc exceeds 47 dB (see data in the lines for each STD in Table 1). CR values in these cases are small (see the corresponding column). For σ0 = 0.5, CR is about 2.0, and it does not exceed 4.5 for σ0 = 2.0 (when QS = 3.5σ0 = 7) and simple-structure images. Thus, for these three values of noise variance, one practically deals with near-lossless image compression.7

Let us now present the data for four other values of noise STD: 5, 10, 15, and 20. They are given in Table 2 and grouped by noise STD. For each group, we put the images in such an order that the values of P2.7σ (in the second column) decrease.

Table 2: Simulation results for compression in the OOP neighborhood for the AGU coder for middle and large STD values.

Analysis of data presented in Table 2 shows the following:

  1. Now the values of P2σ and P2.7σ (together with the data in Table 1) practically cover the entire interval of their possible variation (from almost zero to almost unity).
  2. Within each group, P2σ tends to increase while P2.7σ decreases. This is not surprising since the probabilities P2σ and P2.7σ are mutually dependent.
  3. The values of P2σ are the smallest and the values of P2.7σ are the largest for the most complex images corrupted by nonintensive noise (this was also seen for the data in Table 1). Accordingly, P2σ increases and P2.7σ decreases for simple structure images and/or intensive noise.
  4. This means that the two considered statistical parameters jointly indicate noise intensity and image complexity, which is exactly the property required of statistical parameters used for prediction.
  5. OOP most probably does not exist for complex structure images corrupted by nonintensive noise (then MSEtc/σ0 is larger than unity, and ΔPSNR and ΔPHVS are negative). In this case, one has to decide what QS to set (one possible decision is to set QS smaller than 3.5σ0).
  6. According to the metrics MSEtc/σ0 and ΔPSNR, there is a high probability that OOP exists if P2σ exceeds 0.81 or P2.7σ is less than 0.11. More insight into this threshold will be given later.
  7. According to ΔPHVS, OOP exists with high probability if P2σ exceeds 0.88 or P2.7σ is less than 0.05; this happens only for simple structure images with intensive noise.
  8. An obvious OOP (ΔPSNR reaching a few dB and ΔPHVS exceeding 1 dB) occurs only for simple structure images and high-intensity noise.
  9. Noise in the original images is visible, and distortions due to lossy compression are visible as well, if PSNR-HVS-Mnc is of the order of 30 to 40 dB and ΔPHVS is from −1 to −3 dB; this takes place for P2σ in the limits from 0.55 to 0.85 or for P2.7σ in the limits from 0.06 to 0.35.
  10. While for σ0=5 CR varies in the limits from 3.86 to 7.79, it is considerably larger for σ0=20 and can exceed 20. This is due to the large QS value being used.
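The two block-DCT statistics analyzed above are cheap to compute. Below is a minimal sketch, assuming (as our reading of the methodology suggests) that P2σ is the fraction of AC DCT coefficients in nonoverlapping 8×8 blocks with magnitudes below 2σ0 and P2.7σ the fraction above 2.7σ0; the function and variable names are ours, not the authors' reference implementation:

```python
import numpy as np
from scipy.fft import dctn  # 2-D DCT-II, orthonormal with norm="ortho"

def dct_block_probabilities(image, sigma0, block=8):
    """Fractions of AC DCT coefficients in nonoverlapping 8x8 blocks
    with magnitude below 2*sigma0 (P_2sigma) and above 2.7*sigma0
    (P_2.7sigma). Assumed reading of the paper's statistics."""
    n_below = n_above = n_total = 0
    for i in range(0, image.shape[0] - block + 1, block):
        for j in range(0, image.shape[1] - block + 1, block):
            coefs = dctn(image[i:i + block, j:j + block].astype(float),
                         norm="ortho")
            ac = np.abs(coefs).ravel()[1:]  # discard the DC coefficient
            n_below += int(np.count_nonzero(ac < 2.0 * sigma0))
            n_above += int(np.count_nonzero(ac > 2.7 * sigma0))
            n_total += ac.size
    return n_below / n_total, n_above / n_total

# Sanity check on pure AWGN: the orthonormal DCT keeps the coefficients
# Gaussian with the same STD, so P_2sigma should be near 0.954.
rng = np.random.default_rng(0)
p2, p27 = dct_block_probabilities(rng.normal(0.0, 10.0, (256, 256)), 10.0)
```

For pure AWGN the sum p2 + p27 comes out close to unity, consistent with the near-complementary behavior of the two probabilities noted in the text.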

Data in Tables 1 and 2 could also be presented in another manner; in particular, the data for each test image could be given separately. Such a study is possible by jointly analyzing the data in Tables 1 and 2 for any particular test image. Consider, for example, the test image Airfield. If σ0 equals 0.5, 1, or 2 (see Table 1), P2σ is small, MSEtc/σ0 is about 2.0 and, respectively, ΔPSNR is about −3 dB. The metric ΔPHVS is about −4 dB, i.e., formal lossy compression leads to image quality degradation. However, the quality of the original images is so high that noise in them cannot be noticed by visual inspection; the introduced losses cannot be noticed either. Hence, it is possible to ignore these losses. For σ0=5 (see Table 2), we have P2σ=0.60 and noise becomes visible. MSEtc/σ0 is still larger than unity but starts decreasing, and ΔPSNR and ΔPHVS are both negative. This means that lossy compression leads to considerable (visually noticeable) image quality degradation, and it is desirable to avoid or to diminish this effect. Similar effects are observed for σ0=10: one has quite a large P2σ=0.78, but MSEtc/σ0 is still larger than unity, while ΔPSNR and ΔPHVS are both negative. This means that compression with QSOOP=3.5σ0 is undesirable. The situation starts changing radically for σ0=15 (P2σ=0.84). The ratio MSEtc/σ0 turns out to be less than unity, and ΔPSNR is positive, while ΔPHVS is still negative. Thus, OOP exists, at least for the metrics MSEtc/σ0 and ΔPSNR. Finally, for σ0=20, we have an obvious OOP according to the metrics MSEtc/σ0 and ΔPSNR.

The observations and conclusions for other test images are similar: if σ0 increases, P2σ becomes larger, P2.7σ reduces, MSEtc/σ0 decreases, and ΔPSNR, ΔPHVS, and CR increase. Because of this, it is more probable that OOP exists.

Such dependences are observed not only for the 8-bit images considered above. Suppose that all values of the noisy image Iij^n and the true image Iij^true, i=1,…,IIm; j=1,…,JIm, are multiplied by some constant C>1 so that more than 8 bits are needed to represent the data. Then the noise standard deviation, the dynamic range, and QSOOP increase by C times, while P2σ, P2.7σ, ΔPSNR, and ΔPHVS do not change. Thus, the conclusions are valid for other than 8-bit data representations as well.
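This invariance is easy to check numerically: the DCT is linear, so scaling the image by C scales every coefficient by C, and a fraction counted against thresholds that scale by the same C is unchanged. A toy sketch (the P2σ definition is our assumption; C is chosen as a power of two so the scaling is exact in floating point):

```python
import numpy as np
from scipy.fft import dctn

def p2sigma(image, sigma):
    """Fraction of AC DCT coefficients (nonoverlapping 8x8 blocks)
    with magnitude below 2*sigma (assumed definition of P_2sigma)."""
    below = total = 0
    for i in range(0, image.shape[0], 8):
        for j in range(0, image.shape[1], 8):
            ac = np.abs(dctn(image[i:i + 8, j:j + 8],
                             norm="ortho")).ravel()[1:]
            below += int(np.count_nonzero(ac < 2.0 * sigma))
            total += ac.size
    return below / total

rng = np.random.default_rng(1)
img = rng.normal(100.0, 5.0, (64, 64))  # toy noisy image, sigma0 = 5
C = 16.0                                # scaling that widens the bit depth
# dctn(C * img) == C * dctn(img), so the probability is unchanged when
# both the data and the noise STD are scaled by C.
p_8bit = p2sigma(img, 5.0)
p_scaled = p2sigma(C * img, C * 5.0)
```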

Consider now other approaches to the aggregated presentation of the obtained data. Figure 3 shows the scatter plot (ΔPSNR versus P2σ) for the AGU coder with four fitted polynomials of the second, third, fourth, and fifth orders. Fitting has been carried out with standard MATLAB tools; several parameters characterizing the fitting quality are presented, the main ones being the coefficient of determination R2 and the root mean square error (RMSE).50 It is usually considered that an R2 larger than 0.9 indicates good fitting and, together with a small RMSE, the possibility of accurate prediction. The obtained polynomials are also overlaid on the scatter plots to enable careful analysis.
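The fitting step is straightforward to reproduce outside MATLAB. A hedged sketch in Python with synthetic stand-in data (the paper's actual scatter values are not reproduced here; the function name is ours):

```python
import numpy as np

def fit_poly_with_stats(x, y, order):
    """Least-squares polynomial fit plus the two goodness-of-fit figures
    discussed in the text: the coefficient of determination R^2 and RMSE."""
    coeffs = np.polyfit(x, y, order)
    residuals = y - np.polyval(coeffs, x)
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y - np.mean(y)) ** 2)
    rmse = np.sqrt(np.mean(residuals ** 2))
    return coeffs, float(r2), float(rmse)

# Synthetic stand-in for the (P_2sigma, dPSNR) scatter: a smooth monotone
# trend plus noise.
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, 200)
y = 8.0 * (x - 0.8) ** 3 + rng.normal(0.0, 0.1, x.size)

results = {order: fit_poly_with_stats(x, y, order)[1:]
           for order in (2, 3, 4, 5)}
# results[order] == (R^2, RMSE); on the training data R^2 cannot decrease
# as the order grows, mirroring the tendency reported in the paper.
```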

Fig. 3: The scatter plot (ΔPSNR versus P2σ) for the AGU coder with fitted polynomials of the (a) second, (b) third, (c) fourth, and (d) fifth orders.

Analysis of the data shows the following. A higher order of the used polynomial results in a better approximation according to all analyzed quantitative criteria. It seems sufficient to use the fifth-order polynomial since it reflects the behavior of the considered dependence correctly and accurately. The fitted curves allow accurate determination of the point where ΔPSNR becomes positive: the best fitted curves cross the zero level at P2σ of about 0.80 to 0.82, which agrees with conclusion 6 above.
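Reading the threshold off a fitted curve amounts to locating its negative-to-positive zero crossing. A small sketch with an illustrative polynomial (the coefficients are invented to cross zero near 0.81; they are not the paper's fitted values):

```python
import numpy as np

def positive_crossing(coeffs, lo=0.0, hi=1.0):
    """Root of the polynomial (highest-order coefficient first) inside
    [lo, hi] where the value goes from negative to positive, or None."""
    deriv = np.polyder(coeffs)
    candidates = sorted(r.real for r in np.roots(coeffs)
                        if abs(r.imag) < 1e-9 and lo <= r.real <= hi)
    for r in candidates:
        if np.polyval(deriv, r) > 0:  # increasing => neg-to-pos crossing
            return float(r)
    return None

# Illustrative cubic (p - 0.81)(p^2 + 1): a single real root at 0.81.
threshold = positive_crossing([1.0, -0.81, 1.0, -0.81])
```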

Let us present the data for the prediction of ΔPSNR based on P2.7σ. The tendencies with respect to the polynomial order are the same as for the scatter plot of ΔPSNR versus P2σ (see Table 3). For this reason, only the data for the fifth-order polynomial are presented in Fig. 4(a). The fitting is not as good as for the scatter plot of ΔPSNR versus P2σ (R2 is slightly smaller and RMSE is larger). This means that it is better to use the statistical parameter P2σ for the prediction.

Table 3: Parameters of the fitted polynomials.
Fig. 4: Scatter plots (a) ΔPSNR versus P2.7σ and (b) ΔPHVS versus P2σ and the fitted fifth-order polynomials.

Let us now analyze the results obtained for the ΔPHVS prediction. Data for the prediction based on P2σ are presented in the lower part of Table 3. As is seen, better results are obtained by increasing the polynomial order, and the fifth order seems to be enough [see also the scatter plot and the fitted curve in Fig. 4(b)]. Obviously, OOP exists only if P2σ exceeds 0.9. Analysis of the data obtained for P2.7σ has shown that, again, the prediction is less accurate (R2 reaches 0.946 and RMSE attains 0.47 for the fifth-order polynomial).

Let us consider the data for the ADCT coder (with the recommended QS=3.5σ0). Some data in table format are presented in Ref. 29; they confirm conclusions 1 and 2 given above. Here, we prefer to present the best results as scatter plots for the fifth-order polynomial and the prediction based on the probability P2σ (the results for the probability P2.7σ are again worse). These scatter plots are depicted in Fig. 5.

Fig. 5: Scatter plots (a) ΔPSNR versus P2σ and (b) ΔPHVS versus P2σ and the fitted fifth-order polynomials for the ADCT coder.

The scatter plot in Fig. 5(a) is very similar to the scatter plot in Fig. 3(d). The only difference is that the fitted curve crosses the zero level at a smaller P2σ (about 0.75). This means that, in practice, the probability of having OOP for the ADCT coder is larger than for the AGU coder. Other conclusions for the ADCT coder practically coincide with conclusions 3 to 5 given above for the AGU coder.

The scatter plot in Fig. 5(b) is very similar to the one in Fig. 4(b). Conclusions 7 and 8 given above for AGU are, in general, valid for the ADCT coder as well. The difference is that the fitted curve crosses the zero level at a smaller P2σ (about 0.87). This means that the ADCT coder provides a slightly better visual quality of the compressed noisy images (under the condition of the same QS).

The results for the AGU-M coder have been obtained for SF=2.0σ0. Some quantitative data are given in Table 3 of Ref. 29. Here, we prefer to present the obtained results as scatter plots (see Fig. 6). The best fitting has been provided by the fourth-order (for ΔPSNR) and fifth-order (for ΔPHVS) polynomials; therefore, we give the fitted curves only for these cases.

Fig. 6: Scatter plots (a) ΔPSNR versus P2σ and (b) ΔPHVS versus P2σ and the fitted (a) fourth- and (b) fifth-order polynomials for the AGU-M coder.

Analysis shows the following. OOP does not occur for complex structure images corrupted by nonintensive noise. According to the metric ΔPSNR, there is a high probability that OOP exists if P2σ exceeds 0.8 [this is seen in Fig. 6(a)] or if P2.7σ is less than 0.11. According to the metric ΔPHVS, OOP very likely exists if P2σ exceeds 0.9 [see the zero crossing in Fig. 6(b)] or if P2.7σ is less than 0.05.

There are no obvious advantages of the AGU-M coder compared to the AGU and ADCT coders. The results are similar for ΔPSNR if P2σ>0.7 and for ΔPHVS if P2σ>0.5. However, there are essential differences for small P2σ, especially for the metric ΔPSNR. It might seem a serious drawback of the AGU-M coder that ΔPSNR for P2σ<0.5 is about −6 to −12 dB, but this is not a problem. Such large negative values of ΔPSNR occur for nonintensive noise, when PSNRor is high (over 40 dB, see Table 1). Then PSNRtc is also large, and the introduced distortions are not seen. Values of ΔPHVS of about −3 to −6 dB could be regarded as large (a substantial loss of visual quality) if PSNR-HVS-Mnc were rather low (e.g., 30 to 35 dB). However, ΔPHVS of about −3 to −6 dB occurs for noisy images with P2σ<0.5, i.e., for nonintensive noise; then the visual quality is very high and distortions in the compressed data cannot be noticed.

Consider now the results for the ADCT-M coder (SF=2.0σ0). For brevity, we present only the scatter plots (Fig. 7) and the fitted polynomials of the fourth (for ΔPSNR) and fifth (for ΔPHVS) orders. Analyzing the metric ΔPSNR, we can conclude that OOP exists if P2σ exceeds 0.80 [see the fitted curve in Fig. 7(a)] or if P2.7σ is lower than 0.10. If the metric ΔPHVS is considered, compression in an OOP neighborhood is quite probable if P2σ exceeds 0.88 [see the fitted curve in Fig. 7(b)] or if P2.7σ is less than 0.06. The results are, in general, very close to the corresponding data for the AGU-M coder (see Fig. 6). Compared to the AGU and ADCT coders, the ADCT-M coder has no obvious advantages.

Fig. 7: Scatter plots (a) ΔPSNR versus P2σ and (b) ΔPHVS versus P2σ and the fitted (a) fourth- and (b) fifth-order polynomials for the ADCT-M coder.

Completing our analysis for the test images and simulated AWGN, we can state the following. The proposed prediction methodology is fast and sufficiently accurate. The AGU coder and, especially, the ADCT coder provide, on average, better results for the considered cases of noisy image lossy compression than the AGU-M and ADCT-M coders, especially when visual quality is targeted. Three types of practical situations can be distinguished.

The “first” situation takes place for low-intensity noise, which is invisible in the original images. It is difficult to expect that OOP exists, and formal lossy compression (according to quantitative criteria) leads to worse image quality. This is usually not a problem since the introduced distortions are small and either invisible or hardly noticeable. Then it seems reasonable to carry out compression at OOP according to the earlier recommendations. Such a decision can be made if, e.g., P2σ<0.6.

The “second” situation is that ΔPSNR and ΔPHVS are negative (or ΔPSNR is positive but small while ΔPHVS is negative). This happens if 0.6<P2σ<0.8. Then, if lossy compression is performed in an OOP neighborhood, there is some filtering effect, but the introduced distortions are at the same level or larger. In such a case, it seems reasonable to carry out a more “careful” lossy compression. An example and a more thorough discussion will be given in the next section.

The “third” practical situation relates to P2σ exceeding 0.8. In this case, it is clearly worth carrying out lossy compression in an OOP neighborhood. Recall that we assume there is no limitation on CR in our approach to lossy compression; that is, we discuss how to make the compressed data appropriate for further use (of high enough quality) while providing a CR that is adaptive to image and noise properties and considerably larger than for lossless compression.
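The three situations reduce to a simple threshold rule on P2σ. A sketch using the thresholds 0.6 and 0.8 from the text (the mode names and function signatures are ours):

```python
def compression_mode(p2s):
    """Classify a noisy image by the predicted usefulness of compression
    in the OOP neighborhood, following the three practical situations."""
    if p2s < 0.6:
        return "near-lossless"  # noise invisible; losses negligible
    if p2s <= 0.8:
        return "careful"        # some filtering, but use a reduced QS/SF
    return "oop"                # compress in the OOP neighborhood

def recommended_qs(p2s, sigma0):
    # QS consistent with the recommendations in the text: the standard
    # 3.5*sigma0 except in the intermediate "careful" interval.
    return 2.0 * sigma0 if compression_mode(p2s) == "careful" \
        else 3.5 * sigma0
```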

For the aforementioned second situation (0.6<P2σ<0.8), a reasonable decision is to accept a smaller CR than that reached with the QS or SF recommended for attaining OOP. One possible decision is to apply a smaller QS or SF, e.g., QS=2.0σ0 for the AGU and ADCT coders and SF=1.2σ0 for the AGU-M and ADCT-M coders. It is then expected that noise suppression is weaker, but the distortions are also smaller than for QS=3.5σ0 or SF=2.0σ0.

Let us see what happens in this case. Figure 8(a) presents the noisy image Airfield (σ0=10). The noisy image has been compressed by the AGU coder with QS=3.5σ0 (in OOP), and the result is demonstrated in Fig. 8(b). A noise-filtering effect can be observed in homogeneous image regions (see the wide road in the lower left part). The introduced distortions are seen as well (sharp edges are partly smeared). The provided CR=6.35, and ΔPSNR and ΔPHVS are both negative (equal to −0.64 and −2.61 dB, respectively; see Table 2), with P2σ=0.78, i.e., we deal with the second situation, and, according to our recommendation, it is reasonable to use a smaller QS.

Fig. 8: Image Airfield: (a) original noisy image and versions compressed with (b) QS=3.5σ0 and (c) QS=2.0σ0.

The image compressed by the same coder but with QS=2.0σ0 is shown in Fig. 8(c). Visual analysis shows that the compressed image quality is higher. This conclusion is confirmed by the quantitative metrics: ΔPSNR and ΔPHVS are equal to −0.23 and −1.10 dB, respectively, i.e., considerably smaller in magnitude than in the previous case. The negative outcome of this decision is that CR reduces to 4.0. Certainly, in practice, one has to take into account both the quality of the compressed images and the CR; the final decision is to be made by the user.

Along with a hard decision to apply a smaller fixed QS or SF if 0.6<P2σ<0.8, soft adaptation is also possible. One variant is to use

QS(n) = σ0[2 + (P2σ(n) − 0.7)^2 × 150]  if 0.6 ≤ P2σ(n) ≤ 0.8,  QS(n) = 3.5σ0 otherwise,  (10)

SF(n) = σ0[1.2 + (P2σ(n) − 0.7)^2 × 80]  if 0.6 ≤ P2σ(n) ≤ 0.8,  SF(n) = 2.0σ0 otherwise.  (11)
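In code form, rules (10) and (11) are a parabolic dip of the compression parameter over the intermediate interval of P2σ; note that both branches meet continuously at P2σ=0.6 and 0.8 (a direct transcription, with names of our choosing):

```python
def adaptive_qs(p2s, sigma0):
    """Quantization step for the AGU/ADCT coders, Eq. (10): dips to
    2.0*sigma0 at P_2sigma = 0.7, returns to 3.5*sigma0 at 0.6 and 0.8."""
    if 0.6 <= p2s <= 0.8:
        return sigma0 * (2.0 + (p2s - 0.7) ** 2 * 150.0)
    return 3.5 * sigma0

def adaptive_sf(p2s, sigma0):
    """Scaling factor for the AGU-M/ADCT-M coders, Eq. (11): dips to
    1.2*sigma0 at P_2sigma = 0.7, returns to 2.0*sigma0 at 0.6 and 0.8."""
    if 0.6 <= p2s <= 0.8:
        return sigma0 * (1.2 + (p2s - 0.7) ** 2 * 80.0)
    return 2.0 * sigma0
```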

Only this variant has been applied to the componentwise compression of real-life data.

Based on the obtained data and the analysis carried out, we can propose the following automatic procedure for RS data lossy compression. The first operation is estimation of the noise characteristics, e.g., the noise variance for AWGN, for each image to be compressed if the noise characteristics are a priori unknown (the case when the noise is not additive is considered below). Then the statistical parameter P2σ has to be calculated. It can be substituted into the corresponding expression to calculate a given metric (e.g., ΔPSNR) for a given coder (this can be needed if the decision on the PCC setting is made by a human or is based on automatic analysis of ΔPSNR and/or ΔPHVS). The determination of ΔPSNR and/or ΔPHVS can be unnecessary if the decision is made on the basis of comparing P2σ with the thresholds 0.6 and 0.8, as shown at the end of Sec. 3.

The procedure described above can be applied componentwise if multichannel images have to be compressed. Let us show this for Hyperion hyperspectral data. Several aspects have to be taken into account here. First, the noise in such images is not additive (it is signal dependent23,24,27). This can be overcome by applying a special algorithm based on a generalized Anscombe transform30 described in Ref. 29; after the direct transform, the noise becomes practically AWGN with unity variance. Second, one might doubt whether OOP exists for an image compressed after the direct transform and then inverse transformed. The simulation results presented in Refs. 20 and 28 show that it does. Third, for real-life RS data, we do not have the true image, so no metric Metrtc can be calculated. Thus, we can only calculate the predicted values of ΔPSNR and/or ΔPHVS based on the estimated P2σ or P2.7σ and observe how these predicted values behave depending upon the sub-band index. Fourth, some sub-bands of Hyperion RS data are usually not exploited (those with indices 1 to 12, 58 to 76, and 224 to 242); we do not present data for these sub-bands.
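The variance-stabilization step can be illustrated with the classical generalized Anscombe transform for a Poisson-Gaussian model with noise variance σadd^2 + k·I. This is only a textbook sketch: the paper applies the algorithm of Ref. 30 as adapted in Ref. 29, which may differ in details:

```python
import numpy as np

def generalized_anscombe(z, k, sigma_add, mu=0.0):
    """Classical generalized Anscombe transform for Poisson-Gaussian
    noise with variance sigma_add^2 + k*I. After the transform the noise
    is approximately additive white Gaussian with unit STD."""
    arg = k * z + (3.0 / 8.0) * k * k + sigma_add ** 2 - k * mu
    return (2.0 / k) * np.sqrt(np.maximum(arg, 0.0))

# Stabilization check: the output STD should be near unity regardless
# of the (signal-dependent) input noise level.
rng = np.random.default_rng(3)
k, sigma_add, n = 2.0, 1.0, 100_000
stds = []
for level in (20.0, 80.0, 200.0):
    z = k * rng.poisson(level / k, n) + rng.normal(0.0, sigma_add, n)
    stds.append(float(np.std(generalized_anscombe(z, k, sigma_add))))
```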

Let us start by analyzing the values of P2σ and P2.7σ for real-life data (the set EO1H1800252002116110KZ). The plots are given in Fig. 9. Analysis shows that both probabilities vary practically from zero to unity. Therefore, our earlier decision to obtain fitting curves for the entire intervals of probability variation was correct (see the beginning of Sec. 3). Joint analysis of the plots of P2σ and P2.7σ shows that they behave in an “opposite manner,” the sum P2σ+P2.7σ being close to unity.

Fig. 9: Dependences of (a) P2σ and (b) P2.7σ on sub-band index n for the real-life Hyperion data set EO1H1800252002116110KZ (sub-bands with indices 1 to 12, 58 to 76, and 224 to 242 are usually not exploited).

Analysis also shows that there is a very small percentage of sub-bands for which P2σ>0.8, i.e., for which compression in an OOP neighborhood providing visible noise-filtering effects can be carried out; these are the sub-bands with indices 13, 14, and 127. There are also many sub-bands [most sub-bands of the optical range, 14<n<21, and some sub-bands of the infrared range (n=83, 125, 127, 173, 175, 177 to 181)] for which “careful” compression is desirable. Recall that n from 13 to 57 relates to the optical range and n from 83 to 224 to the infrared range of the Hyperion sensor.

These conclusions also follow from the analysis of the predicted values of ΔPSNR(n) for the AGU coder shown in Fig. 10. As is seen, the prediction results obtained from P2σ and P2.7σ are very similar and usually differ by not more than 1 dB. The predicted values of ΔPHVS(n) based on P2σ are depicted in Fig. 11. They confirm the conclusions and recommendations given above. The number of sub-bands for which quality improvement of images compressed in the OOP neighborhood is expected is quite small; this is probable only for sub-bands 13 and 14 of the optical range.

Fig. 10: Predicted ΔPSNR(n) based on (a) P2σ and (b) P2.7σ for the real-life Hyperion data set EO1H1800252002116110KZ (sub-bands with indices 1 to 12, 58 to 76, and 224 to 242 are usually not exploited).

Fig. 11: Predicted ΔPHVS(n) based on P2σ for the real-life Hyperion data set EO1H1800252002116110KZ (sub-bands with indices 1 to 12, 58 to 76, and 224 to 242 are usually not exploited).

The dependence CR(n) for a fixed QS(n)=3.5 (recommended for OOP; the noise STD equals unity after the generalized Anscombe transform) is presented in Fig. 12 (blue line). This dependence correlates with all dependences presented in Figs. 10 and 11. CR is quite large (exceeds 10) for sub-bands where the noise is quite intensive (or, more correctly, where the input PSNR or SNR is quite small). In other sub-bands, the noise is negligible; because of this, compression is carried out “carefully,” introducing relatively small and practically invisible distortions, and CR stays in the limits 4 to 10.

Fig. 12: Dependence of CR(n) on sub-band index for the real-life Hyperion data set EO1H1800252002116110KZ (sub-bands with indices 1 to 12, 58 to 76, and 224 to 242 are usually not exploited). Red points relate to the proposed procedure [Eq. (10)].

Even more “careful” compression with a slightly smaller CR takes place if QS(n) is set adaptively according to Eq. (10) [CR for such sub-bands is shown in red; it can be considerably smaller than for QS(n)=3.5]. Note that the direct VST has been carried out as described in Ref. 29.

It might seem that the obtained values of CR are only slightly larger than those for advanced lossless compression techniques. However, we expect that the proposed approach can be extended to 3-D compression applied to groups of sub-bands.21,29 Then an approximately twofold increase of CR is to be expected without additional degradation of the compressed image quality.

Figure 13 shows the original (uncompressed) image in the 14th sub-band and the image compressed using the proposed procedure with VST (after decompression and inverse VST). It is seen that noise is partly suppressed and the introduced distortions are not essential. In turn, Fig. 14 demonstrates original and decompressed images in the 110th sub-band. Noise cannot be seen in the original image and the compression practically does not influence the image quality (both images look the same).

Fig. 13: The 14th sub-band images (a) before and (b) after compression.

Fig. 14: The 110th sub-band images (a) before and (b) after compression.

The proposed approach to lossy compression has also been tested on hyperspectral images acquired by the airborne sensor AVIRIS. A peculiarity of this sensor is that it has four ranges, with 32 sub-bands in the optical range and 64 sub-bands in each of the three infrared ranges. The dependences of P2σ(n) and ΔPSNR(n) for the standard test image Lunar Lake are shown in Fig. 15. As is seen, there are many sub-bands for which P2σ(n)>0.8 and many for which 0.6<P2σ(n)<0.8. The obtained dependence of CR(n) is given in Fig. 16, where the curve for fixed QS(n)=3.5 is again shown in blue and the red dots indicate the CR values obtained with QS(n) determined according to the adaptive algorithm [Eq. (10)] for the AGU coder.

Fig. 15: Dependences of (a) P2σ and (b) ΔPSNR on sub-band index n for the real-life AVIRIS data set Lunar Lake.

Fig. 16: Dependence of CR(n) on sub-band index for the real-life AVIRIS data set Lunar Lake. Red points relate to the proposed procedure [Eq. (10)].

The conclusions that follow from the analysis of the data in Fig. 16 are as follows. There are sub-bands where the noise intensity is high (e.g., optical range sub-bands with n=1, 2, 3, and 4), for which P2σ is large and compression is carried out providing OOP and CR>10. Meanwhile, there are also sub-bands where the compression is “careful” and, thus, CR is only about 4.

In our opinion, the proposed approaches and the obtained results can also be useful for applications other than RS. First, the results for the optical test images (see Tables 1 and 2) do not differ from those obtained for the test RS images, i.e., the observed dependences and tendencies are general enough. Second, the results can be interesting for researchers dealing with the compression of medical images and video, where the prediction of ΔPSNR or ΔPHVS can be exploited to properly set QS or SF.