Monday, August 6, 2012

SOCIS - part 1 (introduction)

In this post I will talk about the SOCIS project I have recently been accepted to: GICI Delta's "Stabilization of an adaptive multi-line transform". The project's description is:
Stabilization of an adaptive multi-line transform
One of the transforms available in Delta is the Pairwise Orthogonal Transform (POT), which is applied independently in a line by line basis. We would like to have the possibility of introducing an smoothing factor so that two adjacent lines could be coded using a transform adapted for a particular line, but also without a step discontinuity from adjacent lines, so that the coding performance is improved. 
To elaborate on the subject, GICI stands for "Group on Interactive Coding of Images", and its main focus is the study of image coding techniques, with a particular interest in satellite image coding. In this case, we are dealing with hyperspectral images.

Hyperspectral images


Ordinary color images are typically represented in the RGB color model. A 24-bit bitmap image is an m x n matrix (m is the number of lines and n is the number of columns) in which 24 bits are used to represent each pixel. Since each pixel is a mixture of red, green and blue, each color gets 8 bits. We are therefore actually dealing with an m x n x 3 matrix, where 3 is the number of colors, or spectral bands.
A hyperspectral image contains a large number of spectral bands, as opposed to the familiar 3, and is therefore an m x n x z matrix, where z is the number of bands. The properties that characterize a hyperspectral image are: its dimensions and number of bands, the precision used for representing values, the interleaving of bands (more information can be found here [1]) and the byte order (little endian or big endian).
Some sample images can be found here [2] and a simple Matlab script that I have written for visualizing such images can be found here [3].
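To make these properties concrete, here is a minimal Python sketch (not the Matlab script from [3]) that decodes a headerless raw buffer into a cube indexed as [band, line, column]. The function name `load_raw_cube`, its parameters and the three interleave labels (`bsq`, `bil`, `bip`) are my own assumptions for illustration; the actual values for a given image have to come from its documentation.

```python
import numpy as np

def load_raw_cube(raw, bands, lines, columns,
                  sample_type="u2", byte_order="big", interleave="bsq"):
    """Decode raw bytes into an array indexed as [band, line, column].

    sample_type is a NumPy type code (e.g. "u2" = unsigned 16-bit);
    all names and defaults here are illustrative assumptions.
    """
    prefix = ">" if byte_order == "big" else "<"
    samples = np.frombuffer(raw, dtype=np.dtype(prefix + sample_type))
    if samples.size != bands * lines * columns:
        raise ValueError("buffer size does not match the given dimensions")
    if interleave == "bsq":   # band sequential: stored as band, line, column
        return samples.reshape(bands, lines, columns)
    if interleave == "bil":   # band interleaved by line: line, band, column
        return samples.reshape(lines, bands, columns).transpose(1, 0, 2)
    if interleave == "bip":   # band interleaved by pixel: line, column, band
        return samples.reshape(lines, columns, bands).transpose(2, 0, 1)
    raise ValueError("unknown interleave: " + interleave)

# Tiny synthetic example: 2 bands, 3 lines, 4 columns, big-endian 16-bit.
demo = np.arange(24, dtype=">u2").reshape(2, 3, 4)
cube = load_raw_cube(demo.tobytes(), bands=2, lines=3, columns=4)
print(cube.shape)
```

For a real file, the bytes would be read from disk first, and the same call would be made with the dimensions, precision, interleaving and byte order listed for that image.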

Yellowstone


Compression


Since hyperspectral images can take up a large amount of storage, a general compression scheme is needed in order to reduce their size. Compression can be either lossless (without loss of information) or lossy (with loss of information). The type of compression scheme we are interested in uses transform coding, which applies a transform to the initial data in order to obtain a better representation of the information content, so that it can be compressed more easily. Commonly used transforms for hyperspectral images are the KLT (Karhunen-Loeve Transform) and the wavelet transform. KLT has a greater coding performance than wavelets, but it also has a higher computational cost, greater memory requirements, a more difficult implementation and a lack of scalability.

Karhunen-Loeve Transform


In a nutshell, KLT is applied to a set of vectors which represent lines or band-lines from a given image. The transform depends on the correlation between the vectors, so the first step is to compute their covariance matrix. The transform matrix is obtained from this matrix (its rows are the eigenvectors of the covariance matrix) and is then applied to each vector. In practice, determining the covariance matrix for a set of vectors is a computationally expensive task. To address this issue, the Pairwise Orthogonal Transform (POT) was developed, which has a greater coding performance than the wavelet transform and lower computational requirements than KLT.
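The KLT steps described above can be sketched in a few lines of NumPy. This is only an illustration under my own assumptions (each row of the input is one component, e.g. one band flattened into a vector, and the function name `klt` is hypothetical); it is not how Delta implements the transform.

```python
import numpy as np

def klt(components):
    """Apply a KLT to an array of shape (n_components, n_samples).

    Returns the transformed components, the transform matrix and the
    per-component means (needed to invert the transform).
    """
    x = np.asarray(components, dtype=np.float64)
    means = x.mean(axis=1, keepdims=True)
    centered = x - means
    # Covariance between components: an n_components x n_components matrix.
    cov = centered @ centered.T / centered.shape[1]
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]      # largest variance first
    matrix = eigvecs[:, order].T           # rows are eigenvectors
    return matrix @ centered, matrix, means

# Usage on synthetic, strongly correlated "bands".
rng = np.random.default_rng(0)
base = rng.normal(size=1000)
bands = np.vstack([base + 0.1 * rng.normal(size=1000) for _ in range(4)])
transformed, matrix, means = klt(bands)
restored = matrix.T @ transformed + means  # the matrix is orthogonal
```

After the transform most of the variance ends up in the first component, which is what makes the data easier to compress. The covariance step is also where the cost grows quickly with the number of components.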

Pairwise Orthogonal Transform


POT, instead of computing the covariance matrix for all vectors (image components), uses a divide-and-conquer approach in which the resulting transform is a composition of smaller KLT transforms applied to pairs of image components. Assuming we have n components, KLT is applied to components 1 and 2, 3 and 4, and so on. Each of these transforms produces 2 new components, of which only the first one is carried forward to the next level. The process is then repeated with these new components, and so on. We immediately notice the reduced time complexity of the algorithm with respect to KLT. POT is applied line by line, which leads to a reduction in memory usage. In the case of lossy compression at low bitrates, however, artifacts appear in the images because of POT's line-based approach. The main objective of this project is to reduce these artifacts and improve coding performance by introducing a smoothing factor.
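The pairwise structure can be sketched as follows. This is a simplified illustration, not the POT implementation from Delta: the function names are hypothetical, the means are simply removed rather than stored as side information, and passing an unpaired component straight to the next level is my own assumption for handling an odd count. In the line-based setting, each input vector would be one line of one band.

```python
import numpy as np

def pair_klt(a, b):
    """2-component KLT of two zero-mean vectors: (principal, residual)."""
    x = np.vstack([a, b])
    cov = x @ x.T / x.shape[1]
    _, vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    y = vecs[:, ::-1].T @ x         # reorder so the principal one is first
    return y[0], y[1]

def pot_forward(components):
    """Multi-level pairwise transform over a list of 1-D vectors.

    Returns the final principal component and the list of residual
    components produced along the way (n - 1 of them for n inputs).
    """
    current = [np.asarray(c, dtype=np.float64) for c in components]
    current = [c - c.mean() for c in current]   # means dropped in this sketch
    residuals = []
    while len(current) > 1:
        next_level = []
        for i in range(0, len(current) - 1, 2):
            principal, residual = pair_klt(current[i], current[i + 1])
            next_level.append(principal)    # goes on to the next level
            residuals.append(residual)      # final output of this pair
        if len(current) % 2 == 1:           # unpaired component: pass through
            next_level.append(current[-1])
        current = next_level
    return current[0], residuals

# Usage: 5 correlated components of 200 samples each.
rng = np.random.default_rng(1)
base = rng.normal(size=200)
comps = [base + 0.2 * rng.normal(size=200) for _ in range(5)]
top, rest = pot_forward(comps)
```

Only 2 x 2 covariance matrices are ever computed, which is where the savings over a full KLT come from. Because every pairwise step is orthogonal, the total energy of the mean-removed input is preserved across the outputs.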
