This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIP.2014.2363411, IEEE Transactions on Image Processing

An Innovative Lossless Compression Method for Discrete-Color Images

Saif alZahir, Member, IEEE, and Arber Borici, Student Member, IEEE

Abstract — In this paper, we present an innovative method for the lossless compression of discrete-color images, such as map images, graphics, and GIS images, as well as binary images. The method comprises two main components. The first is a fixed-size codebook of 8×8-bit blocks of two-tone data along with their corresponding Huffman codes and their relative probabilities of occurrence. The probabilities were obtained from a very large set of two-color (binary) images and are also used for arithmetic coding. The second component is the row-column reduction coding, which encodes the blocks that are not in the codebook. The proposed method has been successfully applied to two major image categories: (i) images with a predetermined number of discrete colors, such as digital maps, graphs, and GIS images; and (ii) binary images. The results show that our method compresses images from both categories (discrete-color and binary images) by about 90% in most cases and outperforms JBIG2 by 5% to 20% for binary images and by 2% to 6.3% for discrete-color images on average.

Index terms—Binary image compression, discrete-color image compression, digital map compression, charts and graphs compression

(c) 2013 Crown Copyright. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing [email protected].


I. INTRODUCTION

This work presents the details of a new, innovative lossless compression method that is suitable for compressing a broad class of binary and discrete-color images. Our focus has been on constructing a codebook model for bi-level data. This codebook can in turn be used for images with a predetermined number of discrete colors by separating such images into binary layers and compressing the latter. Examples of discrete-color images include political and topographic maps, graphs, and charts.

Consider a dynamical system comprising an information source, which assembles uniformly sized chunks of binary images, and a channel that outputs them. Suppose these chunks resemble mosaic tiles which, when assembled in some way, reproduce and give meaning to the image conveyed by the mosaic. In light of that metaphor, assume that the procedure generating those image chunks obeys a stationary stochastic process: for every time shift, the distribution of such chunks remains the same. Consequently, the chunk probabilities may be used to determine the compression limit for every possible binary image that the fictitious source could assemble, and an appropriate theoretical compression method could then be devised. This approach would lead to a (quasi-)universal and simple method for compressing binary images.

For reasons that will be laid out in Section II, we specified the aforementioned chunks as non-overlapping 8×8-bit blocks. It is appropriate to mention that 8×8 blocks have previously been used in [1] to construct local geometrical primitives for a given binary image. In this work, we provide theoretical as well as empirical reasons for choosing non-overlapping 8×8 blocks. In order to construct a large sample of binary images, we collected and partitioned binary images into 8×8 blocks to examine the overall system entropy. Entropy coders, such as Huffman and arithmetic coding, lend themselves to the idea of a universal codebook comprising pairs of 8×8 blocks and their Huffman or arithmetic codes. To build such a codebook, we assembled a set of binary images randomly collected from different sources and as diverse as possible in their pixel representations. Thereafter, we eliminated images that contained salt-and-pepper noise; this preprocessing proved useful in constructing an unbiased codebook. Finally, we studied the relative probabilities of all blocks in the sample and thus calculated the entropy of binary images based on this relatively large sample.


We used the probability distribution of 8×8 blocks to construct Huffman and arithmetic codes for all blocks with an absolute frequency greater than 1. We are aware that a stochastic discrete dynamical system may assemble an infinitude of binary images, so a precise entropy value cannot be determined. It is not that a hypothetical source such as the one described here cannot produce improbable sequences of blocks, any more than an equivalent source cannot produce sequences of incomprehensible, say, English words; as we shall state subsequently, such sequences are merely unlikely. We focus on the theoretical and empirical average measures pertaining to blocks of binary images. In light of that, the sample we constructed is representative of the most frequently occurring binary image blocks.

The remainder of the article is organized as follows. In Section II, we briefly describe related work. In Section III, we present the proposed compression method in detail: we provide the basic theoretical background and definitions pertaining to the context of this work, and we then describe the construction of a fixed-to-variable codebook per the motivation described above, as well as the row-column reduction coding, an algorithm that removes additional redundancy in 8×8 blocks. Empirical results are presented in Section IV, categorized according to the related areas of application, with compression results based on the two general encoding schemes that we developed for the codebook. Conclusions and future work follow in Section V.

II. RELATED WORK

Central to the method proposed in this work is the idea of partitioning a binary image, or the bi-level layers of a discrete-color image, into non-overlapping 8×8 blocks. Partitioning binary images into blocks and encoding them, referred to as block coding, is summarized in [2], wherein images are divided into blocks of totally white (0-valued) pixels and non-white (1-valued) pixels. The former blocks are coded by a single bit equal to 0, whereas the latter are coded with a bit value equal to 1 followed by the content of the block in row-wise order. Moreover, the hierarchical variant of block coding relies on dividing a binary image into b×b blocks (typically 16×16), which are then represented in a quad-tree structure. In this case, a 0-valued b×b block is encoded using bit 0, whereas other blocks are coded with bit 1 followed by recursively encoded sub-blocks of pixels, with the base case being a single pixel.


In [3], [4], it is suggested that block coding can be improved by resorting to Huffman coding or by employing context-based models within larger blocks. In [5], a hybrid compression method based on hierarchical block coding is proposed. Here, predictive modeling is employed to construct an error image as the difference between the predicted and original pixel values; the error image is then compressed using Huffman coding of bit patterns at the lowest hierarchical level. This work builds upon the ideas presented in [6], [7], wherein block coding with arithmetic coding is employed.

In the context of discrete-color images, lossless compression methods are generally classified into two categories: (i) methods applied directly to the image, such as the graphics interchange format (GIF) or portable network graphics (PNG); and (ii) methods applied to every layer extracted (or separated) from the image, such as TIFF-G4 and JBIG. In this research, we focus on the second category. Previous work in the literature amounts to several lossless compression methods for map images based on layer separation. The standard JBIG2, which is specifically designed to compress bi-level data, employs context-based modeling along with arithmetic coding to compress binary layers. In [8], a lossless compression technique based on semantic binary layers is proposed; each binary layer is compressed using context-based statistical modeling and arithmetic coding, which is slightly different from the standard JBIG2. In [9], a method that utilizes the interlayer correlation between color-separated layers is proposed, with context-based modeling and arithmetic coding used to compress each layer. An extension of this method applied to layers separated into bit planes is given in [10]. Moreover, other lossless compression standards, such as GIF, PNG, and JPEG-LS, are applied directly to the color image.

III. THE PROPOSED METHOD

The purpose of this section is to explain the analytical details and components of the proposed compression method. The method rests on a combination of theoretical and empirical arguments, and each of the following subsections provides the reader with these arguments as well as with the theoretical context pertaining to the proposed lossless compression algorithm.

A. The Codebook Model


The proposed method operates on a fixed-to-variable codebook, wherein the fixed part consists of 8×8 blocks and the variable part comprises the codes corresponding to those blocks. In order to devise an efficient and practical codebook, we performed a frequency analysis on a sample of more than 250,000 8×8 blocks obtained by partitioning 120 randomly chosen binary data samples. By studying the natural occurrence of 8×8 blocks in a relatively large binary data sample, the Law of Large Numbers motivates us to devise a general (empirical) probability distribution of such blocks. In principle, this could be used to construct a universal static model, which can be employed for compressing efficiently (on average) all sorts of bi-level images. The images have different sizes and vary from complex topological shapes, such as fingerprints and natural sceneries, to bounded curves and filled regions. The candidates were extracted from different sources, mainly randomly browsed web pages and public-domain databases. Because these representative images are widely available, it is reasonable to deduce that they are more likely to be considered for compression. Also, the main criterion in constructing an unbiased data sample was that the images should convey the clear meaning they were constructed to convey, without unintentional salt-and-pepper noise. The image dimensions vary from 149x96 to 900x611 bits, yielding approximately 250,000 8×8 blocks. This image set is different from the images used to test JBIG2, for example.

Before proceeding to the frequency analysis, we preprocessed the binary images in two phases. The first phase consisted of trimming the margins (or the image frames) in order to avoid biasing the distribution of 0-valued or 1-valued 8×8 blocks. In the second phase, we modified the image dimensions to make them divisible by 8 so as to obtain an integral number of 8×8 blocks. Trimming images in order to remove the redundant background frame is an important step of the first preprocessing phase. Preserving such frames would increase the relative probability of 8×8 blocks consisting entirely of zeros or ones if the background is white or black, respectively. Consequently, such blocks would create a skewed distribution of blocks in the codebook. It has been reported that Huffman codes do not perform as efficiently as arithmetic codes with such a distribution of symbols unless the probabilities can be redefined [11], [12].

An additional step in the preprocessing phase consists of making the image height and width divisible by 8. Let w and h denote the width and height of an image. We convert w and h to w* and h* such that 8 | w* and 8 | h* (a short code sketch of this step follows the list):

• If h mod 8 ≠ 0, then h* = h + 8 − (h mod 8);




• If w mod 8 ≠ 0, then w* = w + 8 − (w mod 8).
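A minimal sketch of this padding step, assuming the image is held as a 2-D NumPy array of 0/1 values (the function and argument names are illustrative, not part of the published implementation):

```python
import numpy as np

def pad_to_multiple_of_8(img, background=0):
    """Pad a binary image so that both dimensions become divisible by 8.

    img:        2-D NumPy array of 0/1 pixel values
    background: the bit used to fill the appended rows and columns
                (0 for a white background, 1 for a black one)
    """
    h, w = img.shape
    h_star = h if h % 8 == 0 else h + 8 - h % 8   # h* = h + 8 - (h mod 8)
    w_star = w if w % 8 == 0 else w + 8 - w % 8   # w* = w + 8 - (w mod 8)
    padded = np.full((h_star, w_star), background, dtype=img.dtype)
    padded[:h, :w] = img
    return padded
```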

Thus, the new image dimensions are w* × h*. The newly padded entries are filled with the respective background bit; if the background color of the binary image is white (conventionally represented by 0), the new entries are filled with 0-valued bits. Having gone through the two aforementioned phases, we performed the frequency analysis on the 250,000 8×8 blocks and used the resulting relative probabilities to construct the codebook model. In terms of the coding paradigm, we resorted to both Huffman and arithmetic coding, as will be illustrated below.

From an information-theoretic standpoint, we consider the images in the data sample described above to have been generated by the hypothetical source of Section I. As such, the set of 8×8 blocks may be characterized as a discrete stochastic process defined over a very large discrete alphabet of size 2^64, whose symbols represent all possible patterns of zeros and ones in an 8×8 block. Essentially, one can study the distribution of 8×8 blocks for a relatively large data sample. It is, however, not possible to estimate empirical probabilities for all 2^64 8×8 blocks, and it is certainly not feasible or time-efficient to construct a codebook containing all possible blocks and their corresponding Huffman codes. Therefore, one should consider devising a codebook comprising the most frequently occurring blocks.

In general, as the block dimensions increase, the waiting probability of observing all possible blocks becomes smaller, because the size of the alphabet increases exponentially. Thus, it is reasonable to ask for the expected number of trial samples required to observe all possible 8×8 blocks. The problem of determining the waiting probability of observing a particular number of blocks and the expected number of samples needed may be viewed as an instance of the more general Coupon Collector's Problem, which is elegantly posed in [13] and is summarized as follows. Let S denote the set of coupons (or items, in general), with cardinality |S| = N. Coupons are collected with replacement from S. The problem then poses two questions: (i) What is the probability of waiting more than n trials in order to observe all N coupons? (ii) What is the expected number of such trials? Let M denote the set of observed coupons and T be a random variable counting the trials. More accurately, M is a multiset of coupons, because coupons are drawn from S with replacement. To answer the first question, it is more convenient to consider the probability of collecting more than n coupons, i.e., P(T > n).


The required probability P(T = n) is easily derived as P(T = n) = P(T > n − 1) − P(T > n). The following model gives P(T > n):

P(T > n) = \sum_{i=1}^{N} (-1)^{i+1} \binom{N}{i} \left( \frac{N - i}{N} \right)^{n} .    (1)

To determine the expected number of trials, E[T], required to collect n coupons, we use the model E[T] = n H_n, where H_n is the n-th harmonic number. Based on the asymptotic expansion of harmonic numbers, the following asymptotic approximation for E[T] holds:

E[T] = n \ln n + \gamma n + \frac{1}{2} + o(1) \quad \text{as } n \to \infty ,    (2)

where γ ≈ 0.577 is the Euler-Mascheroni constant. If we have n = 2^64 coupons, then the expected number of trials given by equation (2) is approximately 8.29 × 10^20, which is a practically unattainable number of trials. In the context of this research, we consider the "coupons" to be the 8×8 blocks, for a total of 2^64 symbols. Hence, we are interested in determining the number of blocks we must collect from the dynamical system in order to have observed all possible blocks. The probability in equation (1) is difficult to compute for all possible 8×8 blocks. However, we may resort to the asymptotic approximation of the expected number of trials required to observe all blocks using the following formula:

E[T] = 2^{64} \ln 2^{64} + 0.58 \cdot 2^{64} .    (3)

The result in equation (3) implies that we would need to compile a practically huge number of samples in order to attain a complete set of 8×8 blocks and to estimate, in turn, all relative probabilities. Based on the latter remark, the only way to reduce the number of samples needed would obviously be to reduce the block dimensions from 8×8 to, say, 4×4, so that the alphabet size reduces to 2^16. We did experiment with blocks of smaller dimensions in order to decrease the alphabet size; however, the efficiency of Huffman codes for smaller block dimensions decreased. For instance, for 2×2 blocks the expected number of samples needed to observe all possible blocks is approximately equal to 55. Also, the waiting probability given in formula (1) tends to an arbitrarily small value as the number of samples grows; for example, P(T = 100) = 0.0017, which implies that all 2×2 blocks will almost certainly be observed. For 4×4 blocks, on the other hand, the expected number of samples needed would be approximately equal to 764,647. Yet, this number would impose difficult computations in determining both the relative probabilities and the Huffman codes.
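The figures quoted above can be checked with a few lines of code; the sketch below assumes equiprobable blocks and uses the exact expectation E[T] = N·H_N for small alphabets and the asymptotic form (2) for the 8×8 case:

```python
import math

def expected_trials_exact(N):
    """Exact coupon-collector expectation E[T] = N * H_N for N equiprobable blocks."""
    return N * sum(1.0 / i for i in range(1, N + 1))

def expected_trials_asymptotic(N):
    """Asymptotic form of equation (2): N ln N + gamma*N + 1/2."""
    gamma = 0.5772156649
    return N * math.log(N) + gamma * N + 0.5

def waiting_probability(N, n):
    """P(T > n) of equation (1), computed by inclusion-exclusion."""
    return sum((-1) ** (i + 1) * math.comb(N, i) * ((N - i) / N) ** n
               for i in range(1, N + 1))

print(expected_trials_exact(2 ** 4))        # 2x2 blocks (16 symbols): ~54 trials
print(expected_trials_exact(2 ** 16))       # 4x4 blocks: ~7.6e5 trials
print(expected_trials_asymptotic(2 ** 64))  # 8x8 blocks: ~8.3e20 trials
```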


The fact that decreasing the block dimensions decreases the maximum compression ratio can be explained by the following observation. In general, suppose the entropy for n×n blocks is H. Then the attainable code rate is bounded below by H/n^2 bits per pixel; i.e., the number of bits required per pixel to encode an n×n block is inversely proportional to the square of the block dimension. Thus, the attainable compression per block increases as larger block dimensions are considered, and decreases otherwise. For example, the entropy of the system comprising all 65,536 4×4 blocks was observed to be equal to 2.12 bits per block, yielding a satisfactory compression limit of 86.74%. However, this would only be the case if Huffman codes could be derived for all such blocks, and constructing Huffman codes for such a large cardinality is practically inefficient. Therefore, one needs to consider an empirical balance between block dimensions, entropy, and codes. Based on such considerations, we decided to study the empirical distribution of 8×8 blocks.

Table 1 shows various candidate block dimensions along with the resulting entropy values and the expected sample sizes. An increase in entropy is expected as the block dimensions increase because the alphabet size becomes larger, thus increasing the number of possible states. The theoretical maximum compression ratio (in percentage), however, increases with the square of the block dimension, as stated above. At this point, one may conclude that using 8×8 blocks constitutes a better choice than considering 2×2, 4×4, or other blocks with dimensions smaller than 8×8. After all, even for 4×4 blocks, the number of samples required to observe all such blocks and to deduce a more reliable empirical probability distribution is practically unattainable and offers no promising compression ratio. On the other hand, the expected number of samples needed to observe all blocks increases exponentially, as can be noticed from the last column of Table 1. Despite the increase in entropy, the alphabet size imposes an empirical limit on selecting blocks larger than 8×8. Also, the probability that blocks not in the codebook will be compressed by the row-column reduction coding decreases with an increase in vector size, as shall be observed in Section III-B. In all, our choice of 8×8 blocks is based on these constraints and would probably be no different from selecting 7×7 or 9×9 blocks, except for some decrease or increase in the theoretical compression ratio and the feasibility of handling Huffman or arithmetic coding.
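The bound discussed above is straightforward to evaluate; a small sketch relating block entropy to the maximum attainable compression percentage (the printed values follow from the entropies reported in Table 1):

```python
def max_compression_percent(entropy_bits_per_block, n):
    """Upper bound on compression for n x n blocks whose entropy is H bits per block.

    The minimum code rate is H / n**2 bits per pixel, so the maximum saving
    relative to 1 bit per pixel is 1 - H / n**2.
    """
    return 100.0 * (1.0 - entropy_bits_per_block / (n * n))

print(max_compression_percent(2.12, 4))  # ~86.7% for 4x4 blocks
print(max_compression_percent(4.09, 8))  # ~93.6% for 8x8 blocks
```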


Table 1: Effect of block dimensions on entropy and the expected sample size

Block    Alphabet Size    Entropy    % Max. Compression    E[T]
2×2      16               1.36       66.00                 55
4×4      65,536           2.12       86.74                 764,647
8×8      2^64             4.09       93.60                 ≈8.3×10^20
12×12    2^144            7.89       94.52                 ≈2.2×10^45
16×16    2^256            9.72       96.20                 ≈2.1×10^79

Note that the entropy values are extrapolated from the analysis we carried out on 4×4 and 8×8 blocks. These approximate values should suffice to give a general idea of how entropy and compression ratio vary with block dimensions. In the data sample of 120 images, we identified a total of 65,534 unique blocks. From this total, we selected the 6952 blocks that occurred two times or more and discarded all other blocks with an absolute frequency equal to 1. Thus, the cardinality of the fixed-to-variable codebook equals 6952 entries. This is a small number compared to the total of 2^64 possible blocks. Since the codebook contains such a small fraction of 8×8 blocks, and since we do not know the theoretical probability distribution of blocks, it is reasonable to provide an estimate of the error between the theoretical and the empirical average code lengths (or entropies). The observed average code length, L, is given by

L = \sum_{i=1}^{N} q_i \log_2 \frac{1}{q_i} ,    (4)

where q_i is the empirical distribution and N is the number of blocks. Similarly, we define the theoretical average code length for the theoretical probabilities p_i:

H = \sum_{i=1}^{N} p_i \log_2 \frac{1}{p_i} .    (5)

Then, we examine the model

E = L - H = \sum_{i=1}^{N} \left( q_i \log_2 \frac{1}{q_i} - p_i \log_2 \frac{1}{p_i} \right) .    (6)

Define the discrepancy ε_i between the empirical and the theoretical probabilities as

\varepsilon_i = q_i - p_i    (7)


for i = 1, 2, ..., N. We expand asymptotically in ε_i to derive an error approximation of the average code length. Based on the Law of Large Numbers, we should have ε_i → 0 (or tending to some arbitrarily small value). First, observe that L(q_i) = H(p_i + ε_i) and, based on (7), E = H(p_i + ε_i) − H(p_i).

Therefore, we focus on the asymptotic expansion of the function H(p_i) = −p_i log_2 p_i. Its first derivative is

H'(p_i) = -\frac{1}{\ln 2} - \log_2 p_i .    (8)

We look for an expression of the following form:

\left[ -(p_i + \varepsilon_i) \log_2 (p_i + \varepsilon_i) - H(p_i) \right] \sim H'(p_i) \cdot \varepsilon_i \quad \text{as } \varepsilon_i \to 0 ,    (9)

where the left-hand side of the relation is a representative form of the error function E. Note that expression (9) is the discrete version of a first-order Taylor series. For fixed N and p_i, the following series is constructed:

H(p_i + \varepsilon_i) = \sum_{n=0}^{\infty} \frac{H^{(n)}(p_i)}{n!} (\varepsilon_i)^n .    (10)

The dominant term of the series in equation (10) is the function H(p_i). Subtracting this function from H(p_i + ε_i) yields the following series:

H(p_i + \varepsilon_i) - H(p_i) = \sum_{n=1}^{\infty} \frac{H^{(n)}(p_i)}{n!} (\varepsilon_i)^n .    (11)

The left-hand side of equation (11) is the error function E defined in (6). Summing over all N terms, we get the following expression:

E \sim \sum_{i=1}^{N} \left[ \sum_{n=1}^{\infty} \frac{H^{(n)}(p_i)}{n!} (\varepsilon_i)^n \right] \quad \text{as } \varepsilon_i \to 0 .    (12)


For most purposes, a second-order Taylor expansion is appropriate to provide useful insight into the error. We may write (12) using Landau notation as follows:

E = \sum_{i=1}^{N} \left[ \sum_{n=1}^{\infty} \frac{H^{(n)}(p_i)}{n!} \varepsilon_i^{\,n} + o(\varepsilon_i^{\,n}) \right] \quad \text{as } \varepsilon_i \to 0 .    (13)

The second-order asymptotic expansion of E based on the series in (11) would thus be

E = \sum_{i=1}^{N} \left[ -\varepsilon_i \left( \frac{1}{\ln 2} + \log_2 p_i \right) - \frac{\varepsilon_i^{\,2}}{2 p_i \ln 2} + o(\varepsilon_i^{\,2}) \right] \quad \text{as } \varepsilon_i \to 0 .    (14)

The asymptotic approximation in (14) implies that discrepancies between the theoretical and empirical average code lengths are negligible as ε_i → 0. However, in our case we included only 6952 blocks in the codebook. If we let N = 6952 in (14), we have to add an additional error term for all other possible blocks not included in the codebook. Practically, we take q_i = 0 for all i > 6952. Based on (6), the additional error term would thus be equal to

- \sum_{i=6953}^{2^{64}} p_i \log_2 p_i .

In our view, these theoretical probabilities are very small and have a minor effect on the code-length error. This fact is, however, one of the motivations that incited us to develop the additional coding module, the row-column reduction coding, as will be illustrated in the next section.

As mentioned before, the constructed codebook is a set of pairs (b, HC(b)), where b is an 8×8 block and HC(b) is the Huffman code of b. The Huffman code length L(b) varies from 1 bit to 17 bits, while the average code length is 4.094 bits, which is greater than the codebook entropy value of 4.084 bits per block. The difference between the average code length and the entropy, which in our case equals 0.01, is referred to as the redundancy. In percentage terms, the redundancy is found to be 0.252% of the entropy. This means that, on average, we need 0.252% more bits than the minimum required in order to code the sequence of blocks in the codebook. In compliance with the asymptotic expansion of the error given in (14), this value, too, indicates only a minor excess in coding length and accounts for a near-optimal encoding of blocks by means of the codebook.
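The quantities above (average code length, entropy, and redundancy) can be recomputed from the codebook frequencies with a short routine; a sketch assuming hypothetical dictionaries `freq` (block → absolute frequency) and `code_len` (block → Huffman code length in bits):

```python
import math

def codebook_statistics(freq, code_len):
    """Return (average Huffman code length, empirical entropy, redundancy).

    freq:     dict mapping each 8x8 block to its absolute frequency in the sample
    code_len: dict mapping each 8x8 block to the length of its Huffman code
    """
    total = sum(freq.values())
    q = {b: f / total for b, f in freq.items()}                # empirical probabilities q_i
    avg_len = sum(q[b] * code_len[b] for b in q)                # average code length (bits/block)
    entropy = sum(p * math.log2(1.0 / p) for p in q.values())   # entropy of the block distribution
    return avg_len, entropy, avg_len - entropy                  # redundancy = excess over entropy
```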


In the context of the modeling and coding paradigm, the constructed codebook acts as the static modeling portion of the proposed compression method. In static modeling, statistics are collected for most or all alphabet symbols in order to construct representative codes. While static modeling reduces the complexity of the decoder [14], [15], it is not widely used in practice because sample data may not always be representative of all data [16]. In this section, however, we have presented a way to construct efficient and representative codes for the most frequently occurring 8×8 blocks of binary images. The analysis of the constructed codebook suggests a small lower bound and a small asymptotic upper bound on the discrepancy between theoretical and empirical code lengths. Moreover, having established a compression model based on a fixed-to-variable codebook, we selected Huffman and arithmetic coding to implement the coder. The former has been presented in this section, whereas the latter will be discussed when the coding schemes are illustrated. As a final note to this section, to calculate the frequencies of all unique 8×8 blocks observed in the data sample, the program we designed executed for approximately 500 hours on an Intel Dual Core machine at 1.6 GHz per processor with 2.4 GB of RAM.

B. The Row-Column Reduction Coding

The codebook component of the proposed method is efficacious in compressing the 6952 blocks it contains. These blocks, as seen in the previous section, are the most frequently occurring symbols as per the empirical distribution. Compared to the alphabet size of 2^64, the cardinality of the codebook is very small. Hence, there will be blocks from input images that cannot be compressed via the codebook. For that purpose, we designed the row-column reduction coding (RCRC) to compress the 8×8 blocks of a binary matrix V that are not in the codebook C. In this section, we illustrate how the algorithm works.

The RCRC is an iterative algorithm that removes redundancy between the row vectors and column vectors of a block and proceeds as follows. For each 8×8 block b, b ∈ V, b ∉ C, RCRC generates a row reference vector (RRV), denoted r, and a column reference vector (CRV), denoted c. The vectors r and c may be viewed as 8-tuples whose entries take values r_i ∈ {0, 1}, c_i ∈ {0, 1}, i ∈ {1, ..., 8}. These vectors are iteratively constructed by comparing pairs of row or column vectors of the block b. If the rows or columns in a given pair are identical, then the first vector in the pair eliminates the second vector, thus


reducing the block. If the two vectors are not identical, then both are preserved. The eliminations or preservations of rows and columns are stored in the RRV and CRV, respectively, which are constructed in a similar way. The iterative construction procedure is as follows.

Let b_{i,j} denote row i of block b, for j = 1, 2, ..., 8. RCRC compares rows in pairs, starting with the first two row vectors in the block, (b_{1,j}, b_{2,j}). If b_{1,j} = b_{2,j}, we assign a value of 1 to r_1 and a value of 0 to r_2, and we eliminate row b_{2,j} from block b. Next, b_{1,j} is compared to b_{3,j}; if they are equal, a value of 0 is stored in r_3, implying that row b_{3,j} is eliminated from the block. If, however, b_{1,j} ≠ b_{3,j}, then a value of 1 is assigned to r_3, implying that the third row of block b is preserved. Thereafter, RCRC creates the new pair of rows (b_{3,j}, b_{4,j}), and the comparison proceeds with those two rows as above. This procedure iterates until RCRC compares the pair that contains the last row of block b. By convention, r_i = 1 means that the i-th row of block b has been preserved, whereas r_i = 0 denotes an eliminated row. Clearly, r_1 and c_1 always take on a value of 1.

The result of these RCRC operations is a row-reduced block. Next, RCRC constructs the column reference vector based on the row-reduced block; there will be at most 7 pairs of column vectors to compare, and each comparison involves only the rows preserved in the previous step. The CRV is constructed in the same way as the RRV, through the procedure illustrated above. The final result is a row-column-reduced block. To explain this process, consider Figure 1 below, in which the row-reduction operation is applied to a selected block. The row reference vector (RRV) is shown on the left of the block. In this case, the first row is identical to the second row, which is removed from the block. Therefore, a value of '1' is placed in the first location of the RRV (for the first row), and a value of '0' is placed in the second RRV location for the second (eliminated) row. Finally, a value of '1' is placed in the third RRV location for the third row, and a value of '0' is placed in the corresponding RRV locations of rows 4 to 8, which are eliminated since they are identical to row 3. The column-reduction operation is then applied to the row-reduced block, as depicted in Figure 2. The column reference vector (CRV) is shown on top of the block. In this case, the first column is identical to


columns 2 to 6 and eliminates them. Also, column 7 eliminates column 8. This results in the reduced block, RB, shown to the right of the column-reduced block. For this example, the output of RCRC is a concatenated string composed of the RRV (the first group of 8 bits), the CRV (the second group of 8 bits), and the RB (the last 4 bits), displayed as the vector 10100000100000101011, for a total of 20 bits. The compression ratio achieved for this block is (64 − 20)/64 = 68.75%.

[Figure 1 shows the 8×8 input block together with its row reference vector (RRV = 10100000) and the resulting 2×8 row-reduced block.]

Figure 1. The row-reduction operation applied on a block

[Figure 2 shows the row-reduced block with its column reference vector (CRV = 10000010) and the resulting 2×2 reduced block.]

Figure 2. The column-reduction operation applied on the row-reduced block in Figure 1
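A minimal sketch of the RCRC encoder, assuming the block is an 8×8 NumPy array of 0/1 values (the helper names are illustrative); fed the block of Figure 1, it reproduces the 20-bit string of the example above:

```python
import numpy as np

def _reduce(vectors):
    """Compare each vector with the most recently preserved one.

    Returns the reference vector (8 flags) and the indices of preserved vectors.
    """
    flags, kept, last = [1], [0], 0      # the first vector is always preserved
    for i in range(1, len(vectors)):
        if np.array_equal(vectors[i], vectors[last]):
            flags.append(0)              # identical: eliminate vector i
        else:
            flags.append(1)              # different: preserve vector i
            kept.append(i)
            last = i
    return flags, kept

def rcrc_encode(block):
    """Row-column reduction coding of an 8x8 binary block; returns a bit string."""
    rrv, kept_rows = _reduce([block[i, :] for i in range(8)])
    row_reduced = block[kept_rows, :]
    crv, kept_cols = _reduce([row_reduced[:, j] for j in range(8)])
    reduced = row_reduced[:, kept_cols]
    return (''.join(map(str, rrv)) + ''.join(map(str, crv))
            + ''.join(str(int(v)) for v in reduced.flatten()))

# Block of Figure 1: rows 1-2 equal to 11111100, rows 3-8 all ones.
blk = np.ones((8, 8), dtype=int)
blk[0:2, 6:8] = 0
print(rcrc_encode(blk))   # -> 10100000100000101011 (20 bits)
```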

To further clarify the decoding process, we consider the row-column-reduced block of the preceding example. The number of ones in the row reference and column reference vectors gives the number of rows and columns of the reduced block, respectively. The output 10100000100000101011 contains two ones in the first group of 8 bits (the RRV) and two ones in the second group of 8 bits (the CRV). This means that there are 2 rows and 2 columns in the reduced block, whose bits are the last four bits, 1011: the first two bits, '10', represent the first reduced row, and the second two bits, '11', represent the second reduced row. Then, given the ones and zeros in the reference vectors, we reconstruct the rows and columns of the original block. Figure 3 shows the column reconstruction based on the column reference vector.

[Figure 3 shows the 2×2 reduced block expanded to the 2×8 row-reduced block using the CRV.]

Figure 3. Column reconstruction based on the column-reference vector (CRV)


In Figure 3, the CRV tells the decoder that columns 2 to 6 are exact copies of column 1, and column 8 is an exact copy of column 7. The block on the right depicts this operation. Figure 4 shows the row reconstruction process that yields the original block. If the output of the row-column reduction routine is larger than 64 bits, we keep the original block; such cases represent less than 5% of the total block count in the test samples. Moreover, the minimum number of bits that can be produced by RCRC is 17 bits, for a maximum compression ratio of 73.44%; this occurs when all 7 comparable rows and 7 comparable columns of the block are eliminated, which leads to a 1-bit reduced block.
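Decoding reverses the two expansions; a sketch consistent with the encoder given earlier (again assuming 8×8 blocks and the bit-string layout RRV | CRV | reduced block):

```python
import numpy as np

def rcrc_decode(bits):
    """Rebuild an 8x8 block from an RCRC bit string (RRV, CRV, reduced block)."""
    rrv = [int(b) for b in bits[0:8]]
    crv = [int(b) for b in bits[8:16]]
    n_rows, n_cols = sum(rrv), sum(crv)
    reduced = np.array([int(b) for b in bits[16:16 + n_rows * n_cols]],
                       dtype=int).reshape(n_rows, n_cols)

    # Column expansion: an eliminated column copies the nearest preserved column to its left.
    row_reduced = np.zeros((n_rows, 8), dtype=int)
    k = -1
    for j in range(8):
        if crv[j] == 1:
            k += 1
        row_reduced[:, j] = reduced[:, k]

    # Row expansion: an eliminated row copies the nearest preserved row above it.
    block = np.zeros((8, 8), dtype=int)
    k = -1
    for i in range(8):
        if rrv[i] == 1:
            k += 1
        block[i, :] = row_reduced[k, :]
    return block
```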

[Figure 4 shows the 2×8 row-reduced block expanded back to the full 8×8 decoded block using the RRV.]

Figure 4. Row reconstruction based on the row-reference vector (RRV)

The schematic diagram shown in Figure 5 illustrates the operation of the proposed compression scheme. Each map image is padded in both dimensions to become divisible by 8. Then, layers are extracted through color separation, yielding a set of bi-level matrices. We partition each layer into 8×8 blocks, and each 8×8 block of the original data is searched for in the codebook. If it is found, the corresponding Huffman code is selected and added to the compressed data stream. If it is not found, the row-column reduction coding attempts to compress the block. If the output of RCRC is smaller than 64 bits, we add the reduced block to the compressed data stream; otherwise, we keep the original block.

[Figure 5 depicts the flow: input image → color separation into bi-level layers → partition into 8×8 blocks → codebook lookup (use the code of block Bk if found) → otherwise RCRC (use the reduced block if its output is under 64 bits) → otherwise keep the original block → compressed data stream.]

Figure 5. Diagram of the proposed compression technique
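The per-block dispatch of Figure 5 can be sketched as follows, assuming a hypothetical dict `codebook` that maps a block's 64-pixel pattern (as a bytes key) to its Huffman code string and the `rcrc_encode` sketch given earlier; the flag bits that let the decoder tell the three cases apart are deliberately omitted here and are the subject of Section III-D:

```python
def compress_layer(layer, codebook):
    """Encode one padded bi-level layer block by block (conceptual sketch only).

    layer:    2-D array of 0/1 values whose dimensions are divisible by 8
    codebook: dict mapping the 64-pixel pattern of a block (bytes) to a Huffman code string
    """
    stream = []
    h, w = layer.shape
    for r in range(0, h, 8):
        for c in range(0, w, 8):
            block = layer[r:r + 8, c:c + 8]
            key = bytes(int(v) for v in block.flatten())
            if key in codebook:
                stream.append(codebook[key])                 # found in the codebook
            else:
                reduced = rcrc_encode(block)
                if len(reduced) < 64:
                    stream.append(reduced)                   # compressed by RCRC
                else:                                        # incompressible block
                    stream.append(''.join(str(int(v)) for v in block.flatten()))
    return stream
```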


C. Complexity

Here we give an analytical time-complexity analysis of the proposed method. Let h and w be the dimensions of an input binary image matrix. For each 8×8 block, the algorithm performs a binary search on the codebook for a matching block. If a match is detected, the block is compressed and the next 8×8 block is processed. The codebook has a fixed size of 6952 entries; therefore, the search has constant complexity. If a match is not found in the codebook, RCRC will attempt to compress the block. In the context of the proposed method, the RCRC input is of fixed size and has constant complexity, too. The codebook search and RCRC are executed for at most wh/64 8×8 blocks. Thus, the total complexity of the proposed method is O(hw). In Section III-A, we noted that codebook entries are sorted in descending order of their empirical probabilities; while this fact does not affect the analytical time complexity, we observed that it had some impact on the empirical running time of the proposed method.

A general complexity argument may be provided for the hypothetical case in which a variable-length codebook and blocks of variable dimensions are considered. Suppose the codebook has M entries and the blocks are of size n×n. For the codebook component, a binary search is performed on the blocks, wherein each codebook block is compared with the input block. A block comparison has complexity O(n^2) and the binary search is O(log M); thus, the total complexity of the codebook component would be O(n^2 log M). The RCRC complexity is O(n^2), since for each of the n rows, n block entries are compared (and similarly for the n columns). These operations are performed for wh/n^2 blocks of size n×n. The total number of operations is therefore at most (1/n^2)·wh·(n^2 log M + n^2); thus, the total complexity of this hypothetical case is O(wh log M). It is interesting to observe that the hypothetical time complexity of the proposed method does not explicitly depend on the block dimensions, in general, but on the codebook size and the input image dimensions.

D. The Coding Schemes


Here we present two coding schemes fine-tuned to the proposed codebook model and the RCRC algorithm. In order to distinguish blocks compressed by the codebook, blocks compressed by RCRC, and uncompressed blocks, we consider the following cases.

The first scheme:

Case 1: If the block is found in the codebook, then use the corresponding Huffman code. We consider two sub-cases.
- Case 1a: If the corresponding Huffman code is the shortest in the codebook (i.e., 1 bit in the case of our codebook), then assign '11' to encode that block.
- Case 1b: If the corresponding Huffman code has a length greater than 1 bit, assign '00' to encode the block. As the lengths of the Huffman codes in our codebook vary from 1 bit to 17 bits, we use an additional 5 bits, immediately after the '00' bits, to encode the length of the Huffman code. For instance, if a block found in the codebook has a Huffman code of length 7, then the block is encoded as 00 + 00111 + the Huffman code. In this case, the 5 bits (00111) tell the decoder that the Huffman code that follows immediately has a length of 7 bits and, thus, the decoder will read the next 7 bits. The reason we use two overhead bits for the block that has the shortest Huffman code in the codebook is the empirical result that this entry accounts for 73.74% of the total blocks found in the codebook.

Case 2: If the block is compressed by RCRC, we use '01' followed by the output bits of RCRC. The decoding process for RCRC is explained in Section III-B.

Case 3: If the block is neither found in the codebook nor compressed by RCRC, we use the two bits '10', after which the 64 bits of the original block data are appended.

The decoding process is straightforward and follows the same encoding process explained above. These cases lead to the following model. Let V be the set of 8×8 blocks b of some input image. Define the partitions C, R, and U of the set V respectively as the sets of blocks found in the codebook, blocks not found in the codebook but compressible by RCRC, and incompressible blocks. Let the function L_H(b) denote the length of the Huffman code of block b and the function L_R(b) denote the length of the compressed bit stream produced by RCRC. Let ξ be a random variable denoting the random event ξ = b_V, b_V ∈ V, and let P(ξ = b_V) = p(b_V) denote the probability of ξ. Based on a frequency approach, one can derive the relative probabilities p(b_C), p(b_R), and p(b_U) of blocks in C, R, and U, respectively. Then, the expected compression size in bits is given by the following model:

E_p[\xi] = p(b_C^1) \cdot 2 L(b_C^1)                            (Case 1a)
         + \sum_{i=2}^{|C|} p(b_C^i) \cdot [7 + L(b_C^i)]        (Case 1b)
         + \sum_{j=1}^{|R|} p(b_R^j) \cdot L(b_R^j)              (Case 2)
         + 66 \sum_{k=1}^{|U|} p(b_U^k) ,                        (Case 3)        (15)

where |·| denotes set cardinality and E_p[·] is the expectation operator. Here, p(b_C^1) is the relative probability of the block in C that has the shortest code. In the case of our codebook, the length L(b_C^1) = 1 bit, and the length of the block encoding for Case 1a is 2 bits. For the remaining blocks in C, the encoder uses 7 overhead bits on top of the code (Case 1b). For Case 2, the expected lengths of the RCRC bit streams are summed, while for Case 3, the encoder uses the uncompressed block (64 bits) plus 2 bits of overhead. Based on this model, it is easy to see that the expected compression rate is equal to E_p[ξ]/64 bits per pixel.
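The three cases translate into a simple per-block bit layout; a sketch assuming the Huffman code and the RCRC output are already available as bit strings (None when the corresponding case does not apply):

```python
def encode_block_scheme1(huffman_code, rcrc_bits, raw_bits):
    """Emit the bit string for one 8x8 block under the first coding scheme.

    huffman_code: Huffman code string if the block is in the codebook, else None
    rcrc_bits:    RCRC output string if RCRC compresses the block, else None
    raw_bits:     the 64 raw bits of the block as a string
    """
    if huffman_code is not None:
        if len(huffman_code) == 1:                          # Case 1a: shortest code
            return '11'
        # Case 1b: '00' + 5-bit length field + the Huffman code itself
        return '00' + format(len(huffman_code), '05b') + huffman_code
    if rcrc_bits is not None and len(rcrc_bits) < 64:       # Case 2
        return '01' + rcrc_bits
    return '10' + raw_bits                                  # Case 3: uncompressed
```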

Alternative Coding Scheme

One may question the amount of overhead bits used for encoding blocks. Specifically, if the 8×8 block with a Huffman code length of 1 bit occurs, say, 50% of the time on average, then encoding it with 2 bits (Case 1a) doubles the expected compression size for that block. Also, 7 overhead bits are used for the remaining Huffman codes in the codebook (Case 1b), whereas ideally the Huffman codes alone should be employed, as per their purpose. While this encoding is correct, it seems reasonable to pursue a more efficacious coding scheme for the proposed method, as explained subsequently.


The reason why the preceding coding scheme in equation (15) incurs a considerable amount of encoding overhead lies in the fact that RCRC interferes with the codebook coding. Consequently, the compressed bit stream contains a mixture of strings representing Huffman codes and strings representing the RCRC output per block, and the distinction between such encoded bits needs to be made explicit for the decoder to function correctly. The three cases covered above are sufficient and necessary to reconstruct the original binary image exactly, but they need to be fine-tuned in order to achieve greater encoding efficiency.

In light of how Huffman decoding works, we introduce two 'flag' signals for the alternative coding scheme of the proposed method. The first is the break-codebook-coding (BCC) signal, and the second is the incompressible-block (ICB) signal. These two flags are treated as members of the alphabet of 8×8 blocks and are added to the constructed codebook with probabilities p_BCC and p_ICB, and the Huffman algorithm assigns binary codes to both flags. The purpose of the BCC block is to mark the interruption of codebook encoding for an input 8×8 block and the commencement of the RCRC encoding for that block. If RCRC encodes the block in fewer than 64 bits, then the compressed bit stream for that block consists of the Huffman code of BCC followed by the RCRC output. If the block is incompressible, then its 64 bits are preceded by the Huffman code of ICB. Decoding is straightforward: if the decoder encounters the BCC code, it traverses the tree to decode the flag block BCC, which calls for an interruption of Huffman-tree decoding, and the decoder turns to RCRC decoding. If the ICB code is encountered, the flag block ICB informs the decoder to read the subsequent 64 bits of the compressed bit stream.

The probabilities p_BCC and p_ICB determine the Huffman codes of the two flag blocks. These values are assigned empirically, based on the average rate of RCRC compression and the average percentage of incompressible blocks over large data sets. For binary images, we observed that, on average, 93% of the blocks are compressed by the dictionary; of the remaining 7% of blocks, 5% are compressed by RCRC and 2% remain uncompressed.


The constructed codebook for this scheme therefore has 6954 entries. This alternative coding scheme has some apparent advantages over the scheme posited earlier. First, it eliminates Case 1a, as it does not require two overhead bits to encode the codebook block with the shortest Huffman code. Second, it eliminates the 7 overhead bits of Case 1b that are used for the remaining Huffman codes. We observed that the empirical occurrence of the blocks covered by Cases 1a and 1b was, on average, larger than that of the blocks compressed by RCRC and the incompressible blocks. Therefore, the alternative coding scheme is expected to yield better compression ratios for binary images, on average. In parallel with the model in equation (15), we provide a model for the alternative coding scheme:

E_p^*[\xi] = \sum_{b \in C,\, b \notin \{BCC,\, ICB\}} p(b_C) \cdot L(b_C)       (Case 1)
           + \sum_{b \in R} p(b_R) \cdot [L(BCC) + L(b)]                          (Case 2)
           + [L(ICB) + 64] \sum_{b \in U} p(b_U) ,                                (Case 3)        (16)

where E_p^*[·] is the expectation operator for the alternative coding scheme, and E_p^*[ξ]/64 gives the compression in bits per pixel. The coding models given by equations (15) and (16) will be compared when the empirical results are summarized in Section IV.
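Model (16) can be evaluated numerically once the block probabilities and code lengths have been estimated; the sketch below takes those quantities as inputs (all arguments are placeholders to be filled with measured values, not data from the paper):

```python
def expected_bits_per_block(cb_probs, cb_lens, rcrc_probs, rcrc_lens,
                            p_incompressible, len_bcc, len_icb):
    """Evaluate E*_p[xi] of equation (16), in bits per block.

    cb_probs, cb_lens:     probabilities and Huffman lengths of the codebook blocks
    rcrc_probs, rcrc_lens: probabilities and RCRC output lengths of the RCRC-coded blocks
    p_incompressible:      total probability of incompressible blocks
    len_bcc, len_icb:      Huffman code lengths of the BCC and ICB flag symbols
    """
    case1 = sum(p * l for p, l in zip(cb_probs, cb_lens))
    case2 = sum(p * (len_bcc + l) for p, l in zip(rcrc_probs, rcrc_lens))
    case3 = (len_icb + 64) * p_incompressible
    return case1 + case2 + case3   # divide by 64 to obtain bits per pixel
```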

E. Arithmetic Coding

The alternative encoding scheme given by equation (16) motivated us to use the codebook model along with the empirical probabilities of 8×8 blocks to compress via arithmetic coding. In this case, the codebook contains the 6954 blocks (including the two flag symbols discussed above) along with the lower and upper probability values.


Technically, we implemented an integer-based arithmetic coder [17]. Encoding and decoding work in the same way as explained in the preceding section.
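For illustration, a floating-point arithmetic encoder driven by per-symbol interval bounds is sketched below; the actual coder used in this work is integer-based [17], and this simplified version is usable only for short symbol sequences because of limited floating-point precision:

```python
def arithmetic_encode(blocks, intervals):
    """Return a number in [0, 1) that identifies the sequence of blocks.

    intervals: dict mapping each codebook symbol (block, BCC, or ICB) to its
               (lower, upper) cumulative probability bounds
    """
    low, high = 0.0, 1.0
    for b in blocks:
        span = high - low
        lo_p, hi_p = intervals[b]
        high = low + span * hi_p      # shrink the interval to the symbol's sub-range
        low = low + span * lo_p
    return (low + high) / 2.0         # any value inside [low, high) identifies the sequence
```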

IV. APPLICATIONS

In this section, we report empirical results of the proposed compression method on binary and discrete-color images. Comparisons with JBIG2 or other reported results are provided where applicable. The main reason why we compare the results for binary image compression only with JBIG2, despite the fact that both methods are lossless, is that the standard JBIG2 is viewed as a generic compression scheme in much the same way as we claim ours to be. Nevertheless, we note that for specific classes of binary and discrete-color images, ad hoc compression methods have been proposed and successfully implemented. The JBIG2 implementation we used is jbig2enc, which is available at http://www.cvisiontech.com/pdf/pdf-compression/jbig2-encoder4.html?lang=eng.

A. Binary Images

We tested the proposed method on a set of more than 100 binary images collected from different sources. These images were not part of the frequency-analysis process. This new image sample comprises various topological shapes, ranging from solid shapes to complex, less regular geometries. The empirical results presented here are classified into three categories: solid binary images, irregular geometries, and images that JBIG2 compresses more efficiently than the proposed method. These images are exhibited in Figure 6. In all three cases, a selected set of binary images is given along with the compression ratios of the proposed method using Huffman and arithmetic codes, and of JBIG2.

Table 2 displays the compression ratios for 15 solid binary images. In the case of Huffman codes, the alternative encoding scheme (presented in Section III-D) performs better than the first encoding scheme. On average, the proposed method outperforms the standard JBIG2 by approximately 3.06% when Huffman codes with the alternative encoding scheme are employed, and by 3.07% when arithmetic coding is employed. As stated in Section III-D, the alternative coding scheme is more efficacious than the first scheme; in all cases, the alternative encoding scheme, E_p^*, outperforms the other encoding scheme, E_p. Finally, Table 3 shows the percentage of blocks compressed by the dictionary, the percentage of blocks compressed by RCRC, and the portion of incompressible blocks. Observe that the portion of the latter is relatively small; this observation complies with the theoretical analysis of the codebook error.


Table 2: Empirical results for 15 selected binary images: solid shapes

Image    Dimensions   E_p     E_p^*   AC      JBIG2
059      200x329      88.99   93.72   94.58   88.58
071      545x393      90.72   93.75   94.22   89.82
074      203x247      86.9    92.66   93.16   87.68
075      790x480      92.78   96.5    96.25   94.8
076      245x226      86.24   92.75   93.41   86.82
077      450x295      88.47   95.65   95.38   94.83
079      245x158      85.35   91.42   91.96   84.91
080      491x449      91.86   95.71   95.86   92.17
081      245x248      89.2    92.84   93.3    86.69
082      491x526      92.21   96.33   96.08   94.31
083      354x260      88.93   95.29   95.48   92.24
085      167x405      86.9    92.55   93.62   87.7
086      335x500      91.12   95.88   95.62   94.97
087      447x459      89.89   96.2    95.73   93.86
090      350x357      86.68   95.02   94.8    92.18
Average               90.36   95.26   95.27   92.43

Figure 6: Binary images reported in the empirical results


Table 3: Percentage of blocks compressed by the codebook, RCRC, and incompressible blocks

Image    Codebook   RCRC    Incompressible
059      96.7       3.3     0
071      95.48      4.32    0.2
074      95.04      4.47    0.5
075      98.53      1.46    0.02
076      95.44      4.56    0
077      98.62      1.23    0.14
079      94.19      5.65    0.16
080      98.73      1.25    0.03
081      94.96      4.84    0.2
082      98.9       1.03    0.07
083      98.52      1.28    0.2
085      95.42      4.39    0.19
086      98.49      1.4     0.11
087      99.38      0.52    0.09
090      98.08      1.87    0.05

Table 4 shows empirical results for 15 less regular binary images; these images do not have geometries as solid as those given in Table 2. On average, the proposed method performed better than JBIG2 by 4.32% when Huffman codes with the alternative encoding scheme are used and by 5.62% when arithmetic coding is employed. Moreover, Table 5 shows the percentage of blocks compressed by the dictionary, the percentage of blocks compressed by RCRC, and the portion of incompressible blocks.

Table 4: Empirical results for 15 selected binary images: irregular shapes

Image    Dimensions   E_p     E_p^*   AC      JBIG2
004      512x800      93.25   95.99   95.73   94.69
005      1024x768     88.35   92.89   93.92   89.06
009      1061x1049    87.9    92.31   94.05   88.44
012      575x426      90.85   94.45   95.29   92.83
014      498x395      88.4    92.74   93.69   87.58
016      400x400      77.45   81.9    86.17   74
018      483x464      89.67   93.73   94.78   89.8
019      791x663      91.68   94.84   95.3    91.81
021      360x441      88.49   90.39   90.98   85.94
029      196x390      84.86   89.17   91.62   81.59
034      372x217      81.71   85.56   87.16   80.1
042      490x481      86.04   89.92   90.96   84.99
056      450x360      85.44   91.16   92.69   86.86
057      180x210      83.56   89.31   91.35   80.17
084      240x394      87.85   92.52   92.63   89.32
Average               88.44   92.45   93.6    88.62


Table 5: Percentage of blocks compressed by the codebook, RCRC, and incompressible blocks

Image     Codebook    RCRC     Incompressible
004       97.38       2.54     0.08
005       95.92       3.55     0.54
009       95.82       3.89     0.29
012       96.35       3.65     0
014       95.65       4.03     0.32
016       83.89       15.3     0.81
018       97.03       2.92     0.06
019       97.42       2.52     0.06
021       91.5        7.26     1.24
029       92.98       6.37     0.65
034       85.03       14.36    0.61
042       90.67       8.99     0.34
056       93.59       6.14     0.27
057       92.59       7.09     0.32
084       93.74       6.19     0.06

Table 6 summarizes results for the 8 binary images on which JBIG2 outperformed the Huffman coding component, the Arithmetic coding component, or both components of the proposed method. On average, JBIG2 scored 0.51% higher than the Huffman coding variant and 0.92% higher than the Arithmetic coding variant of the proposed method. Moreover, Table 7 shows the percentage of blocks compressed by the codebook, the percentage of blocks compressed by RCRC, and the portion of incompressible blocks.

Table 6: Empirical results for 8 selected binary images

Image     Dimensions    E_p      E_p^*    AC       JBIG2
008       2400x3000     92.56    97.59    96.98    98.34
022       315x394       84.38    89.66    93.32    91.94
028       2400x1920     94.37    97.73    97.47    98
064       640x439       94.59    96.77    96.93    96.84
088       1203x1200     91.12    95.88    95.3     95.94
094       1018x486      92.43    96.42    96.32    96.57
096       516x687       93.6     97.08    97.06    97.9
100       765x486       95.54    97.54    97.59    97.71
Average                 93.05    97.33    96.93    97.83


Table 7: Percentage of blocks compressed by the codebook, RCRC, and incompressible blocks

Image     Codebook    RCRC     Incompressible
008       99.81       0.18     0.01
022       93.05       6.85     0.1
028       99.65       0.34     0.01
064       98.14       1.77     0.09
088       97.49       2.34     0.17
094       98.49       1.45     0.06
096       99.21       0.79     0
100       99.18       0.8      0.02

In addition, Table 8 presents empirical results for six 512x512 binary images that consist mainly of lines rather than filled regions. The 8×8 blocks extracted from such images are adversarial cases for the proposed method because of their topological irregularity; in addition, JBIG2 has been reported to compress efficiently topological objects enclosed by borderlines. Even so, on average the proposed method performs better than JBIG2 by 2.42% for Huffman codes and by 2.48% for Arithmetic coding.

Table 8: Empirical results for six selected binary images

Image     E_p      E_p^*    AC       JBIG2
101       87.47    89.09    89.04    87.68
102       93.01    94.91    94.82    94.7
103       85.56    87.4     87.99    83.92
104       85.85    87.51    87.87    84.82
105       89.66    91.41    91.11    88.63
106       88.89    90.55    90.29    88.3
Average   88.41    90.14    90.19    88.01

B. Discrete-Color Images

In addition to binary images, we tested the proposed scheme on two sets of discrete-color images. The first set is a sample of topographic map images with four semantic layers, obtained from the GIS lab at the University of Northern British Columbia [17]. The semantic layers include contour lines, lakes, rivers, and roads, each in a different color. Figure 7 shows the layer separation of a topographic map from our test sample. In this case, the map image has four discrete colors (light blue, dark blue, red, and olive), and thus four layers are extracted. Each discrete color is coupled with the background color (in this case white) to form the bi-level layers.
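A minimal sketch of this layer extraction is given below, assuming the map is available as an RGB array; NumPy and the function and variable names are illustrative and are not the code used in this work.

import numpy as np

def split_into_binary_layers(rgb: np.ndarray, background=(255, 255, 255)):
    # Split an H x W x 3 discrete-color image into one bi-level layer per
    # non-background color: a pixel is 1 where that color occurs, 0 elsewhere.
    colors = {tuple(int(v) for v in c) for c in rgb.reshape(-1, 3)}
    colors.discard(tuple(background))
    layers = {}
    for color in colors:
        mask = np.all(rgb == np.array(color, dtype=rgb.dtype), axis=-1)
        layers[color] = mask.astype(np.uint8)
    return layers

# For the map in Figure 7 this yields four layers (light blue, dark blue,
# red, olive), each of which is then compressed as a binary image.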


Figure 7: Example of color separation: a topographic map of a part of British Columbia containing 4 different colors excluding the white background.

The second set of discrete-color images consists of graphs and charts, most of which were generated with Microsoft Excel and similar spreadsheet programs. Graphs and charts consist of discrete colors and are in practice limited to no more than 60 colors. Such data are used extensively in business reports, which are in turn published on the web or stored in internal organizational databases. As the size of such reports increases, lossless compression becomes imperative because it is cost-effective. For color separation, we wrote code that creates, for each color in the image, a separate plane containing the pixels of that color against a white background. These planes are then treated as binary images, and their compression results are combined to produce the result for the composite image.


Table 9 provides a summary description of the three map images used in this process. Table 10 shows the compression results, in bits per pixel (bpp), of the proposed method for these three map images. From the results we may conclude that the proposed method achieves high compression on the selected set of map images. Our bit rates are also better (lower) than those of JBIG2, which has been reported to compress map images at 0.18 to 0.22 bpp [18]: on average, our method achieves 0.035 bpp on maps, a result that is better than or comparable to those reported in [8]. Finally, results for charts and graphs are given in Table 11; in this case, the average compression ratio achieved is 0.03 bpp. The three topographic maps are exhibited in Figure 8.
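How the per-layer results are combined into the composite bits-per-pixel figures of Tables 10 and 11 can be illustrated as follows; the simple bookkeeping below is our reading of that step, and the numbers in the example are hypothetical.

def composite_bpp(layer_compressed_bytes, width, height):
    # Bits per pixel of the composite image, assuming the compressed sizes
    # of all bi-level layers are summed and divided by the pixel count of
    # the original discrete-color image.
    total_bits = 8 * sum(layer_compressed_bytes)
    return total_bits / (width * height)

# Hypothetical example: four layers compressed to roughly 2.0, 1.5, 1.0 and
# 0.8 KB for a 2000 x 2000 map tile
print(round(composite_bpp([2048, 1536, 1024, 819], 2000, 2000), 3))  # -> 0.011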

Table 9: Description of selected topographic map images, scale 1:20000

Map           1        2        3
Dimensions    2200     5776     5112
Size (KB)     1700     13056    11600
Total Size    406708

Table 10: Compression results for map images using the proposed method

Map      Compressed Size (KB)    Compression Ratio (bpp)    JBIG2 (bpp)
1        210.77                  0.019                      0.029
2        7489.27                 0.034                      0.052
3        6626.27                 0.038                      0.055
Total    14326.42                0.035                      0.053

Table 11: Compression results for charts and graphs using the proposed method

Chart      Size (KB)    Compression Ratio (bpp)    JBIG2 (bpp)
boa1       1281.33      0.014                      0.082
boa2       899.28       0.065                      0.063
boa3       865.97       0.065                      0.135
boa4       863.74       0.045                      0.077
boa4_1     378.33       0.057                      0.143
google1    607.67       0.021                      0.088
boa5       590.07       0.021                      0.077
boa6       1039.81      0.018                      0.053
boa7       590.07       0.033                      0.099
boa8       590.07       0.031                      0.089
boa9       773.49       0.033                      0.089
boa10      839.71       0.023                      0.079
boa11      590.07       0.022                      0.08
boa12      590.07       0.032                      0.088
boa13      590.07       0.032                      0.087
boa14      590.07       0.018                      0.055
boa15      590.07       0.028                      0.059
boa16      367.01       0.017                      0.126
boa16_1    590.07       0.034                      0.167
boa17      590.07       0.070                      0.103
boa18      1050.83      0.017                      0.072
boa19      588.87       0.017                      0.065
boa20      309.45       0.020                      0.074
boa21      607.44       0.022                      0.12
Total      16373.62     0.030                      0.087


Figure 8: Selected topographic maps

C. Huffman Coding is not dead

The advent of Arithmetic coding, together with the weight of many results reported in the literature over the last two decades, has overshadowed the power and simplicity of Huffman coding, although not without reservations [12] [20]. In particular, Arithmetic codes do not generally perform better than Huffman codes when incorrect probabilities are fed to the coder. Empirical results for the binary images presented in this section show that Huffman coding (via E_p^*) slightly outperforms Arithmetic coding on average. The likely reason is that the probability model we constructed for the codebook may have predicted slightly lower or slightly higher relative frequencies for certain 8x8 blocks than the frequencies with which those blocks actually appear in specific test images; in such cases, Arithmetic coding performs less efficiently than Huffman coding. The setting considered here is a demanding one, involving an alphabet of 2^64 symbols and a less contiguous body of data, namely binary images. We may, at least in principle, conclude that a more rigorous codebook model needs to be constructed in order to accommodate the strict modeling requirements of Arithmetic coding. One way to obtain a more accurate model would be to enlarge the data sample used to construct the codebook.


This is, nevertheless, a daunting computational task, as illustrated by the roughly 500 hours it took to generate the codebook from only 120 binary images. In addition to being susceptible to incorrect probabilities, Arithmetic coding is generally slower than Huffman coding in terms of execution time for encoding and decoding [16] [21], and techniques for increasing the speed of Arithmetic coding sacrifice some coding optimality [12]. In this work, we implemented an integer-based arithmetic coder following the guidelines in [17]. For the 200 binary images we used in this comparison, for instance, the overall execution times for the proposed scheme and the arithmetic coder were 261 and 287 seconds, respectively. In terms of optimality, the Huffman codes we constructed for the codebook blocks have an absolute redundancy of 0.01, which implies that only 0.252% more bits than the entropy suggests are required to code blocks in the codebook. The upper bound for the redundancy is p + 0.086; in our case, p = 0.503. Thus, we may conclude that the constructed Huffman codes are near-optimal. Given the sensitivity of Arithmetic coding to incorrect probabilities and the high compression results of the proposed method, we conclude that for practical applications the proposed Huffman coding scheme should be the preferred compression choice. All in all, the theoretical and empirical observations in [12] [20] indicate that Huffman coding is generally more robust than Arithmetic coding, which offers a decisive advantage only in rare cases. The empirical results in this work suggest that the proposed codebook model works efficiently with Huffman codes, but its probabilities are slightly too inaccurate for the arithmetic coder to realize its full potential.
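The redundancy figures quoted above can be read against the classical upper bound on Huffman code redundancy (the p + 0.086 bound, commonly attributed to Gallager); the short calculation below restates that check with the numbers reported in this section.

% Upper bound on the redundancy R of a Huffman code, where p_1 is the
% probability of the most likely source symbol (here, the most frequent
% 8x8 block in the codebook):
\[
  R \;\le\; p_1 + 0.086, \qquad p_1 = 0.503 \;\Rightarrow\; R \le 0.589 .
\]
% The measured redundancy of the constructed codes is far below this bound:
\[
  R_{\mathrm{measured}} = 0.01 \ \text{bits per block} \;\approx\; 0.252\% \ \text{of the entropy}.
\]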

V. CONCLUSIONS

In this paper, we have introduced an innovative method for lossless compression of discrete-color and binary images. The method has low complexity and is easy to implement. Experimental results on more than 150 images show that the proposed method achieves high compression ratios, in many cases higher than 95%. The method has been successfully applied to two major image categories: (i) images that consist of a predetermined number of discrete colors, such as digital maps, graphs, and GIS images; and (ii) binary images.


The results on a large number of test images show that our method achieves compression ratios that are comparable to or higher than those of the standard JBIG2, by 5% to 20% for binary images and by 2% to 6.3% for discrete-color images. The method works best on small-to-medium-size images such as those found on the Internet.

REFERENCES

[1] El-Maleh, Aiman H., alZahir, Saif and Khan, Esam, "A Geometric-Primitives-Based Compression Scheme for Testing Systems-on-a-Chip." IEEE Computer Society, 2001, p. 54.
[2] Kunt, M. and Johnsen, O., "Block Coding of Graphics: A Tutorial Review." Proceedings of the IEEE, 1980, Issue 7, Vol. 68, pp. 770-786.
[3] Mitchell, J. L., "Facsimile Image Coding." 1980, pp. 423-426.
[4] Wu, X. and Memon, N., "Context-Based, Adaptive, Lossless Image Coding." IEEE Trans. Communications, 1997, Issue 4, Vol. 45, pp. 437-444.
[5] Fränti, Pasi and Nevalainen, O., "Compression of Binary Images by Composite Methods Based on Block Coding." Journal of Visual Communication and Image Representation, 1995, Issue 4, Vol. 6, pp. 366-377.
[6] Fränti, Pasi and Nevalainen, O., "A Two-Stage Modeling Method for Compressing Binary Images by Arithmetic Coding." The Computer Journal, 1993, Issue 7, Vol. 36, pp. 615-622.
[7] Fränti, Pasi, "A Fast and Efficient Compression Method for Binary Images." Signal Processing: Image Communication (Elsevier Science), 1994, Vol. 6, pp. 69-76.
[8] Fränti, Pasi, Ageenko, Eugene, Kopylov, Pavel, Gröhn, Sami and Berger, Florian, "Compression of Map Images for Real-Time Applications." Image and Vision Computing (Elsevier), 2004, Vol. 22, pp. 1105-1115.
[9] Kopylov, Pavel and Fränti, Pasi, "Compression of Map Images by Multilayer Context Tree Modeling." IEEE Trans. on Image Processing, 2005, Issue 1, Vol. 14, pp. 1-11.
[10] Podlasov, A. and Fränti, Pasi, "Lossless Image Compression via Bit-Plane Separation and Multilayer Context Tree Modeling." Journal of Electronic Imaging, 2006, Issue 4, Vol. 15, p. 043009.


[11] Sayood, Khalid, Introduction to Data Compression. 3rd ed. Morgan Kaufmann Publishers Inc., 2005.
[12] Bookstein, A. and Klein, Shmuel T., "Is Huffman Coding Dead?" Computing, 1993, Issue 4, Vol. 50, pp. 279-296.
[13] Feller, William, An Introduction to Probability Theory and Its Applications. 3rd ed. Wiley, 1968, Vol. 1, p. 509.
[14] Howard, P. G. and Vitter, J. S., "Fast and Efficient Lossless Image Compression." IEEE, 1993, pp. 351-360.
[15] Moffat, Alistair, "Two-Level Context Based Compression of Binary Images." [ed.] J. A. Storer and J. H. Reif. Computer Society Press, 1991, pp. 382-391. ISBN 0-8186-92022.
[16] Zobel, J. and Moffat, Alistair, "Adding Compression to a Full-Text Retrieval System." Software - Practice and Experience, 1995, Issue 8, Vol. 25, pp. 891-903.
[17] Howard, Paul G. and Vitter, Jeffrey S., "Practical Implementations of Arithmetic Coding." [ed.] J. A. Storer. Kluwer Academic Publishers, 1992, pp. 85-112.
[18] University of Northern British Columbia, "Topographic Map of British Columbia." 2009.
[19] Fränti, Pasi, Kopylov, Pavel and Ageenko, Eugene, "Evaluation of Compression Methods for Digital Map Images." [ed.] M. H. Hamza, O. I. Potaturkin and Y. I. Shokin. 2002.
[20] alZahir, Saif and Borici, Arber, "A Fast Lossless Compression Scheme for Digital Map Images Using Color Separation." IEEE Computer Society, 2010, pp. 1318-1321.
[21] Wuertenberger, A., Tautermann, C. S. and Hellebrand, S., "Data Compression for Multiple Scan Chains Using Dictionaries with Corrections." 2004, pp. 926-935.
[22] Wohl, P., et al., "Efficient Compression of Deterministic Patterns into Multiple PRPG Seeds." 2005, pp. 916-925.
[23] Howard, Paul G., et al., "The Emerging JBIG2 Standard." IEEE Trans. Circuits and Systems for Video Technology, 1998, Vol. 8, pp. 838-848.


Professor Saif alZahir received his PhD and MS degrees in Electrical and Computer Engineering from the University of Pittsburgh, Pittsburgh, Pennsylvania, USA, and the University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA, respectively. Dr. alZahir conducts research in the areas of image processing, networking, VLSI, corporate governance, and ethics. In 2003, the Innovation Council of British Columbia, Canada, named him a British Columbia Research Fellow. He has authored or co-authored more than 100 journal and conference papers, 2 books, and 3 book chapters. He is the editor-in-chief of the International Journal of Corporate Governance, serves on several journals' editorial boards, and is a member of several international program committees for conferences. Currently, he is a full professor in the Computer Science Department at UNBC, British Columbia, Canada.

Mr. Arber Borici received his MSc in Computer Science from the University of Northern British Columbia in 2011. He is currently working on his PhD at the University of Victoria, BC, Canada. Arber has published about 10 papers in reputable journals and international conference proceedings.

