Lossless Data Compression: Table of Contents

Lossless Data Compression

Notation 11
1 Introduction 15
1.1 Codes and communication 15
1.2 Data compression 16
1.3 Outline of this thesis 18
2 Classical compression algorithms 21
2.1 Symbol codes 22
2.1.1 Huffman coding 23
2.2 Integer codes 24
2.2.1 Unary code 25
2.2.2 Elias gamma code 26
2.2.3 Exponential Golomb codes 27
2.2.4 Other integer codes 28
2.3 Dictionary coding 29
2.3.1 LZW 29
2.3.2 History and significance 30
2.3.3 Properties 32
2.4 Transformations 32
2.4.1 Move-to-front encoding 32
2.4.2 Block compression 33
2.5 Summary 34
3 Arithmetic coding 37
3.1 Introduction 37
3.1.1 Information content and information entropy 37
3.1.2 The relationship of codes and distributions 38
3.1.3 Algorithms for building optimal compression codes 39
3.2 How an arithmetic coder operates 40
3.2.1 Mapping a sequence of symbols to nested regions 40
3.2.2 Mapping a region to a sequence of symbols 42
3.2.3 Encoding a region as a sequence of binary digits 42
3.3 Concrete implementation of arithmetic coding 43
3.3.1 Historical notes 49
3.3.2 Other generic compression algorithms 49
3.4 Arithmetic coding for various distributions 50
3.4.1 Bernoulli code 50
3.4.2 Discrete uniform code 51
3.4.3 Finite discrete distributions 51
3.4.4 Binomial code 52
3.4.5 Beta-binomial code 54
3.4.6 Multinomial code 55
3.4.7 Dirichlet-multinomial code 56
3.4.8 Codes for infinite discrete distributions 57
3.5 Combining distributions 58
3.5.1 Sequential coding 58
3.5.2 Exclusion coding 59
3.5.3 Mixture models 60
4 Adaptive compression 61
4.1 Learning a distribution 61
4.1.1 Introduction 61
4.1.2 Motivation 62
4.2 Histogram methods 62
4.2.1 A simple exchangeable method for adaptive compression 62
4.2.2 Dirichlet processes 65
4.2.3 Power-law variants 65
4.2.4 Non-exchangeable histogram learners 66
4.3 Other adaptive models 67
4.3.1 A Pólya tree symbol compressor 67
4.4 Online versus header-payload compression 71
4.4.1 Online compression 71
4.4.2 Header-payload compression 73
4.4.3 Which method is better? 75
4.5 Summary 76
5 Compressing structured objects 79
5.1 Introduction 79
5.2 Permutations 82
5.2.1 Complete permutations 82
5.2.2 Truncated permutations 84
5.2.3 Set permutations via elimination sampling 84
5.3 Combinations 85
5.4 Compositions 86
5.4.1 Strictly positive compositions with unbounded components 87
5.4.2 Compositions with fixed number of components 87
5.4.3 Uniform K-compositions 88
5.4.4 Multinomial K-compositions 88
5.5 Multisets 89
5.5.1 Multisets via repeated draws from a known distribution 90
5.5.2 Multisets drawn from a Poisson process 91
5.5.3 Multisets from unknown discrete distributions 92
5.5.4 Submultisets 93
5.5.5 Multisets from a Blackwell–MacQueen urn scheme 93
5.6 Ordered partitions 95
5.7 Sequences 96
5.7.1 Sequences of known distribution, known length 97
5.7.2 Sequences of known distribution, uncertain length 98
5.7.3 Sequences of uncertain distribution 99
5.7.4 Sequences with known ordering 99
5.8 Sets 100
5.8.1 Uniform sets 100
5.8.2 Bernoulli sets 100
5.9 Summary 100
6 Context-sensitive sequence compression 103
6.1 Introduction to context models 104
6.1.1 Pure bigram and trigram models 104
6.1.2 Hierarchical context models 106
6.2 General construction 107
6.2.1 Engines 108
6.2.2 Probabilistic models 108
6.2.3 Learning 110
6.3 The PPM algorithm 111
6.3.1 Basic operation 111
6.3.2 Probability estimation 113
6.3.3 Learning mechanism 115
6.3.4 PPM escape mechanisms 115
6.3.5 Other variants of PPM 116
6.4 Optimising PPM 117
6.4.1 The effects of context depth 117
6.4.2 Generalisation of the PPM escape mechanism 117
6.4.3 The probabilistic model of PPM, stated concisely 122
6.4.4 Why the escape mechanism is convenient 122
6.5 Blending 123
6.5.1 Optimising the parameters of a blending PPM 124
6.5.2 Backing-off versus blending 124
6.6 Unbounded depth 128
6.6.1 History 128
6.6.2 The Sequence Memoizer 132
6.6.3 Hierarchical Pitman–Yor processes 132
6.6.4 Deplump 133
6.6.5 What makes Deplump compress better than PPM? 134
6.6.6 Optimising depth-dependent parameters 136
6.6.7 Applications of PPM-like algorithms 139
6.7 Conclusions 139
7 Multisets of sequences 143
7.1 Introduction 143
7.2 Collections of fixed-length binary sequences 144
7.2.1 Tree representation for multisets of fixed-length strings 145
7.2.2 Fixed-depth multiset tree compression algorithm 146
7.3 Collections of binary sequences of arbitrary length 148
7.3.1 Compressing multisets of self-delimiting sequences 148
7.3.2 Encoding string termination via end-of-sequence markers 151
7.4 Conclusions 153
8 Cost-sensitive compression and adversarial sequences 155
8.1 Cost-sensitive compression 156
8.1.1 Deriving the optimal target distribution 156
8.1.2 Optimising Morse code 157
8.2 Compression and sampling 161
8.3 Worst case compression and adversarial samples 161
8.3.1 Adversarial sequences 163
8.3.2 Some results 166
8.3.3 Discussion 179
Conclusions 181
A Compression results 185
Bibliography 207
Keyword index 224