Arithmetic coding is a method of encoding data using a variable number of bits. The number of bits used to encode each symbol varies according to the probability assigned to that symbol. Low probability symbols use many bits, high probability symbols use fewer bits.
So far, this makes Arithmetic Coding sound very similar to Huffman coding. However, there is an important difference. An arithmetic encoder doesn't have to use an integral number of bits to encode a symbol. If the optimal number of bits for a symbol is 2.4, a Huffman coder will probably use 2 bits per symbol, whereas the arithmetic encoder my use very close to 2.4. This means an arithmetic coder can usually encode a message using fewer bits.
The method by which this is accomplished is somewhat complex, and is explained in some of the links shown below
Alternative to arithmetic coding - instead of dividing a range into two subranges, we distribute them uniformally over the range. See http://cs.fit.edu/~mmahoney/compression/#fpaq0 for implementations.
Created: 20/11/2007
More...
A new incremental algorithm for data compression is presented. For a sequence of input symbols algorithm incrementally constructs a p-adic integer number as an output. Decoding process starts with less significant part of a p-adic integer and incrementally reconstructs a sequence of input symbols. Algorithm is based on certain features of p-adic numbers and p-adic norm. p-adic coding algorithm may be considered as of generalization a popular compression technique - arithmetic coding algorithms. It is shown that for p = 2 the algorithm works as integer variant of arithmetic coding; for a special class of models it gives exactly the same codes as Huffman's algorithm, for another special model and a specific alphabet it gives Golomb-Rice codes.
Created: 12/04/2007
More...
Library of technical articles along with code samples written and supported by Andrew Polar. Contains articles and source codes on Huffman and range coder. Readers may found another topics of interest not related to data compression, e.g. a simple Web server in sources.
Created: 11/02/2007
by Maxim SmirnovMore...
Nice article by Andrew Polar on arithmetic and range coders. It can be read even by those not very closely acquainted with data compression. The article contains source code attached. The additional merit of this text is a discussion on several patent-relating issues.
Created: 11/02/2007
by Maxim SmirnovMore...
Paper and source code of an efficient binary adaptive encoder of iid sequences.
Outperforms several existing arithmetic-codes-based implementations (JBIG's Q-coder, H.264's CABAC) on short sequences.
Created: 22/01/2007
More...
Boris Ryabko is a well-known Russian scientist with main scientific interests in Information Theory, Prediction, Complexity of Algorithms,
Cryptography and Mathematical Biology. Probably, he is most known as an inventor of "bookstack" or "move-to-front" coding. There is a number of selected publications and reports regarding Information Theory and compression issues on his homepage. The primary language is English, some papers have Russian version.
Created: 14/10/2006
by Maxim SmirnovMore...
Java source code for range coder based upon the carry-less range coder implementation by Dmitry Subbotin, using 64-bit variables for improved performance. Along with a generic range coder and decoder, it contains a byte stream order-zero model implemented as subclasses of Java's I/O streams.
Created: 03/03/2006
by Sachin GargMore...
A concise explanation of Arithmetic Coding by Arturo Campos. The easy to read article is accompanied by snippets of pseudo code.
Created: 17/11/1999
by Mark NelsonMore...
An article that explains arithmetic coding, plus a sample program that implements a limited sort of PPM.
Christable C. had these kind words: I have read some other articles, but not clearly known. When reading this article, I find that Arithmetic Coding is easy to know !
Created: 07/11/1999
by Mark NelsonMore...
The range encoder is a fast multisymbol entropy coder (similar to arithmetic coding) with GNU general public license (other licenses on request). Its compression is within 0.01% of arithmetic coding. It is based on an article dated 1979, so it is believed to be patent free. This page includes a PS format paper by G.N.N Martin describing the range encoder.
Created: 28/10/1999
by Mark NelsonMore...
Alistair Moffat has put together all the links to his source code and articles on Arithmetic Coding in one tidy place.
Created: 19/11/1999
by Mark NelsonMore...
Gordon is the author of DMC and many publications related to data compression. This is is home page, which has links to much of his work.
Created: 08/11/1999
by Mark NelsonMore...
Arturo Campos describes a version of arithmetic coding which renormalizes in bytes, thus achiving twice the speed of an standard implementation and 0.01% less compression.
Created: 17/11/1999
by Mark NelsonMore...
Arturo Campos describes a model for arithmetic coding which results in less compression than an adaptative one, but at much higher speeds.
Created: 17/11/1999
by Mark NelsonMore...
A back-end implemenation of arithmetic coding for JPEG as defined in the standard. It is distributed as an add-on that can be used with the Independent JPEG groups library. The work of Guido Vollbeding.
Created: 09/03/2001
by Mark NelsonMore...
Version 1.1 of the lossless data compression toolkit by Nico deVries. The C sources in this toolkit include an LZW compressor, AR002 archiver, a PPM like compressor using arithmetic compression, Huffman compressor, splay tree program, and LZRW1. Quite a variety.
Created: 14/11/1999
by Mark NelsonMore...
A group of statistical coders from Charles Bloom. This includes several different entropy encoders, including Huffman, Adaptive Huffman, Shannon-Fano, CACM Arithmetic coding, and a Skew Coder.
Created: 06/11/1999
by Mark NelsonMore...
by P. G. Howard and J. S. Vitter. This paper shows how images can be encoded and decoded using parallel processing. Both Huffman and arithmetic coding are examined.
Created: 28/07/2002
by Mark NelsonMore...
Software implementing a complete DMC codec, plus code for a couple of different arithmetic encoders, and a linear time Huffman tree builder.
This program implements Dynamic Markov Compression (DMC) as described in
"Data Compression using Dynamic Markov Modelling",
by Gordon Cormack and Nigel Horspool
in Computer Journal 30:6 (December 1987). The Guazzo arithmetic coder is used here.
Created: 08/11/1999
by Mark NelsonMore...
An updated and translated version of our German paper "Proseminar Datenkompression - Arithmetische Kodierung" from 2001. To the best of our knowledge, it is the first comprehensive paper that describes the whole way from the basic principles of AC up to a simple implementation, fully documented with C++ source code.
Created: 06/06/2004
by Mark NelsonMore...
The third in Michael's collection of pages explaining lossless compression algorithms. A nice tutorial accompanied by ANSI C source.
Created: 02/05/2004
by Mark NelsonMore...
Aliaksei Sanko makes a few improvements to the code in the original 1987 CACM article. His sample includes a templated producer and consumer.
Created: 01/05/2004
by Mark NelsonMore...
The latest in the series of multi-model compressors from Matt Mahoney. This improves on PAQ3n's remarkable Calgary corpus performance by an additional 12K, at some expense in speed. Takes a whopping 84MB at runtime!
Created: 26/10/2003
by Mark NelsonMore...
Matt Mahoney says that with recent improvements by Serge Osnach, PAQ3N does better on the Calgary Corpus than any other open source compressor.
Created: 18/10/2003
by Mark NelsonMore...
The NIST page on arithmetic coding from their Dictionary of Algorithms and Data Structures. A terse definition and a couple of links.
Created: 21/03/2003
by Mark NelsonMore...
Anand Jain discusses his new technique for arithmetic coding. An executable file is included, but no source code.
Created: 22/02/2003
by Mark NelsonMore...
Bob Carpenter has created a nice Java package that implements a PPM/arithmetic coding compression system. This page includes links to the source code, javadocs, and a fair amount of tutorial material. Very complete!
Created: 11/12/2002
by Mark NelsonMore...
An overview of the basics, including Shannon-Fano, Huffman, Arithmetic coding, and a section on LZW for good measure.
Created: 20/07/2002
by Mark NelsonMore...
Pegasus has a patented range coder that they license at no charge. This archive contains some C code that provides a sample implementation.
Created: 12/07/2002
by Mark NelsonMore...
This page gives an introduction to Arithmetic Coding and shows how to implement it using floats or integers. There is also a proof of the efficiency of the algorithms, along with visualization and Win32 binaries. This page is in English and includes links to material in both German and English.
DataCompression.info user Juergen Abel found the site quite good: Clear description, especially the explanation of the renormalisation part, full source code.
Created: 15/04/2002
by Mark NelsonMore...
by P. G. Howard and J. S. Vitter. ``Arithmetic Coding for Data Compression,'' Proceedings of the IEEE, 82(6), June 1994, 857-865. This paper describes arithmetic coding, and introduces a technique that uses table lookups to make the process more efficient.
Created: 08/04/2002
by Mark NelsonMore...
This paper by Paul Howard and Jeff Vitter goes over some of the basics of Arithmetic Coding, then outlines a coder that has increased efficiency by virtue of substituting table lookups for expensive arithmetic operations.
Created: 07/04/2002
by Mark NelsonMore...
The author of this code created this visualization executable for a seminar.
DCL reader Drew D. downloaded the code, executed it, and had this to say about it: The program is an executable for windows with a dll and some gif's. The program seems to require a screen size greater than 800x600 to view the fixed size window. The program is a bit cryptic to me since I only have a basic understanding of Arithmetic encoding but it offers nice step by step encoding with some statistical information. It also allows for the modifying of input types and reading from a file..
Created: 01/04/2002
by Mark NelsonMore...
A paper by Howard and Vitter, available in both PDF and PS formats. The paper provides an analysis of the effect that models and various implementations of arithmetic coding have on compression.
Created: 30/12/2001
by Mark NelsonMore...
A paper by Alistair Moffat describing an improvement on Peter Fenwick's method for maintaining cumulative probability tables. The pointer on this page leads to some source implementing said table.
Created: 12/11/2001
by Mark NelsonMore...
The term bijective as used by Scott means that for any given given file X you are guarantted that A( B( X ) ) == B( A( X ) ) == X, where A and B are a pair of bijectively matched programs. In this particular case, A and B are a compressor and decompressor that use arithmetic coding. Includes C++ source.
Created: 01/05/2001
by Mark NelsonMore...
The Data Compression Research Center has an overview of arithmetic coding on this page.
This link points to an archived site, as the original has disappeared. Links on the archived page may or may not work properly.
Created: 04/10/2000
by Mark NelsonMore...
Suzanne Bunton's PhD thesis. A recent post to comp.compression said that Bunton had used floating point math for arithmetic coding. I haven't verified this, but it sounds worth a look.
Created: 24/06/2000
by Mark NelsonMore...
IBM's implementation of arithmetic coding known as the Q-coder is a well-known piece of work. (Patented, unfortunately.) This paper discusses an implementation of the Q coder in hardware.
Created: 03/06/2000
by Mark NelsonMore...
Radford Neal is a professor at the University of Toronto. One of his research interests is data compression, and his name appears on a couple of seminal papers on arithmetic coding. Links to papers can be found here.
Created: 31/05/2000
by Mark NelsonMore...
The dissertation itself, in PS format, along with code used in the dissertation. The code implements k-ary arithmetic compression.
Created: 07/11/1999
by Mark NelsonMore...
Multiply-free Arithmetic Coding Implementation - Compression version 0.0.0
Gordon V. Cormack Feb. 1993
University of Waterloo cormack@uwaterloo.ca
This source code has some traditional freeware licensign terms embedded in the code
Created: 02/11/1999
by Mark NelsonMore...
The source code to the famous Witten, Neal, and Cleary 1987 CACM article on arithmetic coding. The paper is probably not legally on line anywhere, but can be found in the book Text Compression, as well as the journal. This FTP site has three different variations on the source.
Created: 08/12/1998
by Mark NelsonMore...