David Huffman developed a form of encoding that creates the most efficient set of prefix codes for a given text. The ease with which Huffman codes can be created and used makes this still an extremely popular tool for compression code.
Paper on "In-place calculation of minimum-redundancy codes" by Alistair Moffat ,& Jyrki Katajainen. Tells how to deal with Huffman codes in the efficient way.
Created: 16/05/2007
More...
Library of technical articles along with code samples written and supported by Andrew Polar. Contains articles and source codes on Huffman and range coder. Readers may found another topics of interest not related to data compression, e.g. a simple Web server in sources.
Created: 11/02/2007
by Maxim SmirnovMore...
David A. Huffman, the founding faculty member of the Computer Science Department and a pioneer in the field, died at a local hospital on Thursday, October 7, after a 10-month battle with cancer. He was 74.
Created: 16/10/1999
by Mark NelsonMore...
This paper describes a few searching algorithms to be used on compressed files. An AC type algorithm for searching in Huffman encoded files, a KMP type algorithm for LZ derivatives and a BM type algorithm for searching in files compressed by Byte-Pair-Encoding (BPE). BPE gave results upto 3 times faster than those achieved by agrep.
Created: 12/06/2005
by Sachin GargMore...
Professor David A. Huffman (August 9, 1925 - October 7, 1999) - author of "Huffman Codes".
One of the faculty pages at UC Santa Cruz, the late David A. Huffman.
See also David_A._Huffman Wikipedia article for more information.
Created: 19/11/1999
by Mark NelsonMore...
Will McKee wrote some Huffman code in C++. Take a look.
Update: Will reports that he has improved the documentation in this package, as well as adding a new function.
Created: 09/12/2002
by Mark NelsonMore...
A group of statistical coders from Charles Bloom. This includes several different entropy encoders, including Huffman, Adaptive Huffman, Shannon-Fano, CACM Arithmetic coding, and a Skew Coder.
Created: 06/11/1999
by Mark NelsonMore...
Arturo Campos describes one of the simplest yet often effective compression algorithms. Believe it or not, Huffman coding with an order-0 model is nearly 50 years old!
Created: 17/11/1999
by Mark NelsonMore...
A page by Michael Schindler that describes Huffman coding in a fair amount of detail. Also includes links to other information resources. This page also has an explanation of the canonical huffman table storage algorithm.
Created: 27/11/1999
by Mark NelsonMore...
by P. G. Howard and J. S. Vitter. This paper shows how images can be encoded and decoded using parallel processing. Both Huffman and arithmetic coding are examined.
Created: 28/07/2002
by Mark NelsonMore...
This page contains a paper "Improved Huffman coding using recursive splitting" that describes a program that attempts to improve on Huffman compression by manipulation of the data stream.
Created: 09/12/1999
by Mark NelsonMore...
Daniel Ricardo has a new algorithm for generation of Huffman codes. He claims superior performance and small memory footprint. Algorithm description here, but no code.
Created: 22/09/2001
by Mark NelsonMore...
A careful analysis of degenerate Huffman trees created when input counts follow a Fibonacci sequence, by Alex Vinokur. This post is from sci.math.
Created: 14/02/2002
by Mark NelsonMore...
A Huffman file compressor for *NIX platforms. The author admits that the compression is perhaps not the best, but says it's going to be just right for embedded programming because of its miniscule footprint and high speed.
Version 1.01 was shipping in June, 2003.
Created: 12/06/2003
by Mark NelsonMore...
Software implementing a complete DMC codec, plus code for a couple of different arithmetic encoders, and a linear time Huffman tree builder.
This program implements Dynamic Markov Compression (DMC) as described in
"Data Compression using Dynamic Markov Modelling",
by Gordon Cormack and Nigel Horspool
in Computer Journal 30:6 (December 1987). The Guazzo arithmetic coder is used here.
Created: 08/11/1999
by Mark NelsonMore...
A very nice description of Huffman coding, as well as a few other types of coding. I believe this is part of a survey paper by Debra A. Lelewer and Daniel S. Hirschberg.
Created: 15/12/1999
by Mark NelsonMore...
Xiaolin Wu's static Huffman coding version of this program. Free of charge for research and non-commercial use. A total of 1.2 MBytes in a dozen or so files.
Created: 13/11/1999
by Mark NelsonMore...
Arturo Campos descrives Canonical Huffman Coding, the technique used in the deflate algorithm made popular by PKZip. This type of Huffman coding follows some specific rules regarding the structure of the Huffman tree that simplify the process of transmitting the tree.
Created: 17/11/1999
by Mark NelsonMore...
bwtzip is an ongoing project, distributed under the GNU General Public License, to implement a Burrows-Wheeler compressor in standard, portable C++. It is research-grade in that it is highly modularized and abstracted, so that it is simple to swap out parts of the compressor without affecting anything else. This makes it easy to experiment with different algorithms at different stages of compression.
Looks like Steven T. Lavavej released a new version of bwtzip in early February, 2003. A wide variety of improvements, most of them in implementation - not visible to the end user. A description of recent changes is found here
Created: 11/12/2002
by Mark NelsonMore...
This site discusses a characteristic of some compression algorithms that the author refers to as One to One (bijective) compression. In a nutshell, this property means that for any file X, F( F'( X ) ) == X. (F is either the compressor or decompressor, and F' is its opposite number.) This is definitely not the case for most conventional compression algorithms.
This page has a some Huffman compression code that has been adapted to implement a bijective property.
Created: 17/12/1999
by Mark NelsonMore...
It's a page on Huffman and Shannon-Fano methods on the Russian compression.ru site. It contains a number of papers and sources, and the major part is in English. It's unlikely that you will have any problems with downloading, since the titles are in English.
Created: 21/03/2005
by Maxim SmirnovMore...
This page is designed made to teach people about Lossless compression algorithms through the use of text graphics and Java Applets! Dominik Szopa has created pages that demonstrate Huffman, Adaptive Huffman, and LZW compression.
DCL reader SF has this to say: While the site itself is rather quick, it's disorganized...the Java applets really don't show what's going on at all. They show only the external effects...This site has definate potential, and I do recommend people see it. However, it's also got a ways to go yet. .
Created: 24/12/1998
by Mark NelsonMore...
This version of file encoder and decoder program is based on the Huffman coding method. It explicitly demonstrates the details of the files during the encoding and decoding. The algorithm is encapsulated in a class En_Decode in standard C++.
Created: 06/06/2004
by Mark NelsonMore...
The Basic Compression Library is a set of open source implementations of several well known lossless compression algorithms, including RLE, Huffman, LZ77 and Rice, written in portable ANSI C. The library has been created to be flexible and easy to understand. It is well documented, and easy to use and adapt to specific situations, such as custom compression methods and embedded systems.
Satisfied user Todd W said: I needed a simple set of compression routines for use in an embedded system. I must be able to store a fair amount of information in a small EEPROM as a generic database. The Huffman coder works very well in the application and has met my needs exactly! Very nice!
Created: 14/05/2004
by Mark NelsonMore...
BAR Archiver is a free, cross-platform file compression and archiving tool written in Java. It outperforms most of the popular file
compression tools available as commercial utilities.
Compression is based on Burrows Wheeler Transformation and modified Huffman entropy coder. The compression ratio is in par
with most of the commercial grade compression utilities.
Created: 11/04/2004
by Mark NelsonMore...
libhuffman is a Huffman encoder/decoder library and a command line interface to the library. The encoder is a 2 pass encoder. The first pass generates a huffman tree and the second pass encodes the file. The decoder is one pass and uses a huffman code table at the beginning of the compressed file to decode the file.
Beta 3 shipped in October, 2003.
Created: 27/10/2003
by Mark NelsonMore...
Michael Dipperstein describes his personal quest for understanding an implementation of Huffman coding. Full source included.
The page was updated with new source December, 2002.
Created: 18/10/2003
by Mark NelsonMore...
An article by John Pliam. Needs a better summary, but I don't quite get it, other than the fact that there is some cryptographic thinking going on here.
Created: 04/05/2003
by Mark NelsonMore...
The first time I ever heard the phrase Canonical Huffman Coder was in reference to the technique used in PKZip to store Huffman tables. I don't know where the technique originated, but it is basically a way to construct a Huffman table so that the actual codes don't have to be stored when storing the table. This makes for some nice space savings when compared to a first-pass naive implementation. (Like the ones I've done in the past.)
It turns out that somebody named Gareth was attempting to implement this code but was having a bit of trouble. His post to comp.compression brought out some useful help from some of the newsgroup regulars, and did a lot to shed light on the topic, and includes a reference to the paper that actually gave birth to the concept.
Created: 13/03/2003
by Mark NelsonMore...
Will McKee has released this as freeware - includes complete source to a string substitution compressor. From the description it sounds as though it's variant on LZSS, but I'll defer to anyone willing to do a real analysis.
Created: 09/12/2002
by Mark NelsonMore...
This dynamic Huffman coder from Karl Malbrain is written in C and includes weight scaling. It is modeled on the Vitter algorithm.
A DataCompression.info user notes that this site has been undergoing continual changes, and perhaps would benefit from some sort of "last modified on" field.
Created: 20/11/2002
by Mark NelsonMore...
The abstract for a paper on calculation of Huffman codes. The paper isn't here, but the source code is. Alistair says that if you sort your array of counts, you can create the Canonical Huffman tree in memory.
Created: 31/10/2002
by Mark NelsonMore...
Some simple BASIC routines to compress data without extravagant data or code space. The author seems to indicate this isn't Huffman coding, but doesn't say what it is.
Created: 28/09/2002
by Mark NelsonMore...
The UPL Compression Library is a high-performance professional compression library. It offers the ability to compress and decompress data, buffers, strings or single files and features the latest innovations in data compression. The library offers eight extremely powerful compression algorithms. Dynamic Huffman, Arithmetic, BWT, Ppm and several Lempel Ziv flavors.
DataCompression.info user John G. had this to say: I was looking for adding a better compression to my Visual Basic project and it worked like a charm. The compression ratio is really good, better than Zip!
Created: 27/08/2002
by Mark NelsonMore...
An overview of the basics, including Shannon-Fano, Huffman, Arithmetic coding, and a section on LZW for good measure.
Created: 20/07/2002
by Mark NelsonMore...
Graham Fyffe proposes an alternative type of variable length coding that he claims will offer improved efficiency. Full description, no implementation.
Created: 18/07/2002
by Mark NelsonMore...
This article is really about using the priority queue containers that are part of the standard C++ library. The example program implements a Huffman Encoder using the queues, showing how they can do a fairly complex piece of work without too much coding on your part.
Created: 12/04/2002
by Mark NelsonMore...
shcodec is order-0 32-bit canonical static huffman codec. It encodes an alphabet of 256 symbols with minimum-redundancy or length-restricted codes (basic method: Alistair Moffat and Jyrki Katajainen, modified by Artur A. Pessoa). shcodec uses efficient method for tree packing: on text files packed tree size is approx 68 bytes, on binary files this value is about 132 bytes. Memory requirements are very small: 1280 bytes for encoding and only 574 bytes for decoding! shcodec uses extremely fast and simple SHIFT-OR method for encoding, and CANONICAL-DECODE with a cache for small codewords for decoding.
Update: Alexander has added SH-SFX to the web page - a program for creating Win32 SFXs from files compressed with shcodec.
Created: 04/04/2002
by Mark NelsonMore...
An overview of Shannon Fano coding from the folks at the DataCompression Reference Center.
This link points to an archived site, as the original has disappeared. Links on the archived page may or may not work properly.
Created: 30/08/2000
by Mark NelsonMore...
by Jeffrey N. Ladino. His description of the page: This page is a research project in the field of data compression algorithms. It is intended to be an informative overview for a beginner in the field of computer science. Mostly talks about lossless compression, with an explanation of Huffman coding.
Created: 01/08/2000
by Mark NelsonMore...
A straightforward discussion of Huffman compression. Written at a level so that it can be understood without much experience in data compression or programming.
Created: 14/07/2000
by Mark NelsonMore...
Advertises itself as "A dirty but free implementation of a huffman encoder/decoder in Java." Not completely free, it is covered by the GLPL, and naturally includes fully documented source.
Created: 23/03/2000
by Mark NelsonMore...
A nice tidy proof that the Huffman tree is an optimal prefix code for a given message. Note that a library reader finds that there is an error in the fixed-length codes listed in Table 1, a and b have duplicate values.
Created: 16/12/1999
by Mark NelsonMore...
The Data Compression Reference Center talks about Huffman coding. A short but fairly succint explanation.
This link points to an archived site, as the original has disappeared. Links on the archived page may or may not work properly.
Created: 15/12/1999
by Mark NelsonMore...
This page discusses the various compression schemes used by EGI, Shannon-Fano, BWT, and RLE. EGI is a player of animation sequences.
Created: 14/11/1999
by Mark NelsonMore...
An in-place Huffman code length calculation demonstration from Alistair Moffat. The abstract of that paper may be fetched from this location.
Created: 28/10/1999
by Mark NelsonMore...
ACIS, the AXAF CCD Imaging Spectrometer, is an instrument being built by a team from the Massachusetts Institute of Technology's Center for Space Research and the Pennsylvania State University for the Chandra X-ray Observatory (formerly, AXAF), scheduled for launch in 1999.
This is a good paper discussing practical implementation of data compression in a real world project with interesting parameters.
Created: 04/01/1999
by Mark NelsonMore...