Compression algorithms that are illustrated using source code. Note that if a product is a library or program, I generally don't include it here even if source is included.
bbb is a free, open source (GPL) command line file compressor by Matt Mahoney, Aug. 31, 2006. It uses a memory-efficient BWT that allows blocks of up to 80% of available memory. The transformed data is compressed with an order-0, PAQ-like model: the previous bits of the current byte are mapped first to a bit history, then through a 6-level adaptive probability-correcting chain, before bitwise arithmetic coding.
Created: 29/01/2008
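For readers new to the BWT mentioned in the bbb entry, the following is a minimal, naive sketch of the forward and inverse transform. It sorts all rotations in memory, so it is not the memory-efficient method bbb uses; the function names `bwt` and `inverse_bwt` are illustrative only.

```python
def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive BWT: sort all rotations, return the last column and the
    row index at which the original string appears."""
    n = len(data)
    if n == 0:
        return b"", 0
    # Sort rotation start offsets by the rotation they denote (O(n^2 log n)).
    order = sorted(range(n), key=lambda i: data[i:] + data[:i])
    last = bytes(data[(i - 1) % n] for i in order)
    return last, order.index(0)


def inverse_bwt(last: bytes, index: int) -> bytes:
    """Invert the transform using the last column and the primary index."""
    n = len(last)
    if n == 0:
        return b""
    # A stable sort of positions by byte value links each row of the
    # first column to the matching row of the last column.
    perm = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = index
    for _ in range(n):
        row = perm[row]
        out.append(last[row])
    return bytes(out)


if __name__ == "__main__":
    transformed, idx = bwt(b"banana")
    print(transformed, idx)
    print(inverse_bwt(transformed, idx))
```

The transform groups bytes that share a following context, which is what makes a simple low-order model effective on the output.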
Evince is a document viewer for the GNOME desktop environment. It currently supports pdf, postscript, djvu, tiff and dvi. The goal of evince is to replace the multiple document viewers that exist on the GNOME Desktop with a single simple application. Source code is available.
Created: 10/04/2008
by Maxim Smirnov
OpenAVS is the open source implementation of AVS (Advanced Coding of Audio and Video). AVS is competing with MPEG-4 and H.264 to replace the current worldwide compression standard, MPEG-2. Please note that Chinese companies own the majority of AVS patents.
Created: 24/03/2008
by Maxim Smirnov
An alternative to arithmetic coding: instead of dividing a range into two subranges, we distribute them uniformly over the range. See http://cs.fit.edu/~mmahoney/compression/#fpaq0 for implementations.
Created: 20/11/2007
libstree is a generic suffix tree implementation, written in C. It can handle arbitrary data structures as elements of a string. libstree is using the BSD license.
Created: 22/09/2007
XWRT (XML-WRT) is a high-performance XML compressor (it also works with textual files). It transforms XML to a more compressible form and uses zlib (default), LZMA, PPMVC, or lpaq6 as a back-end compressor. It is similar to XMill, but has many improvements. It takes fourth and third places in the LTCB text compression benchmark as of January 2008. The author is Przemyslaw Skibinski.
Created: 29/01/2008
Avisynth's homepage: download, documentation, FAQ, plugins (including a list of external plugins), community.
Created: 18/01/2008
by Maxim Smirnov
Contains the code for BWTS, along with a description of it. BWTS does not need the index; it is what BWT should have been in order to be length preserving.
Created: 10/01/2008
Visual Studio 2005 projects for native ZLib and managed C# wrappers.
From the home page:
OpenNETCF provides a "port" of the zlib 1.2.3 library for Windows CE-based devices, including Pocket PCs, SmartPhones, Windows Mobile devices as well as the entire realm of generic Windows CE devices. We provide the full original source, modified slightly for the CE build environment, along with Microsoft Visual Studio 2005 Solution and Project files and compiled binaries for ARMv4 and ARMv4I for CE 4.2 and CE 5.0. We have also stripped out all of the makefiles and project trees that are not relevant to Windows CE development. If you are looking for those, we recommend that you visit the origin.
Created: 09/05/2007
PeaZip is an archiver tool that supports its native Pea archive format (featuring compression, split volumes, and flexible authenticated encryption and integrity check schemes) and other mainstream formats, with special focus on handling open formats.
Create and extract 7Z, 7-Zip sfx, Bzip2, Gzip, PEA, split, TAR, and ZIP.
Browse and extract CAB, JAR, LZH, RAR and many more archive formats.
Created: 02/02/2007
A library of technical articles along with code samples, written and supported by Andrew Polar. Contains articles and source code on Huffman and range coders. Readers may find other topics of interest not related to data compression, e.g. a simple Web server in source form.
Created: 11/02/2007
by Maxim Smirnov
A nice article by Andrew Polar on arithmetic and range coders. It can be read even by those not very closely acquainted with data compression. Source code is attached to the article. An additional merit of this text is its discussion of several patent-related issues.
Created: 11/02/2007
by Maxim Smirnov
Offers free trial downloads for Windows and Linux of codecs for G.722, G.723.1, G.729A, G.726, and G.728. Supports concurrent codecs in a single thread with a simple interface. The web page lists all prices for the binary codecs and the codec source code. The source code is written in C/C++ and can be used on other platforms. They also provide AEC/VAD/AGC source code.
Created: 15/11/2006
by Hunter Lin
The h264bitstream library provides a complete set of functions to read and write video streams conforming to the ITU H264 (MPEG4-AVC) video standard.
License: GNU Library or Lesser General Public License.
Created: 21/04/2006
by Maxim Smirnov
Java source code for range coder based upon the carry-less range coder implementation by Dmitry Subbotin, using 64-bit variables for improved performance. Along with a generic range coder and decoder, it contains a byte stream order-zero model implemented as subclasses of Java's I/O streams.
Created: 03/03/2006
by Sachin Garg
PBZIP2 is a parallel implementation of the bzip2 block-sorting file compressor that uses pthreads and achieves near-linear speedup on SMP machines. The output of this version is fully compatible with bzip2 v1.0.2 (ie: anything compressed with pbzip2 can be decompressed with bzip2). Source code available.
Created: 03/03/2006
by Jeff Gilchrist
PPMd is a compression library implementing the efficient PPMII algorithm. It provides a very high compression ratio and is quite fast. It's the fastest PPM-like compression algorithm implementation today. PPMd is used in RAR, 7-Zip, WinZip and other compression utilities. As of Feb. 2006, the latest variant is ver. J. Older versions can also be downloaded from the author's page (in Russian, but the download links are easy to find).
Created: 02/06/2002
by Mark Nelson
This program implements the 16 kb/s Low-Delay CELP algorithm (ITU-T Recommendation G.728) using floating point. The input speech files should have 2 bytes per sample and contain no header. The input can use the full 16-bit range. The bitstreams are written as 2 bytes per 10-bit codeword. This code was tested on SUN and SGI platforms using the gcc and cc compilers.
Created: 16/02/2006
by Maxim Smirnov
HomeBoy was a group of programmers that created the first ISO-compliant publicly available AAC encoder for Windows back in 1998. Also, they were reportedly creators of the first third party plugin for Winamp (their AAC input plugin), and the first ISO-compliant AAC decoder publicly available.
The encoder is just a compile of the original ISO reference sources, therefore quality is bad. But, interestingly, streams created by it can still be played in modern decoders.
Created: 31/01/2006
by Maxim Smirnov
Real is making their client software available in an Open Source program. Download this software and you can develop your own MP3 or H.263 player! Free of royalties if you are distributing it for free - commercial products pay a royalty. The Helix DNA Client contains support in source code form for the following data types: MP3, H.263, SMIL, JPEG, GIF, PNG, RealPix, PCM, WAV.
Created: 30/10/2002
by Mark Nelson
Pizza&Chili Corpus: Paolo Ferragina of the University of Pisa and Gonzalo Navarro of the University of Chile have a web site dedicated to the exploration of compressed indices. Paolo and Gonzalo have posted links to quite a few papers on full text compressed indices, which expound the notion that you can pick and choose exactly what you want to decompress. The site has collections of texts, links to people and papers, and a proposed API for testing work in the future.
Created: 22/01/2006
by Sachin Garg
This lab at CMU seems to be doing some interesting things with video compression. At a minimum, they have an H.263 decoder you can download.
Created: 24/09/2000
by Mark Nelson
A large collection of ISO standards files and additional files: conformance bitstreams, reference software source code, etc.
Created: 08/11/2005
by Dmitriy Vatolin
A bijective compressor using full size Rijndael encryption. BICOM is a freely available open source compressor. It uses a souped-up PPM algorithm, and is completely bijective.
Created: 01/11/2002
by Mark Nelson
Free and simple open source DjVu viewer with following features:
# Supports Windows 98 and later
# Continuous and single page layouts
# Thumbnails
# Bookmarks
# Hyperlinks
# Text searching and copying
# Advanced printing
# Fullscreen mode
# Mouse wheel scrolling
# Export pages to bmp
# Rotate pages left/right
# Zoom to page, page width, 100% or custom zoom
# Brightness, contrast and gamma adjustment
# Display modes (Color/B&W/Foreground/Background)
# Keyboard shortcuts for scrolling and navigation
Created: 24/09/2005
by Dmitriy Vatolin
DjVuLibre includes a standalone viewer, a browser plug-in (for Mozilla, Firefox, Konqueror, Netscape, Galeon, and Opera), and command line tools (decoders, encoders, utilities). DjVuLibre works under Unix with X11.
Created: 24/09/2005
by Dmitriy Vatolin
An open source wavelet video codec, previously developed by the BBC. It uses parametric affine motion compensation, arithmetic encoding and other modern techniques. Surprisingly for an open source project, it appears to be well documented.
Created: 23/09/2005
by Dmitriy Vatolin
LZPX is a fast file compressor/preprocessor. The main features of this program are: ultra fast compression and decompression speed, low memory usage and small size.
Created: 09/09/2005
by encode
Due to the overwhelming number of requests for JPEG code that works with Borland C++Builder we have put out a version of the Colosseum Builders' Image Library for C++. The latest version includes encoders and decoders for JPEG, GIF, Windows BMP, XBM, and PNG. It also includes an interface to VCL so that these image formats can be used at design-time with C++Builder. The current version now works with MSVC++.
Created: 19/08/2001
by Mark Nelson
A lossless audio codec developed in Russia. TTA performs lossless compression on multichannel 8, 16 and 24-bit data of WAV audio files. Distributed under a free license with sources, there are executables for Windows, Linux and Mac OS-X PPC. There are a number of plug-ins for players, including WinAmp. See comparisons with other audiocodecs on the site.
Version 3.3 is shipped as of August 2005
Created: 12/09/2004
by Mark Nelson
p7zip is a quick port of 7za.exe (command line version of 7-Zip, see www.7-zip.org) for Unix (POSIX). 7-Zip is a file archiver with the highest compression ratio. There is also a port of the LZMA Decoder from LZMA SDK 4.03 to Java (java_lzma).
Version 4.20 is shipping as of June, 2005.
Created: 04/07/2004
by Mark Nelson
An experimental archiver that uses a BWT algorithm to achieve superior compression. With Zzlib, you can also use Zzip as a library (dll) in one of your programs. Source code of Zzip/Zzlib is released under the GNU LGPL.
Created: 19/03/2002
by Mark Nelson
Some pages that go along with the book "Managing Gigabytes", by Witten et al. These pages are devoted to MG, "an open-source indexing and retrieval system for text, images, and textual images."
Created: 15/11/1999
by Mark Nelson
The OpenJPEG library is an open-source JPEG 2000 codec written in C language. It has been developed in order to promote the use of JPEG 2000, the new still-image compression standard from the Joint Photographic Experts Group (JPEG). OpenJPEG library is released under the BSD license.
Created: 03/08/2005
by Sachin Garg
Those folks at AT&T have developed a compressor that can be used to squeeze individual data items in XML documents. AT&T says this is "essentially free" software. Read the license on-line to determine exactly what that means.
Also available on http://sourceforge.net/projects/xmill
Created: 19/12/1999
by Mark Nelson
An article Mark Nelson wrote that describes how to take advantage of the zip classes included in Java's 1.1 JDK. This includes some very simple programs that can create, view, and extract from zip files.
Created: 01/12/1999
by Mark Nelson
HawkVoice is a game oriented, multiplayer voice over network API released under the GNU Library General Public License (LGPL), with support for Linux/Unix and Windows 9x/ME/NT/2000. It is designed to be a portable, open source code alternative to DirectPlay(R) Voice in DX8.
Created: 09/03/2001
by Mark Nelson
A grey-scale wavelet compressor written in C. Includes pointers to source code and the paper presented on this work.
Created: 20/08/2000
by Mark Nelson
JJ2000 is a Java implementation of a JPEG 2000 codec. The web site states that JJ2000 is under consideration to be a reference implementation of the standard. JJ2000 is now freely available to all, and may be freely used in products that implement JPEG 2000-Part I.
The page also includes links to a white paper, presentations, and other related web pages.
Version 4.1 is the last release of the JJ2000 project, which officially terminated in September 2001.
Created: 06/09/2000
by Mark Nelson
A collection of C programs that do string matching and pattern discovery. This appears to be free code by D. Gusfield, who also has a book called "Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology".
One DCL reader commented: "The strmat package is wonderful."
Created: 23/01/2000
by Mark Nelson
GSM provides telephone quality speech at a compressed rate of 13 Kbps. Compare this to the 64 Kbps required by standard u-law and A-law codes. This site gives lots of info about the GSM format, along with free source code.
Created: 21/11/1999
by Mark Nelson
This archive contains source files lzss.c, lzhuf.c, and lzari.c. They have virtually no documentation, but do a good job of compression. These files were quite influential in their day, particularly in Japan.
Created: 07/11/1999
by Mark Nelson
Source code to the FLAC library, command-line encoder/decoder, and player plugins. FLAC is an open-source lossless audio format and codec.
Created: 15/02/2001
by Mark Nelson
The zlib home page. zlib is a free software package that implements the deflate compression algorithm popularized in PKWare's PKZIP product. zlib is designed to be patent free, and is free of restrictions.
Version 1.2.2 is shipping as of October 2004.
Created: 01/12/2003
by Mark Nelson
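As a quick illustration of the deflate algorithm that zlib implements, here is a minimal sketch using Python's standard `zlib` module, which wraps the zlib library. The helper name `deflate_roundtrip` and the sample input are assumptions made for this example, not part of zlib itself.

```python
import zlib


def deflate_roundtrip(data: bytes, level: int = 9) -> tuple[int, int]:
    """Compress with zlib, verify the round trip, and return
    (original size, compressed size) in bytes."""
    compressed = zlib.compress(data, level)
    # Decompression must reproduce the input exactly (lossless).
    if zlib.decompress(compressed) != data:
        raise ValueError("round trip failed")
    return len(data), len(compressed)


if __name__ == "__main__":
    original, packed = deflate_roundtrip(b"abc" * 1000)
    print(f"{original} bytes -> {packed} bytes")
```

Highly repetitive input like the sample above compresses very well; random data will typically not shrink at all.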
The FFV1 video codec is a simple but efficient lossless intra only codec that was included into "ffdshow" (very useful tool). Available in source code.
From site:
ffdshow is DirectShow and VFW codec for decoding/encoding many video and audio formats, including DivX and XviD movies using libavcodec, xvid and other opensourced libraries with a rich set of postprocessing filters.
Created: 18/07/2005
by Dmitriy Kulikov
A lossless codec for Video for Windows.
Because LCL is a lossless codec, it does not degrade image quality.
LCL is suitable for digital animation and 3D CG animation.
LCL contains two kinds of codecs, selected according to use.
Created: 18/07/2005
by Dmitriy Kulikov
CorePNG is a lossless codec based on PNG. Essentially, each frame is compressed as a PNG, so if PNG does it, this codec does too. (RGBA) This codec also has the ability to write P frames and to autodetect when it should. The P frame takes the difference of the previous frame and the current frame and encodes that as a PNG.
Created: 18/07/2005
by Dmitriy Kulikov
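The P frame idea described for CorePNG can be sketched as a byte-wise difference between consecutive frames. This is only an illustration under assumptions: frames are treated as flat byte strings of equal length, the difference is taken modulo 256, and the names `frame_delta` and `apply_delta` are hypothetical. The codec's actual difference and PNG encoding steps are not shown.

```python
def frame_delta(prev: bytes, cur: bytes) -> bytes:
    """Byte-wise difference (mod 256) of the current frame against the previous one."""
    if len(prev) != len(cur):
        raise ValueError("frames must have the same size")
    return bytes((c - p) & 0xFF for p, c in zip(prev, cur))


def apply_delta(prev: bytes, delta: bytes) -> bytes:
    """Rebuild the current frame from the previous frame and the stored difference."""
    if len(prev) != len(delta):
        raise ValueError("frame and delta must have the same size")
    return bytes((p + d) & 0xFF for p, d in zip(prev, delta))


if __name__ == "__main__":
    prev = bytes([10, 20, 30, 250])
    cur = bytes([10, 25, 29, 3])
    delta = frame_delta(prev, cur)
    print(list(delta))
    print(apply_delta(prev, delta) == cur)
```

When little changes between frames, the difference is mostly zeros, which a general-purpose compressor handles far better than the raw frame.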
The best lossless video codec in performance/quality terms.
From site:
Huffyuv is a very fast, lossless Win32 video codec. "Lossless" means that the output from the decompressor is bit-for-bit identical with the original input to the compressor. "Fast" means a compression throughput of up to 38 megabytes per second on my 416 MHz Celeron.
Created: 18/07/2005
by Dmitriy Kulikov
Lagarith is a lossless video codec. Lagarith offers excellent compression. Lagarith is able to operate in several colorspaces - RGB24, RGB32, RGBA, YUY2, and YV12. Also, Lagarith will never down-sample video, preventing inadvertent quality loss. For DVD video, the compression is typically only 10-30% better than Huffyuv. However, for highly static scenes or highly compressible scenes, Lagarith significantly outperforms Huffyuv.
Created: 18/07/2005
by Dmitriy Kulikov
A freely redistributable lossless JPEG codec. Encoder, decoder, man pages, full C source, and some documentation.
Created: 13/11/1999
by Mark Nelson
Not a bad site on wavelets in general, JPEG2000, and digital signal processing using wavelets. The main content is books, papers, theses, and sources. Partially in Russian, but there are a lot of English papers. There is a steganography page as well.
Created: 11/07/2005
The MPEG Software Simulation Group's coder/decoder. mpeg2vidcodec_v12 contains source, mpeg2v12 contains source plus a Win32 executable. Three files to download for the complete deal.
Created: 07/11/1999
by Mark Nelson
Hdot264 is an experimental video codec project that is compliant with the latest and most efficient video compression standard. That standard has many aliases, including H.26L, JVT, MPEG-4 part 10, AVC and H.264.
Created: 19/06/2005
by Dmitriy Kulikov
Links to a documented implementation of a suffix sort. This may not be a compression topic per se, but suffix trees are useful for compressing data.
Created: 07/11/1999
by Mark Nelson
Mark Nelson's zlib article, originally published in Dr. Dobb's Journal. The source code for the article includes an OCX that allows you to use zlib from many different languages under Win32.
If you are attempting to use the zlib OCX with Visual Basic or Visual C++, please follow the links to my FAQ. The OCX that accompanies this article needed an upgrade to work with later versions of Microsoft's tools.
Created: 06/11/1999
by Mark Nelson
Free, portable C code for JPEG compression is available from the Independent JPEG Group. Source code, documentation, and test files are included.
Created: 20/12/1998
by Mark Nelson
An article by Mark Nelson that appeared in the September 1996 issue of Dr. Dobb's Journal. At the time it appeared, the BWT was relatively unknown among compression enthusiasts. This article includes source code that implements a simple test program that demonstrates BWT compression.
Created: 10/12/1998
by Mark Nelson
Jasper is a C-language implementation of the JPEG-2000 Part-1 standard. Michael Adams seems to run the show, with help from Image Power and a small team. This page gives you access to the software, documentation, and a nice set of links. Jasper is distributed under a free license.
Version 1.701.0 is shipping as of February, 2004.
Created: 14/02/2004
by Mark Nelson
This Java browser knows how to render PDF files, which means it understands the elusive LZW compressed data format used by Adobe.
Created: 08/08/2002
by Mark Nelson
codecs.tgz contains C source code for three different types of codecs: LZW, RLE, and Huffman. The archive contains source code documented in both French and English.
Created: 20/12/1998
by Mark Nelson
Wouldn't it be great if there was a free MP3 player written in a portable language like Java? These guys certainly think so.
Version 1.0 was shipping in November, 2004.
DCL user comment: Cool.
Created: 11/01/2004
by Mark Nelson
GRZipII is a high-performance file compressor based on the Burrows-Wheeler Transform, Schindler Transform, Move-To-Front and Weighted Frequency Counting. It uses the Block-Sorting Lossless Data Compression Algorithm, which has received considerable attention over recent years for both its simplicity and effectiveness. This implementation achieves a compression rate of 2.234 bps on the Calgary Corpus (14 files) without preprocessing filters. There are Windows, Linux, and DOS executables along with the sources.
Created: 23/05/2005
by Ilya Grebnov
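Of the stages named in the GRZipII entry, Move-To-Front is the simplest to show. The following is a minimal textbook sketch, not GRZipII's own code; the function names are illustrative.

```python
def mtf_encode(data: bytes) -> list[int]:
    """Replace each byte by its current position in a recency list,
    then move that byte to the front of the list."""
    table = list(range(256))
    ranks = []
    for b in data:
        i = table.index(b)
        ranks.append(i)
        table.insert(0, table.pop(i))
    return ranks


def mtf_decode(ranks: list[int]) -> bytes:
    """Invert mtf_encode by replaying the same list updates."""
    table = list(range(256))
    out = bytearray()
    for i in ranks:
        b = table.pop(i)
        out.append(b)
        table.insert(0, b)
    return bytes(out)


if __name__ == "__main__":
    ranks = mtf_encode(b"aaab")
    print(ranks)
    print(mtf_decode(ranks))
```

Runs of identical bytes, which block sorting tends to produce, become runs of zeros that a later entropy coder can represent cheaply.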
This is a lossless audio compression format that has support for WinAmp and Windows Media Player. Retain perfect fidelity for your music recordings, at the cost of additional disk space.
Version 4.01 is shipping as of January, 2006.
Created: 02/05/2004
by Mark Nelson
This directory contains source and executable for Charles Bloom's PPMZ encoder, as well as a paper on PPMZ and some benchmark results. There are also links to a few other pages containing PPM information.
Charles Bloom has now released the source code to PPMZ2. He says it is both cleaner and faster than the original PPMZ code.
Created: 14/04/2002
by Mark Nelson
Despite the protestations, you can use LAME to create MP3 streams. It does it with the magic of the ISO demo code. LAME isn't lame, lots of people seem to like it.
Created: 15/09/2002
by Mark Nelson
A command line & Windows Explorer integrated archiving program for manipulating zip files and some other formats. 7-zip features great compression ratios, support for quite a few other archive formats, and is free.
DataCompression.info user Gregg A. had this to say: 7-Zip is a great archiver that works with many of the popular archive formats, as well as a new one of its own. This is my favorite compression tool now because it is so universal and it's open-source.
Version 4.18 shipped in April, 2005.
Created: 10/12/2003
by Mark Nelson
This open source project aims to create a free H.323 stack. The project was started as a reaction to the high cost of commercial implementations of audio and video compression code implementing the various components of H.323. Roger H. adds: There are now several useful applications which use the library, including OpenMCU (a reliable multi-person conference server) and GnomeMeeting (a GTK/Gnome GUI client for Linux/BSD Unix).
As of Jan 2006, the G.711 and GSM audio are supported in software. The video H.261 codec is fully supported.
Created: 14/03/2004
by Mark Nelson
The MPEG-4 draft standard contains reference software from many sources for encoding and decoding audio and video.
Created: 25/03/2001
by Mark Nelson
Andrew Belov, from ARJ Software Russia, has announced the release of ARJ for Linux and FreeBSD. Links on this site to the OS/2 version as well, if you're in the market for that.
Created: 05/10/2001
by Mark Nelson
x264 is a free library for encoding H264/AVC video streams. It is released under the terms of the GPL license. (x264 is still in early development stage)
Created: 15/05/2005
by Dmitriy Vatolin
AudioCoding.com's goal is to provide the community with free MPEG-4 audio codecs. Currently implemented are MPEG-2 and MPEG-4 AAC. The supported AAC profiles are HE, Main, LC, LTP and LD. It also supports all these profiles in their ER (Error Resilient) equivalent. Latest addition in the 2.0 version of the FAAD2 decoder library is the ability to decode HE AAC (High Efficiency) and PS AAC (Parametric Stereo) files.
Created: 22/05/2004
by Mark Nelson
An article that explains arithmetic coding, plus a sample program that implements a limited sort of PPM.
Christable C. had these kind words: I have read some other articles, but not clearly known. When reading this article, I find that Arithmetic Coding is easy to know !
Created: 07/11/1999
by Mark Nelson
The range encoder is a fast multisymbol entropy coder (similar to arithmetic coding) with GNU general public license (other licenses on request). Its compression is within 0.01% of arithmetic coding. It is based on an article dated 1979, so it is believed to be patent free. This page includes a PS format paper by G.N.N Martin describing the range encoder.
Created: 28/10/1999
by Mark Nelson
Alistair Moffat has put together all the links to his source code and articles on Arithmetic Coding in one tidy place.
Created: 19/11/1999
by Mark Nelson
A back-end implementation of arithmetic coding for JPEG as defined in the standard. It is distributed as an add-on that can be used with the Independent JPEG Group's library. The work of Guido Vollbeding.
Created: 09/03/2001
by Mark Nelson
Version 1.1 of the lossless data compression toolkit by Nico deVries. The C sources in this toolkit include an LZW compressor, AR002 archiver, a PPM like compressor using arithmetic compression, Huffman compressor, splay tree program, and LZRW1. Quite a variety.
Created: 14/11/1999
by Mark Nelson
This article describes a relatively painless way to construct suffix trees. Once you have a suffix tree constructed, it is extremely easy to search for the longest match of a given string. This makes the suffix tree a nice data structure to use in macro replacement forms of data compression.
Created: 01/12/1999
by Mark Nelson
This page has links to the source code for a family of compressors written by Charles Bloom. This includes the LZP family, an LZW example, LZRW, and LZCB.
Created: 14/11/1999
by Mark Nelson
Tom St Denis posted an article on comp.compression announcing free source code for streaming to and from files using the LZO engine. This link takes you to stream.c, modify it slightly to get stream.h, the corresponding header file.
Created: 22/09/2001
by Mark Nelson
Will McKee wrote some Huffman code in C++. Take a look.
Update: Will reports that he has improved the documentation in this package, as well as adding a new function.
Created: 09/12/2002
by Mark Nelson
A group of statistical coders from Charles Bloom. This includes several different entropy encoders, including Huffman, Adaptive Huffman, Shannon-Fano, CACM Arithmetic coding, and a Skew Coder.
Created: 06/11/1999
by Mark Nelson
This page contains a paper "Improved Huffman coding using recursive splitting" that describes a program that attempts to improve on Huffman compression by manipulation of the data stream.
Created: 09/12/1999
by Mark Nelson
Software implementing a complete DMC codec, plus code for a couple of different arithmetic encoders, and a linear time Huffman tree builder.
This program implements Dynamic Markov Compression (DMC) as described in "Data Compression using Dynamic Markov Modelling", by Gordon Cormack and Nigel Horspool, in Computer Journal 30:6 (December 1987). The Guazzo arithmetic coder is used here.
Created: 08/11/1999
by Mark Nelson
Xiaolin Wu's static Huffman coding version of this program. Free of charge for research and non-commercial use. A total of 1.2 MBytes in a dozen or so files.
Created: 13/11/1999
by Mark Nelson
bwtzip is an ongoing project, distributed under the GNU General Public License, to implement a Burrows-Wheeler compressor in standard, portable C++. It is research-grade in that it is highly modularized and abstracted, so that it is simple to swap out parts of the compressor without affecting anything else. This makes it easy to experiment with different algorithms at different stages of compression.
Looks like Steven T. Lavavej released a new version of bwtzip in early February, 2003. A wide variety of improvements, most of them in implementation - not visible to the end user. A description of recent changes is found here
Created: 11/12/2002
by Mark Nelson
This site discusses a characteristic of some compression algorithms that the author refers to as One to One (bijective) compression. In a nutshell, this property means that for any file X, F( F'( X ) ) == X. (F is either the compressor or decompressor, and F' is its opposite number.) This is definitely not the case for most conventional compression algorithms.
This page has some Huffman compression code that has been adapted to implement a bijective property.
Created: 17/12/1999
by Mark Nelson
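The one-to-one property described above can be checked against a conventional compressor. The sketch below uses Python's standard `zlib` module as an example of a non-bijective codec; the helper name `decompress_then_compress_restores` is made up for illustration. It shows that an arbitrary file X is usually not even a valid compressed stream, so compressing the decompressed X cannot give X back.

```python
import zlib


def decompress_then_compress_restores(x: bytes) -> bool:
    """Return True if compress(decompress(x)) == x for zlib.
    Arbitrary input is usually rejected by the decompressor."""
    try:
        return zlib.compress(zlib.decompress(x)) == x
    except zlib.error:
        # x is not a valid zlib stream, so the property fails for this x.
        return False


if __name__ == "__main__":
    data = b"hello, hello, hello"
    # The usual direction always holds: decompress(compress(X)) == X.
    print(zlib.decompress(zlib.compress(data)) == data)
    # The reverse direction fails for an arbitrary file.
    print(decompress_then_compress_restores(b"not a zlib stream"))
```

A bijective compressor, by contrast, accepts every possible file as valid compressed input.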
It's a page on Huffman and Shannon-Fano methods on the Russian compression.ru site. It contains a number of papers and sources, and the major part is in English. It's unlikely that you will have any problems with downloading, since the titles are in English.
Created: 21/03/2005
by Maxim Smirnov
This is a fairly small C program that was developed on the Amiga.
Note: I'm not sure why, but this page gets a very high number of ratings, nearly all very favorable, although Kate W. did claim: Parts missing from the source code, can't build.
Created: 16/12/1999
by Mark Nelson
A dissertation by Keith Howell which evaluates the suitability of Fractal Compression for spacecraft images. Keith says he is willing to supply source code upon request.
Created: 29/07/2004
by Mark Nelson
This program tries to unpack the given file by applying several algorithms byte by byte. The result is a set of files with the unpacked data. Many of the produced files are not correct. However, some of them may contain correctly unpacked data. Correctly unpacked files mostly have significant sizes, which distinguishes them from dust.
Created: 28/07/2004
by Mark Nelson
The Ch zlib Package is a Ch interface to zlib. Ch is a C/C++ interpreter, freely available from http://www.softintegration.com. The Ch zlib Package allows zlib applications with compression and decompression functions to run in Ch across platforms without compilation. The Ch zlib Package includes the source code for building the binding to zlib.
Created: 28/07/2004
by Mark Nelson
A compressor built with the world-beating PAQAR 3.0 compressor. axPAQ wraps a GUI around the engine, and includes complete source.
Created: 16/07/2004
by Mark Nelson
Reputedly an excellent set of resources for using libpng - but don't take my word for it - my Japanese is non-existent.
Created: 10/07/2004
by Mark Nelson
Use the zipdiff tool when you need to compare the contents of two zip files. It is equally suited for comparing jar files, EAR files, WAR files or RAR files. Run it standalone or as an Ant task. The tool supports three output formats: plain text, XML, and HTML. zipdiff is written in Java.
Release 0.4 is shipping as of June, 2004.
Created: 04/07/2004
by Mark Nelson
An article on the CodeProject detailing a user's troubles with GDI+. In particular, he found that he was unable to load JPG or PNG files stored as resources with GDI+. This article presents a way to make it happen.
This article was updated June 17, 2004.
Created: 27/06/2004
by Mark Nelson
This project is an attempt to port the free Speex voice codec to a pure Java implementation.
Version 0.9.4 is shipping as of June, 2004.
Created: 27/06/2004
by Mark Nelson
We have developed the CMJBitset class as a plug-in replacement for bitset. The CMJBitset class, depending on compilation options, may take as little as 7 bytes to represent a bitset of any size, assuming all the bits are set or reset. In comparison, a 1024-bit bitset will take 128 bytes. In essence, the CMJBitset operates by run length encoding a bitset if the bitset is almost all set or almost all reset, but otherwise uses the STL bitset class.
Created: 27/06/2004
by Mark Nelson
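To make the run length idea concrete, here is a minimal sketch of run length encoding applied to a sequence of bits. It does not reproduce the CMJBitset class or its storage layout; the function names and the (bit, count) representation are assumptions chosen for clarity.

```python
from itertools import groupby


def rle_bits(bits: list[bool]) -> list[tuple[bool, int]]:
    """Collapse consecutive equal bits into (bit value, run length) pairs."""
    return [(bit, sum(1 for _ in run)) for bit, run in groupby(bits)]


def expand_runs(runs: list[tuple[bool, int]]) -> list[bool]:
    """Rebuild the original bit sequence from its runs."""
    return [bit for bit, count in runs for _ in range(count)]


if __name__ == "__main__":
    all_set = [True] * 1024
    print(rle_bits(all_set))            # a single run covers the whole bitset
    mostly_set = [True] * 1000 + [False] + [True] * 23
    print(rle_bits(mostly_set))         # three runs
```

A bitset that is uniformly or almost uniformly set needs only a handful of runs, while a bitset with alternating bits needs one run per bit, which is why a fallback to a plain bitset makes sense in that case.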
This source code shows how to add zip/unzip functionality to your programs. Claim to fame: simplicity and clean packaging.
Created: 24/06/2004
by Mark Nelson
Markus Kuhn has this to say about it: I wrote the freely available JBIG-KIT 1.2 portable ANSI C library, which implements a highly effective lossless bi-level image compression algorithm based on context sensitive arithmetic coding. The JBIG algorithm (specified in ITU-T Recommendation T.82), which is implemented in this library, is especially suitable for compressing scanned documents and fax pages. You can also download the (unfortunately German) project report (Studienarbeit) that I wrote about JBIG-KIT (abstract). Release 1.6 available as of 6/2004.
Created: 20/06/2004
by Mark Nelson
jFLAC is a port of the Free Lossless Audio Codec (FLAC) library to Java. This library allows java developers to experiment and write programs that use the FLAC algorithms.
Version 1.2 is shipping as of July, 2005.
Created: 20/06/2004
by Mark NelsonMore...
An open source project that performs PPM compression on XML files. The advance knowledge of XML format helps give this algorithm somewhat better compression ratios on XML data than universal compressors.
Version 0.98.1 was shipping as of June, 2004.
Created: 13/06/2004
by Mark NelsonMore...
This article shows how to decode images with the IImgCtx interface provided by IE. In addition to the image types decoded with the IPicture interface, IImgCtx also decodes TIFF and PNG.
Created: 06/06/2004
by Mark NelsonMore...
An updated and translated version of our German paper "Proseminar Datenkompression - Arithmetische Kodierung" from 2001. To the best of our knowledge, it is the first comprehensive paper that describes the whole way from the basic principles of AC up to a simple implementation, fully documented with C++ source code.
Created: 06/06/2004
by Mark NelsonMore...
This version of file encoder and decoder program is based on the Huffman coding method. It explicitly demonstrates the details of the files during the encoding and decoding. The algorithm is encapsulated in a class En_Decode in standard C++.
Created: 06/06/2004
by Mark NelsonMore...
PDF documents are commonly used and their content is usually compressed. This article shows a simple C code that can be used to extract plain text from the PDF file.
Created: 31/05/2004
by Mark NelsonMore...
This project is an attempt to hoist zlib out of the C world and into pure Java land. This allows Java developers to take advantage of a few zlib features that aren't available in the standard JDK packages. LGPL license.
Version 1.0.5 of Jzlib was shipping as of May, 2004.
Created: 22/05/2004
by Mark NelsonMore...
Libarchive is a programming library that can create and read several different streaming archive formats, including most popular tar variants and the POSIX cpio format. The bsdtar program is an implementation of tar(1) that is built on top of libarchive. It started as a test harness, but is quickly moving toward becoming a candidate system tar for FreeBSD.
Created: 22/05/2004
by Mark NelsonMore...
Advanced Image Coding (AIC) is an experimental still image compression system that combines algorithms from the H.264 and JPEG standards. More specifically, it combines intra frame block prediction from H.264 with a JPEG-style discrete cosine transform, followed by context adaptive binary arithmetic coding as used in H.264. The result is a compression scheme that performs much better than JPEG and close to JPEG-2000.
Created: 15/05/2004
by Mark NelsonMore...
WavPack allows you to losslessly compress (and restore) both 16 and 24-bit audio files in the .WAV format. Unlike "lossy" compression schemes (like MP3) that discard information, WavPack converts the audio data into a more compact form so that the restored files are digitally identical to the original source. It's somewhat like the file compression portion of WinZIP except that it's optimized for audio data. Like other lossless compression schemes the data reduction varies with the source, but it is generally between 25% and 50% for typical popular music and somewhat better than that for classical music and other sources with greater dynamic range.
Created: 15/05/2004
by Mark NelsonMore...
The Basic Compression Library is a set of open source implementations of several well known lossless compression algorithms, including RLE, Huffman, LZ77 and Rice, written in portable ANSI C. The library has been created to be flexible and easy to understand. It is well documented, and easy to use and adapt to specific situations, such as custom compression methods and embedded systems.
Satisfied user Todd W said: I needed a simple set of compression routines for use in an embedded system. I must be able to store a fair amount of information in a small EEPROM as a generic database. The Huffman coder works very well in the application and has met my needs exactly! Very nice!
Created: 14/05/2004
by Mark NelsonMore...
Microsoft's implementation of J# includes the standard Java zip classes, and CodeProject contributor Valeri has figured out how to use them from a C# program.
Created: 10/05/2004
by Mark NelsonMore...
Tom has posted his source code for embedded BWT compression. Basically, he's trying to pull it off with low amounts of RAM.
Created: 04/05/2004
by Mark NelsonMore...
Netpbm is a C package of routines for conversion, rendering, and manipulation of graphics files. The program understands a wide array of image formats, and best of all, is completely free.
The 10.22 release shipped in May of 2004.
Created: 02/05/2004
by Mark NelsonMore...
GPAC is an implementation of the MPEG-4 Systems standard (ISO/IEC 14496-1) developed from scratch in ANSI C.
The main development goal is to provide a clean (a.k.a. readable by as many people as possible), small and flexible alternative to the MPEG-4 Systems reference software. The MPEG-4 Reference software is indeed a very large piece of software, designed to verify the standard rather than provide a small, production-stable software.
GPAC is written in ANSI C for portability reasons (embedded platforms and DSPs) with a simple goal: keep the memory footprint as low as possible.
The project will at term provide a 2D/3D core player, complete MPEG-4 Systems encoders and publishing tools for content distribution.
Version 0.1.4 is shipping as of May, 2004.
Created: 02/05/2004
by Mark NelsonMore...
The third in Michael's collection of pages explaining lossless compression algorithms. A nice tutorial accompanied by ANSI C source.
Created: 02/05/2004
by Mark NelsonMore...
Aliaksei Sanko makes a few improvements to the code in the original 1987 CACM article. His sample includes a templated producer and consumer.
Created: 01/05/2004
by Mark NelsonMore...
The Speex project aims to build a patent-free, Open Source/Free Software voice codec. Unlike other codecs like MP3 and Ogg Vorbis, Speex is designed to compress voice at low bitrates in the 8-32 kbps/channel range. Possible applications include VoIP, internet audio streaming, archiving of speech data (e.g. voice mail), and audio books. In some sense, it is meant to be complementary to the Ogg Vorbis codec.
Speex 1.1.5 was released in April, 2004.
Created: 25/04/2004
by Mark NelsonMore...
C# implementation of adaptive Huffman coding. Implements both the FGK and Vitter algorithm variations. Compression provided through two public classes, AdaptiveHuffmanProvider and AdaptiveHuffmanStream. Good compression ratios for text-based data.
Created: 25/04/2004
by Mark NelsonMore...
Archive Explorer is a pure VB program that can show the contents of different archives and extract some of them. Formats whose contents can be shown: ZIP, GZ, TGZ, TAR, ARC, ARJ, RAR, CAB, LZH, LHA. Formats that can be extracted: ZIP, GZ, TGZ, TAR.
Created: 25/04/2004
by Mark NelsonMore...
In this article I describe a translation of most of the WMF SDK interfaces, data structures, constants, functions into C#. NOTE: Digital Rights Management (DRM) support is not included in this translation.
Created: 25/04/2004
by Mark NelsonMore...
This is an open-source package of DjVu programs and libraries, including encoders, viewers, browser plugins, and various utilities. The DjVu standard for document encoding was once an AT&T research project, but now has been commercialized by LizardTech. This project is an attempt to popularize and evangelize the DjVu technology, with at least the benign awareness of LizardTech.
Release 3.5.15 shipped in July of 2005.
Created: 25/04/2004
by Mark NelsonMore...
Arkadi Kagan has created a C++ project that implements a batch of our favorite lossless algorithms, including LZ77, LZ78, LZW, RLE, along with arithmetic and Huffman coding.
Version 1.1 shipped in April, 2004.
Created: 19/04/2004
by Mark NelsonMore...
This code shows how to use the freeware InfoZip Zip32.DLL and Unzip32.DLL files from the http://www.cdrom.com/pub/infozip/ website. The InfoZip DLL's are open-source DLL's that are available for programmers to utilise free of charge. They are standard C DLLs and were very tricky/impossible to interface with VB until VB5/6 offered the 'addressof' operator.
Created: 20/03/2004
by Mark NelsonMore...
This page has the official releases of ARJ, UNARJ, JAR, and so on. This includes the free source to UNARJ and the ARJ32 command line archiver.
ARJ32 version 3.11 was shipping as of 12/2003.
ARJ version 2.82 was shipping as of 12/2003.
Created: 15/12/2003
by Mark NelsonMore...
Star Encoding performs some preprocessing on text files, enabling standard compressors to do somewhat better on the files. This article explains the transform and provides some sample code.
Created: 06/12/2003
by Mark NelsonMore...
Michael Dipperstein describes his personal quest for understanding and implementation of LZSS coding. Full source included.
Created: 03/12/2003
by Mark NelsonMore...
libhuffman is a Huffman encoder/decoder library and a command line interface to the library. The encoder is a 2 pass encoder. The first pass generates a huffman tree and the second pass encodes the file. The decoder is one pass and uses a huffman code table at the beginning of the compressed file to decode the file.
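The two pass scheme described above can be sketched in Python as follows. This is a minimal illustration of the general technique, not libhuffman's code or file format: the function names are hypothetical, the output is a string of '0' and '1' characters rather than packed bytes, and the code table is passed to the decoder directly instead of being written to a file header.

```python
import heapq
from collections import Counter


def build_code_table(data: bytes) -> dict:
    """Pass 1: count symbols and build a Huffman code table."""
    counts = Counter(data)
    if not counts:
        return {}
    # Heap entries are (weight, smallest symbol in subtree, partial codes).
    # The smallest symbol is a tiebreaker, unique because subtrees are disjoint.
    heap = [(w, sym, {sym: ""}) for sym, w in counts.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        w1, t1, left = heapq.heappop(heap)
        w2, t2, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, min(t1, t2), merged))
    return heap[0][2]


def encode(data: bytes, table: dict) -> str:
    """Pass 2: replace every byte with its code."""
    return "".join(table[b] for b in data)


def decode(bits: str, table: dict) -> bytes:
    """Single pass decode driven by the code table."""
    inverse = {code: sym for sym, code in table.items()}
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return bytes(out)
```

Because Huffman codes are prefix free, the decoder can emit a symbol as soon as the bits read so far match a table entry.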
Beta 3 shipped in October, 2003.
Created: 27/10/2003
by Mark NelsonMore...
The latest in the series of multi-model compressors from Matt Mahoney. This improves on PAQ3n's remarkable Calgary corpus performance by an additional 12K, at some expense in speed. Takes a whopping 84MB at runtime!
Created: 26/10/2003
by Mark NelsonMore...
Michael Dipperstein describes his personal quest for understanding and implementation of Huffman coding. Full source included.
The page was updated with new source December, 2002.
Created: 18/10/2003
by Mark NelsonMore...
Matt Mahoney says that with recent improvements by Serge Osnach, PAQ3N does better on the Calgary Corpus than any other open source compressor.
Created: 18/10/2003
by Mark NelsonMore...
The source code to accompany Eduardo Enrique Gonzalez Rodriguez's article on RF coding, his proposed new entropy encoder. (Note that this archive contains his paper as well, so you don't need to download both.)
Created: 05/10/2003
by Mark NelsonMore...
The well-respected ARJ archiver is now available as Open Source software. This portable version targets DOS, OS/2, Linux, and FreeBSD. See the project home page for binary versions of previous commercial/shareware versions. (These versions were free for personal use, but not commercial.)
ARJ 2.78/3.10 build 17 is shipping in September, 2003.
Created: 14/09/2003
by Mark NelsonMore...
This code project piece uses the octree algorithm to reduce the number of colors in a bitmap. Like most of the code project stuff, it is implemented as a class compatible with Visual C++ and MFC.
Last update of this article was September, 2003.
Created: 04/09/2003
by Mark NelsonMore...
A CodeProject article which includes working code. Compression here consists of removal of comments, extra line feeds, white space, etc.
Created: 04/08/2003
by Mark NelsonMore...
This article is an addition to an earlier CodeProject posting that was designed to make your life easier when working with Zip files. The previous article had support for extraction and navigation of Zip files. This article adds support for creation of Zip files.
Created: 21/07/2003
by Mark NelsonMore...
An article that gets into the details you need to know about in order to use the MMX instruction set found on Intel processors. The author uses image processing as a demo app.
Created: 15/07/2003
by Mark NelsonMore...
This CodeProject article gives you support for both zipping and unzipping files from archives without requiring a lib or dll. The code is absolutely free.
Update posted June 20, 2003.
Created: 21/06/2003
by Mark NelsonMore...
This CodeProject article describes the development of a PATRICIA trie in the .NET framework. The actual code is written in C#, but naturally, it can be used with any of the .NET languages.
Created: 30/05/2003
by Mark NelsonMore...
This MSDN article describes how to use the java.util.zip package with a C# program. Includes a sample GUI Zip application.
Created: 17/05/2003
by Mark NelsonMore...
This is the Assembler code used to compress or decompress some data by modified RLE algorithm. The ZIP file contains ASM/OBJ files + a PAS unit which represents packing functions in easy to use manner (can compress files, streams etc) and can be used in Delphi applications.
Created: 06/05/2003
by Mark NelsonMore...
The guys at iMatix had the idea that they could write a super-library of C functions that would be so useful it would rule the world. As far as I can tell, it didn't catch on. However, there are a few compression functions here that some folks might find interesting.
Created: 04/05/2003
by Mark NelsonMore...
This page has source code for a couple of different VQ compression programs from the University of Washington's EE Data Compression Labs.
Created: 04/05/2003
by Mark NelsonMore...
John Kieffer from the University of Minnesota has posted a nice library of Matlab code to be used for data compression.
Created: 01/05/2003
by Mark NelsonMore...
ABC is a free data compression program based on the Burrows-Wheeler transformation. The source code is free for academic, research and educational use as depicted in the Abel Public License (APL). The program is developed in DELPHI as a command line program just like GZIP.
Update: Jurgen has released the source code for ABC at long last! The Delphi source is available for download from the web site and can be used under his own APL.
Created: 25/04/2003
by Mark NelsonMore...
This CodeProject article presents an archiver that moves files in and out of an archive, and will extract from resources as well. It doesn't support the standard Zip format, and in a blinding flash of frankness, the author says "The code is crap but it works and I couldn't find it done anywhere else."
Created: 21/04/2003
by Mark NelsonMore...
David posted this C code with the following comment: I have no idea if this is useful to folks, but since I had to beat my head against the silly Microsoft APIs for quite some time to get a useable result, I thought it might be helpful to post this little snippet showing you how to find an ACM decoder for MP3s, initialize it, and use it to decode streaming MP3 buffers.
Reader Simon commented: Need to do some type casting when compiling with Visual C++ 6.0 as well as linking with msacm32.lib.
Created: 22/03/2003
by Mark NelsonMore...
Aleks Jakulin has created a tutorial that will walk you through the necessary steps to compress images using a wavelet transform. The steps in the process are illustrated using Mathematica code. This page goes beyond a basic tutorial in that it shows a proposal to improve image rendering by adding noise.
Created: 21/03/2003
by Mark NelsonMore...
Jurgen Abel has done an enormous amount of research on the Burrows-Wheeler Transform, and has published the results on his web site. On this page you will find:
A summary of this compression technique.
Links to over 70 online papers.
Links to at least that many people involved in BWT research or development.
Extensive links to BWT source code.
This web page may now be the definitive source of information for this field.
Created: 18/03/2003
by Mark NelsonMore...
Somebody posting as thewhizkid complained to the comp.compression newsgroup that he just couldn't figure out how to do a CRC calculation. He got a couple of good pointers here, including one to the zlib code, and another including a bit of Java that could do the job. I'll add a pointer to my ancient-but-still-cogent article from DDJ, File Verification Using CRC, to the mix. Between the three choices, I hope the poster was able to get a handle on the mysterious CRC.
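For anyone in the same spot as the poster, a CRC calculation can be sketched in a few lines of Python. This is the common bit-at-a-time form of CRC-32 (the reflected polynomial 0xEDB88320 used by zlib and Zip), written for clarity rather than speed; it is not taken from any of the three sources mentioned above, and production code normally uses a lookup table instead of the inner loop.

```python
def crc32(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-32 using the reflected polynomial 0xEDB88320."""
    crc ^= 0xFFFFFFFF                      # initial inversion
    for byte in data:
        crc ^= byte
        for _ in range(8):                 # process one bit at a time
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF                # final inversion
```

The result should agree with zlib.crc32 from the Python standard library, which is the faster choice in practice.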
Created: 12/03/2003
by Mark NelsonMore...
A recent post to comp.compression had a pointer to this page, identifying it as a source of H.263 software. Sure enough, if you scroll down to the bottom of the page you'll find links to an H.263 decoder, plus a Windows H.263 player. Not to mention some Wavelet code from JP, and a few other interesting links.
Created: 25/02/2003
by Mark NelsonMore...
This project provides an implementation of the java.util.zip classes. The code is pure Java (no native code is used), and aims to be compatible with existing java.util.zip implementations.
Version 0.06 of jazzlib shipped in January of 2003. This product is still under development, but I don't have a clear roadmap and have no idea when to expect a 1.0 release.
Created: 15/02/2003
by Mark NelsonMore...
David describes his work creating a bijective LZW compressor. (See this and other pages of David for details on what he means by bijective.) The page includes C++ source.
Created: 27/12/2002
by Mark NelsonMore...
Bob Carpenter has created a nice Java package that implements a PPM/arithmetic coding compression system. This page includes links to the source code, javadocs, and a fair amount of tutorial material. Very complete!
Created: 11/12/2002
by Mark NelsonMore...
Will McKee has released this as freeware - includes complete source to a string substitution compressor. From the description it sounds as though it's a variant of LZSS, but I'll defer to anyone willing to do a real analysis.
Created: 09/12/2002
by Mark NelsonMore...
David Scott presents an implementation of Vitter's dynamic Huffman compressor, adapted so that it is bijective. Don't know what bijective means? Check out David's home page for more details.
Created: 09/12/2002
by Mark NelsonMore...
This is a preliminary shot at creating an open source BWT compression engine. Things look very preliminary at this point with just a couple of files available for download and not much message traffic.
Created: 09/12/2002
by Mark NelsonMore...
This dynamic Huffman coder from Karl Malbrain is written in C and includes weight scaling. It is modeled on the Vitter algorithm.
A DataCompression.info user notes that this site has been undergoing continual changes, and perhaps would benefit from some sort of "last modified on" field.
Created: 20/11/2002
by Mark NelsonMore...
SEQUITUR is a method for inferring compositional hierarchies from strings. It detects repetition and factors it out of the string by forming rules in a grammar. The rules can be composed of non-terminals, giving rise to a hierarchy. It is useful for recognizing lexical structure in strings, and excels at very long sequences.
Created: 09/11/2002
by Mark NelsonMore...
The abstract for a paper on calculation of Huffman codes. The paper isn't here, but the source code is. Alistair says that if you sort your array of counts, you can create the Canonical Huffman tree in memory.
Created: 31/10/2002
by Mark NelsonMore...
The FFmpeg project consists of two main parts: FFmpeg, which encodes and decodes the multimedia streams, and FFserver, which provides streams via HTTP for various multimedia clients. FFmpeg is completely portable since it does not rely on proprietary DLLs. The library libavcodec, which contains all the FFmpeg codecs, can be reused in any program licensed under the GNU General Public License.
Version 0.4.8 is shipping in September, 2003. Tons of new stuff in 0.4.7, a bit more in 0.4.8.
Created: 30/10/2002
by Mark NelsonMore...
Karl has created a complete BWT package, and has posted the source on this site. He also has an adaptation of N. Jesper Larsson's Burrows-Wheeler Suffix Sorting for your perusal.
Created: 30/10/2002
by Mark NelsonMore...
Falk Hueffner created a dictionary program that was to be used for a Scrabble-type word game. The source can be found at this link under the name dawg.tar.gz.
Created: 30/10/2002
by Mark NelsonMore...
A C++ implementation of the LZSS / LZ77 algorithm. Also contains a description of the LZSS algorithm and my implementations of it as I learned more about it (hashing, lazy evaluation, etc.) All the code from my first attempt to the current version is included.
An anonymous visitor to Jonathan's page said it was "Pertinent, very useful, relevant, just what I needed."
Created: 30/10/2002
by Mark NelsonMore...
jLHA is a Java library that supports reading and writing of LHA archives. It attempts to use the same interface as the java.util.zip package. It looks like there was a burst of activity in the spring of 2002, not much activity since then.
Created: 07/10/2002
by Mark NelsonMore...
Some simple BASIC routines to compress data without extravagant data or code space. The author seems to indicate this isn't Huffman coding, but doesn't say what it is.
Created: 28/09/2002
by Mark NelsonMore...
A C implementation of Shapiro's EZW algorithm. Performance is close to or better than the reported results with the wavelet filters.
Created: 03/09/2002
by Mark NelsonMore...
Ian Kaplan's Wavelet and Signal Processing page has lots of articles and C++/Java source code implementing wavelet transform via the lifting scheme, the integer-to-integer wavelet transform and the best basis wavelet packet transform.
Created: 03/09/2002
by Mark NelsonMore...
A file archive utility written in VB. Compression and decompression routines are LZSS. Full source code included.
Created: 03/09/2002
by Mark NelsonMore...
This is an open source PDF renderer, which includes code that decompresses LZW data embedded in the PDF file. It doesn't actually do the LZW decompression itself - it converts the data to a format that can be handled by UNIX compress.
Created: 08/08/2002
by Mark NelsonMore...
An article by Stephane Rodriguez on the CodeProject web site. Stephane describes his article as Revealing XML best practices with a utility tool and a framework. Yes, that means there is some sample code here as well.
Created: 21/07/2002
by Mark NelsonMore...
Eduard has a Java implementation of a Daubechies-4 Lifting Wavelet Transform followed by an EZW encoder with an Adaptive Huffman Encoder output stage. You can download the Java source from this page and give it a go.
Created: 20/07/2002
by Mark NelsonMore...
Pegasus has a patented range coder that they license at no charge. This archive contains some C code that provides a sample implementation.
Created: 12/07/2002
by Mark NelsonMore...
The URARFileLib is a small library that allows you to read files from RAR archives created with RAR and WinRAR. Decompression and decryption with full RAR v2.0 compatibility is done directly in your application, thus there is no need for a DLL or any other external file. This file library is based on the free unRAR source code by Eugene Roshal, and designed for easy but powerful usage in demos and intros. This library is also useful if you want to port your demos since the URARFileLib supports multiple operating systems (Linux, SunOS, and Win32).
Update: As of June, 2002, the library is hosted at a new URL, has a new dual license, and improved samples for Win32, Linux, and UNIX.
Update: project is inactive and does not support RAR v3 archives.
Created: 19/06/2002
by Mark NelsonMore...
A good resource for anyone using compression or writing compression code for the Amiga platform. Includes tutorial and background information as well as links to code.
Created: 06/06/2002
by Mark NelsonMore...
Vcodex is a software collection for transforming data. Examples of data transformers include methods for compression/decompression, data differencing, encryption, etc. The source code distribution includes an implementation of Vcdiff using the framework.
Created: 02/06/2002
by Mark NelsonMore...
DevX published this nice piece describing bit oriented I/O in Java. This might not be directly related to Data Compression, but it is a requirement in nearly all data compression software. This site may require registration, I'm not sure.
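To show what bit oriented I/O involves, here is a minimal Python sketch. It is not the DevX code (which is in Java); the names BitWriter and read_bits are hypothetical, and the sketch assumes most-significant-bit-first packing with zero padding in the final byte.

```python
class BitWriter:
    """Hypothetical sketch: accumulate bits and pack them into bytes, MSB first."""

    def __init__(self):
        self.buffer = bytearray()
        self.current = 0   # bits collected for the byte in progress
        self.count = 0     # how many bits are in self.current

    def write_bit(self, bit):
        self.current = (self.current << 1) | (bit & 1)
        self.count += 1
        if self.count == 8:
            self.buffer.append(self.current)
            self.current = 0
            self.count = 0

    def write_bits(self, value, width):
        """Write the low `width` bits of value, highest bit first."""
        for shift in range(width - 1, -1, -1):
            self.write_bit((value >> shift) & 1)

    def getvalue(self):
        """Return the packed bytes, zero padding any partial final byte."""
        if self.count:
            return bytes(self.buffer) + bytes([self.current << (8 - self.count)])
        return bytes(self.buffer)


def read_bits(data, start, width):
    """Read `width` bits from data beginning at bit offset `start`."""
    value = 0
    for pos in range(start, start + width):
        byte = data[pos // 8]
        value = (value << 1) | ((byte >> (7 - pos % 8)) & 1)
    return value
```

Variable length codes such as Huffman or LZW output are written with write_bits and read back by tracking the bit offset.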
Created: 29/05/2002
by Mark NelsonMore...
The page says:
The Almacom JPEG-2000 library was written in an effort to produce the cleanest and simplest implementation possible of the JPEG-2000 standard. We have put a particular emphasis on good architecture design and code simplicity, while at the same time providing an implementation as complete and efficient as possible.
DataCompression.info user Luca M. said I was looking for a good library of wavelets. Now I've found it!
Created: 23/05/2002
by Mark NelsonMore...
From the archive: This archive contains a quick & dirty implementation of the IEEE Standard 1180-1990 accuracy test for inverse DCT. It is not guaranteed to be correct ... but if you find any bugs, please let me know (by email to tgl@cs.cmu.edu). Since the archive was created in 1993 I don't know if you'll have any luck with those bug reports!
Created: 09/05/2002
by Mark NelsonMore...
This page presents the source code from the paper of the given name. The software at this time only supports eight-bit grayscale, but is free for research purposes.
Created: 01/05/2002
by Mark NelsonMore...
A C++ open source library for accessing zip files. This is a work in development, which as of the current beta now has support for reading and writing Zip files. Distributed under the LGPL.
Created: 01/05/2002
by Mark NelsonMore...
This page gives an introduction to Arithmetic Coding and shows how to implement it using floats or integers. There is also a proof of the efficiency of the algorithms, along with visualization and Win32 binaries. This page is in English and includes links to material in both German and English.
DataCompression.info user Juergen Abel found the site quite good: Clear description, especially the explanation of the renormalisation part, full source code.
Created: 15/04/2002
by Mark NelsonMore...
This article is really about using the priority queue containers that are part of the standard C++ library. The example program implements a Huffman Encoder using the queues, showing how they can do a fairly complex piece of work without too much coding on your part.
Created: 12/04/2002
by Mark NelsonMore...
This web site keeps links to free libraries and source code. If you like this, you might want to browse around in some of their other areas as well.
DataCompression.info user Andrew S. was not too impressed with this site: I tried one half of their links and they were all dead or directed to content not related to the topic.
Created: 07/04/2002
by Mark NelsonMore...
shcodec is an order-0 32-bit canonical static Huffman codec. It encodes an alphabet of 256 symbols with minimum-redundancy or length-restricted codes (basic method: Alistair Moffat and Jyrki Katajainen, modified by Artur A. Pessoa). shcodec uses an efficient method for tree packing: on text files the packed tree size is approximately 68 bytes, on binary files this value is about 132 bytes. Memory requirements are very small: 1280 bytes for encoding and only 574 bytes for decoding! shcodec uses an extremely fast and simple SHIFT-OR method for encoding, and CANONICAL-DECODE with a cache for small codewords for decoding.
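The reason a canonical Huffman tree packs so small is that only the code lengths need to be stored; the codes themselves can be regenerated. A minimal Python sketch of that general idea follows. It is not shcodec's code, and the function name canonical_codes is hypothetical; it assumes the lengths already satisfy the Kraft inequality.

```python
def canonical_codes(lengths: dict) -> dict:
    """Assign canonical Huffman codes given a symbol -> code length map.

    Symbols are ordered by (length, symbol); each code is the previous
    code plus one, shifted left whenever the length increases.
    """
    code = 0
    prev_len = 0
    table = {}
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= (length - prev_len)
        table[sym] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return table
```

Encoder and decoder both run this on the same list of lengths, so they arrive at identical code tables without ever transmitting the codes.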
Update: Alexander has added SH-SFX to the web page - a program for creating Win32 SFXs from files compressed with shcodec.
Created: 04/04/2002
by Mark NelsonMore...
A program to compress images, plus a write up from author Doug Houghton. Doug says it's simple and fast, and does pretty well on 24 bit color images. New release from Doug as of 3/2002.
Created: 19/03/2002
by Mark NelsonMore...
Source code for a 2.4 Kbps MELP coder. Target is Sun OS4. Phil Frisbie did the detective work needed to determine that the MELP coder is now owned by ASPI, so if you want to use it, you need to talk to them about licensing. See them at www.aspi.com.
Created: 05/03/2002
by Mark NelsonMore...
A novel algorithm by Stefano Lonardi. It recursively replaces words in text with pointers, and boasts of good results. Source and papers regarding Off-Line can all be found here.
Created: 04/03/2002
by Mark NelsonMore...
Cheok Yan Cheng decided to write up a short tutorial on LZW compression. It is presented here, along with some source code.
Created: 25/02/2002
by Mark NelsonMore...
Source code and demo projects from Ciprian Miclaus. Ciprian said he created this because the only other available port for CE did not include source.
DCL reader Mike P. said Very good ... especially that I found on the same site a port of libbzip2 for WinCE. Excellent ... exactly what my project needed.
Created: 14/01/2002
by Mark NelsonMore...
A version of libbzip2 in source format for WinCE, along with demo code and project files. Ciprian Miclaus created this port along with one of zlib, and has made them available for all mankind. Thanks Ciprian!
Created: 14/01/2002
by Mark NelsonMore...
MP3' Tech calls this "the biggest MPEG audio source codes area available on the Internet." Find source for MPEG-1/2/2.5 Layer 1/2/3, MPEG-2 AAC and MPEG-4, as well as UI code.
Reader Robert S. said It seems very hard to find a description of the Layer 3 bitstream format. Finally found it here.
Created: 12/01/2002
by Mark NelsonMore...
A GPL product, with the following description from the site:
mpeg2dec is an mpeg-1 and mpeg-2 video decoder. It is purposely kept simple : it does not include features like reading files from a DVD, output picture scaling, audio decoding, synchronisation, etc... The main purpose of mpeg2dec is to have a simple test bed for libmpeg2. mpeg2dec also includes a demultiplexer for mpeg-1 and mpeg-2 program streams, and output routines for a variety of different interfaces.
Created: 01/01/2002
by Mark NelsonMore...
This source code appears to be Samuel Smith's original Unzip code. This eventually gave birth to the Info-ZIP project and its UnZip program. Sam was the pioneer, and I believe he did most or all of it on his own.
Created: 24/12/2001
by Mark NelsonMore...
Jim Mischel's article from Visual Developer discussing the use of Microsoft's CABINET.DLL to work with CAB files.
Created: 05/12/2001
by Mark NelsonMore...
The Boost project is designed to put very high-quality peer-reviewed libraries in the hands of you and me. CRC checking has nothing to do with compression per se, but it is frequently used when archiving to validate results.
Created: 15/11/2001
by Mark NelsonMore...
This paper from Jurgen Abel and Bill Teahan presents several preprocessing algorithms for textual data, which work with BWT, PPM and LZ based compression schemes. The algorithms need no external dictionary and are language independent. The average compression gain is in the range of 3 to 5 percent for the text files of the Calgary Corpus and between 2 to 9 percent for the text files of the large Canterbury Corpus.
Created: 14/11/2001
by Mark NelsonMore...
A paper by Alistair Moffat describing an improvement on Peter Fenwick's method for maintaining cumulative probability tables. The pointer on this page leads to some source implementing said table.
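For background, the structure being improved upon can be sketched in Python. This shows Fenwick's original binary indexed tree, not Moffat's improved version and not the source linked from this page; the class and method names are hypothetical.

```python
class FenwickTree:
    """Hypothetical sketch of a cumulative frequency table (Fenwick tree)."""

    def __init__(self, n):
        # tree[0] is unused; indices 1..n hold partial sums.
        self.tree = [0] * (n + 1)

    def add(self, symbol, delta):
        """Increase the count of `symbol` (0-based) by delta."""
        i = symbol + 1
        while i < len(self.tree):
            self.tree[i] += delta
            i += i & -i          # move to the next node covering this index

    def cumulative(self, symbol):
        """Return the total count of all symbols below `symbol`."""
        total = 0
        i = symbol
        while i > 0:
            total += self.tree[i]
            i -= i & -i          # strip the lowest set bit
        return total
```

Both operations take time logarithmic in the alphabet size, which is what makes the structure attractive for adaptive arithmetic coding, where every coded symbol needs a cumulative count and then an update.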
Created: 12/11/2001
by Mark NelsonMore...
Decoding GIF files and displaying them without help isn't particularly easy. Add animation to the task and you're looking at a ton of work. Fortunately, this MFC-compatible set of classes on the CodeProject web site does the heavy lifting for you.
Created: 08/10/2001
by Mark NelsonMore...
Straightforward code compatible with Visual C++ and MFC for decoding and displaying JPEG, GIF, BMP, and a few other types of files.
Created: 08/10/2001
by Mark NelsonMore...
These folks are hard at work on an open source video codec. VP4 appears to be a commercial effort, VP3 seems to be free.
One DataCompression.info user had this to say: Impressive effort by both ON2 and Xiph. Something has to replace MPEG with its rapidly deteriorating technology and efficiency. This one has the potential but now needs the acceptance.
Created: 14/09/2001
by Mark NelsonMore...
Source to a zip decompressor that runs on the Commodore 64! Better yet, this decompressor has source for more than the typical deflate method - it includes the old-school PKWare algorithms: Store, Reduce, Shrink, Implode.
Created: 16/08/2001
by Mark NelsonMore...
Some software from the folks at CMU. Software appears to be free and unfettered, includes a reasonable amount of documentation, and is purported to run on a variety of platforms, including Windows.
Created: 11/08/2001
by Mark Nelson
A port of Julian Seward's bzip2 program to the Mac. It's free and full source is included if you are the adventurous type.
Created: 25/07/2001
by Mark Nelson
OpenUp allows you to double-click on compressed and archived files and have them open in the Workspace without having to resort to the command-line tools gnutar and gunzip. It also supports opening many other formats, both common and uncommon. Appears to be free, with source available.
Created: 25/07/2001
by Mark Nelson
An ftp site with various speech codecs, including G.722, GSM, G.711, G.723, G.721, CELP, and LPC. Licensing and ownership of the C source varies.
Created: 06/07/2001
by Mark Nelson
This issue of the Data Compression Newsletter from Dr. Dobb's has some sample code showing how one might use Intel's JPEG library to display JPEG files under Win32.
Created: 05/07/2001
by Mark Nelson
The wavelet warehouse has a large repository of wavelet filters. These are coefficient files only, suitable for use with MATLAB or other software. The page also has links to a couple of PDF-format papers, plus references to a few more not available online.
Created: 04/07/2001
by Mark Nelson
The term bijective as used by Scott means that for any given file X you are guaranteed that A( B( X ) ) == B( A( X ) ) == X, where A and B are a pair of bijectively matched programs. In this particular case, A and B are a compressor and decompressor that use arithmetic coding. Includes C++ source.
Created: 01/05/2001
by Mark Nelson
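The bijective property described in the entry above can be checked mechanically on a sample of files. The sketch below is only an illustration: is_bijective_pair, rotate and unrotate are hypothetical names, the byte rotation is a toy bijection rather than a compressor, and zlib is used purely as a contrast because most byte strings are not valid zlib streams.

```python
import zlib
from itertools import product


def is_bijective_pair(a, b, samples):
    """Return True if a(b(x)) == b(a(x)) == x for every sample x."""
    for x in samples:
        try:
            if a(b(x)) != x or b(a(x)) != x:
                return False
        except Exception:
            # One of the programs rejected its input, so the pair
            # cannot be bijective over all files.
            return False
    return True


def rotate(data):
    """Toy bijection: add 1 to every byte, wrapping at 256."""
    return bytes((v + 1) % 256 for v in data)


def unrotate(data):
    """Inverse of rotate."""
    return bytes((v - 1) % 256 for v in data)


# All byte strings of length 0..2 over a small set of byte values.
samples = [bytes(t) for n in range(3)
           for t in product(range(0, 256, 51), repeat=n)]

toy_ok = is_bijective_pair(rotate, unrotate, samples)
zlib_ok = is_bijective_pair(zlib.compress, zlib.decompress, samples)
```

Here toy_ok is true and zlib_ok is false: zlib.decompress raises an error on arbitrary input, which is exactly the situation a bijective coder is designed to rule out.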
Another nice piece of source code from the Code Project. This is an ATL-based control for reading, writing, and manipulating Zip files.
Created: 25/04/2001
by Mark Nelson
Mathtools.net describes itself as the technical computing portal for all your scientific and computing needs. Perhaps a bit ambitious! Here's their page with C++ compression links.
Created: 09/03/2001
by Mark Nelson
This article from the archives of The CodeProject describes how to use the free library to perform in-memory compression and decompression.
Created: 27/11/2000
by Mark Nelson
Issue #11 of the DDJ Compression Newsletter contains some sample code that uses the Zip library from The CodeProject to create a little unzip program. A very simple program that uses a free library to get a lot done.
Created: 17/11/2000
by Mark Nelson
A lossless video codec for Win32. It's designed to be super-fast, allowing it to be used to capture video. Free software, full source available.
Created: 15/11/2000
by Mark Nelson
Michael Schindler's page describing a sorting algorithm that was presented in a poster session at DCC '97. Links to his source code, plus links to the paper and poster in postscript format. Update: Michael has made additional source code available.
Created: 28/09/2000
by Mark Nelson
This freeware program is designed to perform Internet telephony. It incorporates source for a couple of interesting speech codecs, which is why it gets a link in the library.
Created: 24/09/2000
by Mark Nelson
A collection of free TIFF software. This may help you decode files in TIFF format. Includes some documentation files.
Created: 20/08/2000
by Mark Nelson
Descriptions of various speech codecs, including G.711, G.721, GSM, and CELP. Each codec gets a brief description plus pointers to additional material and source code.
Created: 20/08/2000
by Mark Nelson
An Internet-based video conferencing system. Why is it interesting to us? It apparently includes an H.261 codec.
Created: 17/08/2000
by Mark Nelson
This program by Nam Phambdo doesn't have any declaration regarding use, so please contact the author before attempting to use it.
Reader Prabhu S. pronounced this program "Nice."
Created: 29/07/2000
by Mark Nelson
A couple of programs using neural networks for compression, along with a couple of papers by the author. This area of data compression is definitely underserved; check out what's here and see if neural networks deserve more attention than they are getting.
Update: This page appears to now have some links to general lossless benchmarking info.
Created: 17/07/2000
by Mark Nelson
Douglas W. Jones, University of Iowa Department of Computer Science. This page contains some C source for a Splay Tree algorithm.
Created: 15/07/2000
by Mark Nelson
This is a free codec that uses 3D-subband coding, progressive quantization and arithmetic coding. Based on work by David Taubman in pursuit of his PhD.
Created: 08/06/2000
by Mark Nelson
This page has a good set of pointers to LHA programs and source code, including variants such as AR002, lz_comp2, and Lharc.
Created: 03/06/2000
by Mark Nelson
Home of JCALG1, an LZSS-derived lossless compression algorithm with full x86 32-bit assembly source. Data Compression Library user comment: I found LZSS C source and an EXE. The EXE was useful for testing. I expect to use this in an embedded app after further research.
Created: 16/05/2000
by Mark Nelson
Source code from Random Access Data Compression, by Philip Gage (see gage.zip), from the September 1997 C User's Journal.
Created: 04/05/2000
by Mark Nelson
A short article with some code that illustrates how one would unzip archives using built-in java classes. Registration required to view the page.
Created: 12/04/2000
by Mark Nelson
Advertises itself as "A dirty but free implementation of a huffman encoder/decoder in Java." Not completely free, it is covered by the GLPL, and naturally includes fully documented source.
Created: 23/03/2000
by Mark Nelson
An article from the ACM Crossroads student publication. This article shows how to use some of the built-in classes in the JDK to create and access compressed files and streams.
Created: 27/02/2000
by Mark Nelson
Laurent Balmelli from the neutral country of Switzerland has some C++ code to do some interesting quadtree processing. He would like you to register if you are going to use the code, but this is not mandatory. This code is being used by over 100 registered users, and has been accessed by as many as 2500 others.
Created: 24/02/2000
by Mark Nelson
Mary Holly Johnson wrote a couple of papers on something called ECVQ, which I'm guessing is a type of vector quantization. This ftp directory has a bunch of C code that probably implements the code from one or more of her papers.
Created: 21/02/2000
by Mark Nelson
NuLib is a program for the Apple II which manipulates NuFX archives. The page is also the distribution site for NuLib2, an improved version of the program, and NufxLib, a programming library.
Created: 11/02/2000
by Mark Nelson
James E. Fowler at Mississippi State University has created this library, which is an open source collection of routines that are useful for people interested in data compression research. The distribution includes QccSPIHT.
Version 0.45 is available as of December, 2003.
Created: 23/01/2000
by Mark Nelson
Charles Bloom works up a quick coder that will compress random bytes down to about 6 bits per byte. Quite a feat, but it doesn't stand up to close scrutiny.
Created: 06/01/2000
by Mark Nelson
EPIC is a lossy image compression program. It uses subband decomposition followed by an entropy encoder to get its job done. This page has links to the code, papers, references, and more.
Created: 01/01/2000
by Mark Nelson
The documentation for this program is in an unknown language. It appears to be an archiver that includes a developers kit.
Created: 01/01/2000
by Mark Nelson
A short but sweet wrapper that lets you stream input and output using zlib's deflate engine.
Note that in order to get this code to work with gcc, you might have to add the following lines of code:
int zapeof( char c ) { return 1; }
int zapeof( int c ) { return 1; }
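As a rough illustration of what streaming through zlib's deflate engine looks like, here is a minimal sketch using Python's standard zlib module rather than the wrapper linked above. The function names deflate_stream and inflate_stream are hypothetical; they simply feed data through the engine chunk by chunk instead of compressing a whole buffer at once.

```python
import zlib


def deflate_stream(chunks, level=6):
    """Compress an iterable of byte chunks, yielding output as it appears."""
    comp = zlib.compressobj(level)
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:  # the engine may buffer input and emit nothing yet
            yield out
    yield comp.flush()  # emit whatever is still buffered, plus the trailer


def inflate_stream(chunks):
    """Decompress an iterable of compressed chunks, yielding plain bytes."""
    decomp = zlib.decompressobj()
    for chunk in chunks:
        out = decomp.decompress(chunk)
        if out:
            yield out
    tail = decomp.flush()
    if tail:
        yield tail


# Round trip: the chunk boundaries need not survive, only the bytes.
original = [b"hello " * 100, b"world " * 100]
packed = list(deflate_stream(original))
restored = b"".join(inflate_stream(packed))
```

Because both sides are generators, neither the input nor the output ever has to be held in memory in full.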
This codec appears to use techniques which are compatible with H.263 and MPEG-2, although it is not compatible with those standards. The Matching Pursuit algorithm is used in place of DCT after motion compensation.
Created: 22/12/1999
by Mark Nelson
A library to perform adaptive Huffman coding as described by Knuth in J. Alg. Nice clean looking C source code.
This link continues to be one of the most popular links at DataCompression.info. Reader Karl M. had this comment about the page: The program has a few problems converting from one-based to zero-based arrays. The code for incorporating the last symbol grabs an extra input bit, but since this is usually the EOT symbol, the bug doesn't always cause problems.
Created: 16/12/1999
by Mark Nelson
A really nice set of programs and source code for all sorts of data compression. This area doesn't appear to be actively maintained, so there are plenty of out-of-date files, but good stuff mixed in as well.
Created: 16/12/1999
by Mark Nelson
An Optimizing Hybrid LZ77 RLE Data Compression Program, aka Improving Compression Ratio for Low-Resource Decompression by Pasi Ojala.
Presents a new literal tagging system, a fast exhaustive string match algorithm, an optimal parsing algorithm, and results on Calgary Corpus and Canterbury Corpus.
Created: 13/12/1999
by Mark Nelson
A freeware archiver from Sweden. Versions available for DOS, Windows, and Linux, with Win9x long filename support. Includes source for extraction from ASD archives.
DCL user feedback: Finally I found what I was looking for, a freeware archiver for Windows, DOS, and Linux! Another user said: Offers what I was looking for, very helpful.
Created: 28/11/1999
by Mark Nelson
Includes editing software, file formats and converters, codecs, and links. This page is at the University of Hannover, Germany. While there appears to be a wealth of software on this site, it would appear that you need special privileges to get access to it.
Created: 26/11/1999
by Mark Nelson
DCG Framework is an object-oriented framework for lossless data compression. It is written in C++, and is intended as a didactic framework for teaching data compression. This framework is pointed to by the Seccao de Analise de Sinais page.
Created: 26/11/1999
by Mark Nelson
A page with a brief description of LZW compression by Dominik Szopa. This page includes a Java applet that helps show how LZW looks in action.
Created: 22/11/1999
by Mark Nelson
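For readers who cannot run the applet from the entry above, the core of LZW can be sketched in a few lines. This is a textbook version under simplifying assumptions (an unbounded dictionary and integer codes instead of packed variable-width bits); the function names are hypothetical and it is not taken from the linked page.

```python
def lzw_compress(data):
    """Encode bytes as a list of integer LZW codes."""
    table = {bytes([i]): i for i in range(256)}  # single-byte strings
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate  # keep extending the match
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # learn the new string
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original bytes from a list of LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Code refers to the string being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code: %d" % code)
        pieces.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(pieces)
```

The decoder never receives the dictionary; it rebuilds the same table one step behind the encoder, which is why the special case for a code equal to the current table size is needed.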
The Berkeley MPEG player for X11 (look for mpeg_play-2.4 or later; source and binary distributions are available)
Created: 14/11/1999
by Mark Nelson
The dissertation itself, in PS format, along with code used in the dissertation. The code implements k-ary arithmetic compression.
Created: 07/11/1999
by Mark Nelson
A set of links to Cormack's publications. Papers germane to this page include one on DMC and arithmetic compression. Pointers to many other data compression articles which are unfortunately not linked to this page.
Created: 05/11/1999
by Mark Nelson
Tiny HAP is based on Harald Feldman's HAP & PAH 3.0, and is distributed freely. The source code is not identical and should run 25% faster.
Created: 04/11/1999
by Mark Nelson
DATA FORMATS AND COMPRESSION ALGORITHMS USING WAVELET PACKETS Version 1.4, 24 May 1990. Mladen Victor Wickerhauser, Numerical Algorithms Research Group, Department of Mathematics, Yale University, New Haven, CT 06520. Includes some documentation.
Created: 02/11/1999
by Mark Nelson
Multiply-free Arithmetic Coding Implementation - Compression version 0.0.0
Gordon V. Cormack Feb. 1993
University of Waterloo cormack@uwaterloo.ca
This source code has some traditional freeware licensing terms embedded in the code.
Created: 02/11/1999
by Mark Nelson
An in-place Huffman code length calculation demonstration from Alistair Moffat. The abstract of that paper may be fetched from this location.
Created: 28/10/1999
by Mark Nelson
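To show what "code length calculation" means in the entry above, here is a small sketch that computes Huffman code lengths from symbol frequencies. Note that this is the ordinary heap-based construction, which uses extra memory, and not the in-place algorithm from Moffat's demonstration; the function name is hypothetical.

```python
import heapq


def huffman_code_lengths(freqs):
    """Return the Huffman code length for each symbol frequency."""
    n = len(freqs)
    if n == 0:
        return []
    if n == 1:
        return [1]  # a lone symbol still needs one bit
    lengths = [0] * n
    # Heap items: (weight, unique id for tie-breaking, symbols in subtree).
    heap = [(f, i, [i]) for i, f in enumerate(freqs)]
    heapq.heapify(heap)
    next_id = n
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        merged = syms1 + syms2
        for s in merged:
            lengths[s] += 1  # every symbol under the new node gets deeper
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return lengths
```

Only the lengths are needed to build a canonical Huffman code, which is why algorithms in this area concentrate on computing them cheaply.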
A set of Markov compressors by Charles Bloom, including source code. This includes links to a wide variety of his programs, including Context Coders, List LRU, and DefSum, along with a link to an early paper of his.
Created: 14/02/1999
by Mark Nelson
This PPM implementation has a complete batch of source code but no external documentation. I believe that all of the code is documented internally in Portuguese. Corrections to this theory are welcome.
Created: 04/01/1999
by Mark Nelson
The source code to the famous Witten, Neal, and Cleary 1987 CACM article on arithmetic coding. The paper is probably not legally on line anywhere, but can be found in the book Text Compression, as well as the journal. This FTP site has three different variations on the source.
Created: 08/12/1998
by Mark Nelson
Contains his implementation of the BWT algorithm, in the program bred. Along with this are some notes and papers on his implementation of BWT.
Current list of files:
bexp.c
bexp3.c
bred.c
bred.ps
bred.ps.Z
bred2
bred3.c
bred3.ps
exp.c
huff.ps
mintext.ps
mintext.tex
red.c
tea.ps
tub.ps
wake.ps
xtea.ps
xxtea.ps
Created: 30/11/1998
by Mark Nelson
Free, portable C code for JPEG compression is available from the Independent JPEG Group. Source code, documentation, and test files are included.
Created: 01/01/1970
by Mark Nelson
An early Japanese archiver, complete with C source. This source code by Haruhiko Okumura was influential in that many subsequent archiving programs were based on the concepts it explained. No documentation to speak of.
Created: 01/01/1970
by Mark Nelson
mplib is a C library that enables programs to access ID3 tags in MP3 files. ID3 tags carry metadata such as the title, artist or comments that come with most MP3s. mplib supports ID3 version 1 and version 2 tags. It is written to be very easy to use, fast and cross-platform capable.
Created: 01/01/1970
by Mark Nelson
This article on Microsoft's web site goes into many of the interesting features included in the WM9 codec, such as fold-down from multiple channels to stereo, high resolution multichannel content, and more. Highly technical.
Created: 01/01/1970
by Mark Nelson
Full source and exe of an MFC app that uses ZLIB.DLL.
This is a quick and dirty utility that allows you to compress BMP images with gzip to obtain better compression than GIF images, and still use them as background for your desktop.
To use this program you need ZLIB.DLL and the MFC30.DLL that comes with Windows 95.
Created: 01/01/1970
by Mark Nelson