*/
Love this site? Hate it? Leave us some comments.

Other Views

corner
*/

Adaptive(Dynamic) Huffman Coding with Huffman Tree Online

                     First Page



Adaptive Huffman Encoding, also known as Dynamic Huffman Encoding, solves the above problems by creating Huffman trees on the fly. As stated earlier, in adaptive Huffman coding we need not know the symbols' frequency table in advance. What is done in adaptive Huffman coding is that initially all the symbols are allocated the frequency count of 1, and then an initial Huffman tree is formed. After a symbol is encoded we increment its frequency count, it is like having a counter associated with each of the symbols, and we rebuild the tree. This process continues until all the symbols are encoded. The process for creating a Huffman tree and assigning codes to the symbols remains the same as the one used in static Huffman encoding. The only difference between static and adaptive methods is that in the latter we dynamically create the symbol frequency table and in the former it is known in advance. Let us clear this concept by taking an example: suppose we have 4 symbols a, b, c and d which occur as “abbbccdccc". Initially all the symbols will be having a frequency count of 1 and an initial tree will be build as show in fig (a); symbol 'a' will be encoded as 01 and its frequency count will be incremented to 2. The tree is updated as in fig (b) and this time we will encode 'b' as 101. This process continues until we are finished with encoding all the symbols.

http://www.programmersheaven.com/articles/images/huffimage004.gif


As frequency count of a symbol increases, bits needed to encode the symbol decreases i.e. in adaptive Huffman coding bits needed to encode a symbol increase or decrease dynamically and hence the name Dynamic Huffman Encoding. The final bit stream will be 0110100111100001101001.

abbbccdccc
0110100111100001101001


Decoding adaptive Huffman codes is also very easy. First of all, we build an initial Huffman tree, like the one we built while encoding the symbols. From that tree we decode the first symbol. After decoding the symbol we increment its count by 1 and we update the tree. From this newly updated tree, we decode the next symbol. This process continues till we decode all the symbols.

N.B.: All the decoding trees will be same as their encoding counterparts. The following algorithm can be used to decode the adaptive Huffman codes:

Algorithm for Decoding Adaptive Huffman codes
  1. Make initial Huffman tree having frequency count of all symbols 1.
  2. Start from the root of the tree.
  3. Examine the next element in the input
  4. If it is 1, move to the left child.
  5. If it is 0, move to the right child.
  6. If it is leaf node then
    1. Output its symbol.
    2. Increment its count by 1.
    3. Update the Huffman tree.
    4. Go to step 2
  7. If it is not a leaf node then go to step 3
Suppose for example; we want to decode “0110100111100001101001". We first create the initial tree as in fig (a). From that tree we decode 01 as 'a'; procedure for decoding the code basically remains same as the one used for static Huffman decoding. We then increment the frequency count of 'a' to 2 and rebuild/ update the tree (fig. b). From this updated tree we then decode 101 as 'b' and then increment count of 'b' to 2 and again update the tree as in the fig (c). We continue this process until the whole input stream is decoded.

0110100111100001101001
abbbccdccc
So the final output stream becomes “abbbccdccc".

We must now have a look at following characteristics of Adaptive Huffman encoding:
  • While encoding the symbols, after each symbol is encoded we need to update the tree. Same is also done while decoding the codes. That means there is some processing overburden involved.
  • While compressing files with adaptive Huffman encoding we only need to have one pass on the file as compared to two required by static Huffman coding. This can save time required for additional disk I/O operation.
  • At the time of encoding the symbols, as more and more symbols are encoded, the compression ratio begins to rise as shown in the following figure, but compression ratio can not rise beyond a certain limit and the curve flattens out.
http://www.programmersheaven.com/articles/images/huffimage005.gif


Huffman codes are indeed very efficient. We use Huffman codes in virtually every application which involves compression of digital data. Huffman compression can compress data up to around 90%. Huffman compression is implemented in computer networks, modems and fax machines. Huffman compression is also used in multimedia applications. Various multimedia standards like JPEG, MP3, and MPEG use Huffman compression.

Program in C to implement Static Huffman Encoding:

#include<stdio.h>
#include<string.h>
#include<conio.h>
#include<stdlib.h>
#define MAX 10
struct link
{
            int freq;
            char ch[MAX];
            struct link* right;
            struct link* left;
};
typedef struct link node;
void sort(node *[], int);
node* create(char[], int);
void sright(node *[], int);
void Assign_Code(node*, int [], int);
void Delete_Tree(node *);
main()
{
    node* ptr, * head;
    int i, n, total = 0, u, c[15];
    char str[MAX];
    node* a[12];
    int freq;
    clrscr();
    printf(  "Huffman Algorithm\n");
    printf("\nEnter the no. of letter to be coded:");
    /*input the no. of letters*/
    scanf("%d", &n);
    for (i = 0; i < n; i++)
    {
        printf("Enter the letter & frequency:");
        /*input the letter & frequency*/
        scanf("%s %d", str, &freq);
        a[i] = create(str, freq);
    }
    while (n > 1)
    {
        sort(a, n);
        u = a[0]->freq + a[1]->freq;
        strcpy(str,a[0]->ch);
        strcat(str,a[1]->ch);
        ptr = create(str, u);
        ptr->right = a[1];
        ptr->left = a[0];
        a[0] = ptr;
        sright(a, n);
        n--;
    }
    Assign_Code(a[0], c, 0);
    getch();
    Delete_Tree(a[0]);
}

node* create(char a[], int x)
{
    node* ptr;
    ptr = (node *) malloc(sizeof(node));
    ptr->freq = x;
    strcpy( ptr->ch , a);
    ptr->right = ptr->left = NULL;
    return(ptr);
}
void sort(node* a[], int n)
{
    int i, j;
    node* temp;
    for (i = 0; i < n - 1; i++)
        for (j = i; j < n; j++)
            if (a[i]->freq > a[j]->freq)
            {
                temp = a[i];
                a[i] = a[j];
                a[j] = temp;
            }
}
void sright(node* a[], int n)
{
    int i;
    for (i = 1; i < n - 1; i++)
        a[i] = a[i + 1];
}
void Assign_Code(node* tree, int c[], int n)
{
    int i;
    if ((tree->left == NULL) && (tree->right == NULL))
    {
        printf("%s code:", tree->ch);
        for (i = 0; i < n; i++)
        {
            printf("%d", c[i]);
        }
        printf("\n");
    }
    else
    {
        c[n] = 1;
        n++;
        Assign_Code(tree->left, c, n);
              c[n - 1] = 0;
        Assign_Code(tree->right, c, n);
    }
}
void Delete_Tree(node * root)
{
    if(root!=NULL)
    {
        Delete_Tree(root->left);
        Delete_Tree(root->right);
        free(root);
    }
}


Reader's Exercise:


  • Give real life practical examples of where static and adaptive Huffman encoding techniques are used.
  • The above given program implements static Huffman encoding; it can encode the symbols and can give the Huffman codes for the same. Is there something missing? Surely there is, the above program can not decode the Huffman codes. Write a function which will decode the Huffman codes.
  • The above given program can create Huffman tree but it can not display the tree. Write a function which will display the tree graphically; like the one drawn above. You will be confronting a typical problem while displaying the tree graphically. what problems did you face?
  • Try to implement adaptive Huffman coding; use any (computer) language of your choice.
  • JPEG and MP3 compressions use Huffman encoding; now if you walkthrough the article carefully; you know that Huffman encoding is a Lossless compression technique but both JPEG and MP3 are lossy compression standards, how?
  • Displaying Huffman codes is fine but how can you store these binary codes in a file, in their original form; i.e. for example how will you store binary string "100010101010000011100", in the binary form, in a file? Write a code fragment in C for that.
Prof. David A. Huffman did not patent his excellent work on Huffman codes.


That's all folks!

I hope you enjoyed the article, do not forget to write back to me with your precious comments, suggestions etc.
Pradeep P. Chandiramani.
pradeepchandiramani@yahoo.co.in
/*PCD*/


Disclaimer:
  • I have used my best efforts to provide correct information in the article. I make no warranty of any kind, expressed and implied with regard to documentation contained in the article.
  • The program(s) in the article are provided as a supplement to algorithms used in the article. I have run program(s) on my machine and they run properly but then also I do not guaranty and take any responsibility that the program(s) will run properly on your machine. I make no warranty of any kind, expressed and implied with regard to program(s) contained in the article.
  • I shall not be liable in any event for incidental or consequential damages in connection with, or arising out of, the use of material provided.
  • The material is provided in the hope that it will be useful, but without any warranty; without even the implied warranty of fitness for a particular purpose.
Program(s) are compiled in Turbo C++ version 3.0 for DOS. You may have to make some changes in code for the programs to run on your machine.

About the Author

Copyright (C) 2004 by Pradeep P Chandiramani. No part of this article should be reproduced in any form without prior permission form the author.


                     First Page



corner
© 1996-2008 CommunityHeaven LLC. All rights reserved. Reproduction in whole or in part, in any form or medium without express written permission is prohibited.
Violators of this policy may be subject to legal action. Please read our Terms Of Use and Privacy Statement for more information.
North American business development: Nicolai Wadstrom. Publisher: Lars Hagelin.