The Prefix Property
This is the property that ensures in variable bit length schemes that no code can be confused for another. As an example of the problem, take the following code sets for a few symbols:
A - 00 B - 01 C - 001 D - 0 E - 1
Set A
A - 0 B - 10 C - 110 D - 1110 E - 11110
Set B
Now, if you try to decode the following message, encoded with Set A, you'll find that there are several possible interpretations of the message - handy for your poetic decompressor, but not very good for extracting real data.
001000101000
But, the following message encoded using Set B can only be decoded one way - there are not multiple possible messages.
11001011101111001110
The trick is that each of the codes in set B has a unique prefix - a number of bits at the start of the code are unique from the number of bits in the start of every other code. If you built a tree using the 1s and 0s as branches left and right, you'd find that only the leaves of the tree have symbols in them - this is true of any set of symbol and code pairs that maintain the prefix property. Set A, on the other hand, has a few symbols that can be confused with each other because they `fit' in one another - A and D, B and C. Such code sets are worthless for compression purposes.
