1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
|
huffman
=======
Generate Huffman codebooks! `Huffman codes`_ are the optimal way to compress individual symbols into a binary sequence that can be unambiguously decoded without inter-symbol separators (it is "`prefix-free`_").
Provided an iterable of 2-tuples in ``(symbol, weight)`` format, generate a Huffman codebook, returned as a dictionary in ``{symbol: code, ...}`` format.
::
>>> huffman.codebook([('A', 2), ('B', 4), ('C', 1), ('D', 1)])
{'A': '10', 'B': '0', 'C': '110', 'D': '111'}
If you have an iterable of symbols, the ``collections.Counter`` is a handy way to tally them up.
::
>>> huffman.codebook(collections.Counter('man the stand banana man').items())
{' ': '111',
'a': '10',
'b': '0100',
'd': '0110',
'e': '11010',
'h': '0101',
'm': '1100',
'n': '00',
's': '11011',
't': '0111'}
.. _Huffman codes: https://en.wikipedia.org/wiki/Huffman_coding
.. _prefix-free: https://en.wikipedia.org/wiki/Prefix_code
|