Tools that apply Burrows-Wheeler algorithm. Burrows-Wheeler transform (BWT) is an algorithm maximizing repeated letters in a text, which is useful in data compression.
Burrows–Wheeler Transform - dCode
Tag(s) : Compression
dCode is free and its tools are a valuable help in games, maths, geocaching, puzzles and problems to solve every day!
A suggestion ? a feedback ? a bug ? an idea ? Write to dCode!
The Burrows-Wheeler Transform (BWT) is a technique for rearranging/reorganizing characters in a text message. Mainly used in data compression, BWT is a prior processing which tends to bring identical characters together, this proximity then allows increased compression (for example by RLE coding).
The first step consists in listing all possible rotations of the message (of the character string).
Example:
DECODE |
EDECOD |
DEDECO |
ODEDEC |
CODEDE |
ECODED |
The second step is sorting this list in alphabetical order.
Example:
1 | CODEDE |
---|---|
2 | DECODE |
3 | DEDECO |
4 | ECODED |
5 | EDECOD |
6 | ODEDEC |
The ciphered message is constituted of the last letters of each rotation. The associated key is the rank of the original message in the list.
Example: The encrypted message is EEODDC. The key is 2 (DECODE, the original text, is on the line 2 if the table).
dCode accepts only ASCII characters (without line return)
Decryption/Burrows Wheeler Inverse Transformation requires to know the key and the ciphered message (with N characters).
Example: The ciphertext EODC (4 characters) and the key 1
Step A: initialize an empty array with N rows and N columns.
Step B: write the encrypted message in the last empty column of the table
Step C: sort the rows of the table in alphabetical order
Repeat steps B and C as many times as there are letters in the message.
Example: State of the table after each step:
A | B₁ | C₁ | B₂ | C₂ | B₃ | C₃ | B₄ | C₄ | ||||||||||||||||||||||||||||||||||||
|
|
|
|
|
|
|
|
|
When finished, the plaintext is at the line number key of the table.
Example: At the row 1, after the last step of the algorithm, is the plain message: CODE
The encoded message tends to have identical sequences of letters that are repeated, which facilitates their compression (via algorithms like Run Length Encoding - RLE).
The ciphered message has a high number of repeated letters and a classic index of coincidence.
The message is sometimes overencrypted with a RLE encoding.
The key is not really important for intelligible text, because when decrypting, all lines of the table are in fact rotations of the original text.
dCode offers to calculate the most probable key automatically.
BWT can be used without a key, but in this case, a unique character of the original text and its position are needed, for instance in computer EOF character is used for last one.
Several implementations are possible but the best ones are in $ O(n) $ for the duration and $ O(n \log \sigma) $ (or even better) for the memory. With $ n $ the input size and $ \sigma $ the size of the alphabet.
Burrows-Wheeler Transform was invented in 1994 by Michael Burrows and David Wheeler
dCode retains ownership of the "Burrows–Wheeler Transform" source code. Except explicit open source licence (indicated Creative Commons / free), the "Burrows–Wheeler Transform" algorithm, the applet or snippet (converter, solver, encryption / decryption, encoding / decoding, ciphering / deciphering, breaker, translator), or the "Burrows–Wheeler Transform" functions (calculate, convert, solve, decrypt / encrypt, decipher / cipher, decode / encode, translate) written in any informatic language (Python, Java, PHP, C#, Javascript, Matlab, etc.) and all data download, script, or API access for "Burrows–Wheeler Transform" are not public, same for offline use on PC, mobile, tablet, iPhone or Android app!
Reminder : dCode is free to use.
The copy-paste of the page "Burrows–Wheeler Transform" or any of its results, is allowed (even for commercial purposes) as long as you credit dCode!
Exporting results as a .csv or .txt file is free by clicking on the export icon
Cite as source (bibliography):
Burrows–Wheeler Transform on dCode.fr [online website], retrieved on 2024-12-03,