Levenshtein Distance

Tool to calculate the Levenshtein distance between 2 words (character strings) and to search for similar words in a dictionary or in a list.

Tag(s) : Data Processing

Warning: the dCode tool limits measurements to letters (A-Z); diacritics and case (upper/lowercase) are ignored.

What is the Levenshtein distance? (Definition)

The Levenshtein distance is a measure that quantifies the difference between two words (more generally, between 2 character strings). Two words with close spellings (several letters in common, in the same positions) have a small distance, while two very different words have a large distance. This distance is also called edit distance and is equal to the minimum number of characters to be deleted, inserted, or replaced to transform one string into the other.

Example: TAKE and MAKE are similar graphically (a kind of near-homograph) and have a Levenshtein distance of 1 (replace T with M).

Example: CLOSE and CLOTHES are similar phonetically (near-homophones) but orthographically far apart: they have a Levenshtein distance of 3.

The Levenshtein distance is symmetric: the distance from STRING1 to STRING2 is equal to the distance from STRING2 to STRING1.

How does Levenshtein distance work?

Levenshtein's algorithm counts the minimum number of differences between the two character strings. The differences can be of 3 types: a substitution (replacement of one character by another), an insertion (addition of a new character) or a deletion (removal of a character).

$$\operatorname{Distance}(a,b) = \begin{cases} ||a|| & \text{if } ||b|| = 0, \\ ||b|| & \text{if } ||a|| = 0, \\ \operatorname{Distance}(a', b') & \text{if } a[0] = b[0], \\ 1 + \min \begin{cases} \operatorname{Distance}(a', b) \\ \operatorname{Distance}(a, b') \\ \operatorname{Distance}(a', b') \end{cases} & \text{otherwise} \end{cases}$$

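Here $||a||$ is the length of the string $a$, $a[0]$ is its first character, and $a'$ is the string $a$ without its first character. The three terms inside the $\min$ correspond to a deletion, an insertion and a substitution, respectively.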
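The recursive definition above can be transcribed almost literally. The following Python sketch is only an illustration: the function name `levenshtein_recursive` and the use of indices `i` and `j` (instead of building the shortened strings $a'$ and $b'$) are choices made here, and memoization is added so that repeated sub-problems are not recomputed.

```python
from functools import lru_cache


def levenshtein_recursive(a: str, b: str) -> int:
    """Levenshtein distance computed from the recursive definition."""

    @lru_cache(maxsize=None)
    def distance(i: int, j: int) -> int:
        # i and j mark where the remaining suffixes a[i:] and b[j:] start.
        if j == len(b):   # b is exhausted: delete what is left of a
            return len(a) - i
        if i == len(a):   # a is exhausted: insert what is left of b
            return len(b) - j
        if a[i] == b[j]:  # same first character: nothing to pay
            return distance(i + 1, j + 1)
        return 1 + min(
            distance(i + 1, j),      # deletion of a[i]
            distance(i, j + 1),      # insertion of b[j]
            distance(i + 1, j + 1),  # substitution of a[i] by b[j]
        )

    return distance(0, 0)


print(levenshtein_recursive("TAKE", "MAKE"))      # 1
print(levenshtein_recursive("CLOSE", "CLOTHES"))  # 3
```

This form is convenient for short words. For long strings the recursion depth grows with the string lengths, so an iterative table is usually preferred.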

What is the Damerau-Levenshtein distance? (Definition)

The Damerau-Levenshtein distance is similar to the Levenshtein distance but adds a fourth type of difference: the transposition of 2 adjacent characters (a common typographical error).
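To illustrate the effect of transpositions, here is a minimal Python sketch of the restricted variant usually called optimal string alignment. It is a swapped-in simplification, not the unrestricted Damerau-Levenshtein distance: it never edits the same substring twice, so it can return a larger value in some cases (for CA and ABC it gives 3, while the unrestricted distance is 2). The name `osa_distance` is hypothetical, and nothing here is claimed to be the method used by the dCode tool.

```python
def osa_distance(s: str, t: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(s) + 1, len(t) + 1
    # d[i][j] = distance between the first i characters of s and the first j of t
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Extra case: the last two characters are swapped.
            if i > 1 and j > 1 and s[i - 1] == t[j - 2] and s[i - 2] == t[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(s)][len(t)]


print(osa_distance("FORM", "FROM"))  # 1 (a plain Levenshtein distance would be 2)
```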

How to measure the distance between 2 words?

Count the minimum number of insertions, deletions and substitutions needed to turn one word into the other: this count is the Levenshtein distance between the two words.

Example: DCODE is at a distance of 2 from DECODER (1- add E and 2- add R)

Example: DECODER is at a distance of 2 from DCODE (1- remove E and 2- remove R)
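The edits listed in the two examples above can be recovered automatically by filling the distance table and then walking back from its last cell. The Python sketch below shows one way to do this; the function name `edit_operations` and the wording of the returned steps are illustrative choices. When several minimal edit sequences exist, it returns only one of them.

```python
def edit_operations(source: str, target: str) -> list[str]:
    """Return one minimal list of edits that turns source into target."""
    n, m = len(source), len(target)
    # dist[i][j] = Levenshtein distance between source[:i] and target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,
                dist[i][j - 1] + 1,
                dist[i - 1][j - 1] + cost,
            )

    # Walk back from the bottom-right cell, recording each paid step.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and source[i - 1] == target[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            i, j = i - 1, j - 1  # characters match: free step
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(f"replace {source[i - 1]} with {target[j - 1]}")
            i, j = i - 1, j - 1
        elif j > 0 and dist[i][j] == dist[i][j - 1] + 1:
            ops.append(f"add {target[j - 1]}")
            j -= 1
        else:
            ops.append(f"remove {source[i - 1]}")
            i -= 1
    ops.reverse()
    return ops


print(edit_operations("DCODE", "DECODER"))  # ['add E', 'add R']
print(edit_operations("DECODER", "DCODE"))  # ['remove E', 'remove R']
```

The number of returned steps always equals the Levenshtein distance, which also illustrates the symmetry: each "add" in one direction becomes a "remove" in the other.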

How to find a word close to another?

Use the dCode tool by entering a word and a dictionary with which to compare the word.

All words with a similar spelling will be returned.
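Outside the tool, the same kind of search can be sketched in a few lines of Python. Everything named here is an assumption made for illustration: the function names `levenshtein` and `close_words`, the threshold parameter `max_distance` with its default of 2, the sample word list, and the case-insensitive comparison. It is not a description of how dCode implements its search.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance, keeping only two rows of the table in memory."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def close_words(word: str, dictionary: list[str], max_distance: int = 2) -> list[tuple[int, str]]:
    """Return (distance, entry) pairs within max_distance, closest first."""
    # Case is ignored here by upper-casing both sides (an assumption).
    scored = [(levenshtein(word.upper(), entry.upper()), entry) for entry in dictionary]
    return sorted(pair for pair in scored if pair[0] <= max_distance)


words = ["DECODE", "DECODER", "CODE", "ENCODE", "CIPHER"]
print(close_words("DCODE", words))
# [(1, 'CODE'), (1, 'DECODE'), (2, 'DECODER'), (2, 'ENCODE')]
```

Comparing the word against every entry is simple but slow for large dictionaries; it is used here only to keep the sketch short.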

What is Levenshtein distance used for?

Calculating the distance between 2 character strings gives the number of differences between them, and therefore a measure of their similarity. The Levenshtein algorithm can thus be used to detect typing errors or misspellings, which are cases where the compared strings are very close to each other.

What is the algorithm for programming a Levenshtein distance?

The best known algorithm for calculating the Levenshtein distance was created by Wagner and Fischer in 1974.

// Pseudo-code
function levenshteinDistance(str1, str2) {
    size1 = length(str1)
    size2 = length(str2)
    matrix = [ size1 + 1 ] x [ size2 + 1 ]
    for i from 0 to size1 { matrix[i][0] = i }
    for j from 0 to size2 { matrix[0][j] = j }
    for i from 1 to size1 {
        for j from 1 to size2 {
            if (str1[i - 1] == str2[j - 1]) cost = 0
            else cost = 1
            matrix[i][j] = minimum(
                matrix[i - 1][j] + 1,
                matrix[i][j - 1] + 1,
                matrix[i - 1][j - 1] + cost
            )
        }
    }
    return matrix[size1][size2]
}
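For readers who want something runnable, the pseudo-code can be transcribed into Python as follows. This is a direct, unoptimized transcription that keeps the same variable names; the snake_case function name `levenshtein_distance` is the only adaptation, and it is not the source code of the dCode tool.

```python
def levenshtein_distance(str1: str, str2: str) -> int:
    """Wagner-Fischer dynamic programming over a full (size1+1) x (size2+1) table."""
    size1, size2 = len(str1), len(str2)
    # matrix[i][j] = distance between the first i characters of str1
    # and the first j characters of str2
    matrix = [[0] * (size2 + 1) for _ in range(size1 + 1)]
    for i in range(size1 + 1):
        matrix[i][0] = i  # i deletions to reach the empty string
    for j in range(size2 + 1):
        matrix[0][j] = j  # j insertions starting from the empty string
    for i in range(1, size1 + 1):
        for j in range(1, size2 + 1):
            cost = 0 if str1[i - 1] == str2[j - 1] else 1
            matrix[i][j] = min(
                matrix[i - 1][j] + 1,         # deletion
                matrix[i][j - 1] + 1,         # insertion
                matrix[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return matrix[size1][size2]


print(levenshtein_distance("DCODE", "DECODER"))  # 2
print(levenshtein_distance("CLOSE", "CLOTHES"))  # 3
```

The table costs time and memory proportional to the product of the two lengths. Since each row depends only on the previous one, memory can be reduced to two rows when only the final distance is needed.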

Cite dCode

The copy-paste of the page "Levenshtein Distance" or any of its results, is allowed (even for commercial purposes) as long as you credit dCode!
Exporting results as a .csv or .txt file is free by clicking on the export icon
Cite as source (bibliography):
Levenshtein Distance on dCode.fr [online website], retrieved on 2024-06-24, https://www.dcode.fr/levenshtein-distance
