Tool to calculate the Levenshtein distance between two words (character strings) and to search for similar words in a dictionary or in a list.
Levenshtein Distance - dCode
The Levenshtein distance is a measure of the difference between two words (more generally, between 2 character strings). Two similar words (that is, words whose spellings differ little: they share several letters in the same positions) have a small distance, while two very different words have a large distance. This distance is also called the edit distance and equals the minimum number of characters that must be deleted, inserted, or replaced to transform one string into the other.
Example: TAKE and MAKE look alike when written (a kind of near-homograph) and have a Levenshtein distance of 1 (replace T with M)
Example: CLOSE and CLOTHES sound alike (near-homophones) but are spelled quite differently: they have a Levenshtein distance of 3 (replace S with T, insert H, insert S)
The Levenshtein distance is symmetric: the distance from STRING1 to STRING2 equals the distance from STRING2 to STRING1
Levenshtein's algorithm counts the differences between the two character strings. A difference can be of 3 types: a substitution (replacement of one character by another), an insertion (addition of a new character) or a deletion (removal of a character).
$$ \operatorname{Distance}(a,b) = \begin{cases} ||a|| & \text{ if } ||b|| = 0, \\ ||b|| & \text{ if } ||a|| = 0, \\ \operatorname{Distance}(a', b') & \text{ if } a[0] = b[0] \\ 1 + \min \begin{cases} \operatorname{Distance}(a', b) \\ \operatorname{Distance}(a, b') \\ \operatorname{Distance}(a', b') \\ \end{cases} & \text{ otherwise } \end{cases} $$
With $ ||a|| $ the length of the string $ a $, and $ a' $ the string $ a $ without its first character (denoted $ a[0] $)
The Damerau-Levenshtein distance is similar to the Levenshtein distance but adds a fourth type of difference: the transposition of 2 adjacent characters (a common typing error).
The Levenshtein distance measures how different two character strings are: the smaller the distance, the more similar the strings.
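The recursive definition above can be transcribed almost literally. The following is a minimal Python sketch, not dCode's own implementation: the name `levenshtein_recursive` and the use of index pairs instead of substring copies are assumptions made for illustration, and memoization is added so that each pair of suffixes is evaluated only once.

```python
from functools import lru_cache


def levenshtein_recursive(a: str, b: str) -> int:
    """Levenshtein distance computed from the recursive definition."""

    @lru_cache(maxsize=None)
    def distance(i: int, j: int) -> int:
        # a[i:] and b[j:] are the strings still to be compared;
        # moving i or j forward by one corresponds to a' or b'.
        if j == len(b):
            return len(a) - i          # ||a|| when b is empty
        if i == len(a):
            return len(b) - j          # ||b|| when a is empty
        if a[i] == b[j]:
            return distance(i + 1, j + 1)
        return 1 + min(
            distance(i + 1, j),        # delete a[i]
            distance(i, j + 1),        # insert b[j]
            distance(i + 1, j + 1),    # substitute a[i] with b[j]
        )

    return distance(0, 0)


print(levenshtein_recursive("TAKE", "MAKE"))      # 1
print(levenshtein_recursive("CLOSE", "CLOTHES"))  # 3
```

This form is convenient for checking the definition on short words. For long strings, Python's recursion limit makes an iterative table-based version preferable.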
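To illustrate the extra transposition step, here is a Python sketch of the restricted variant often called optimal string alignment, in which a given pair of characters is transposed at most once and not edited again. It is an assumption that this variant is the one of interest; the unrestricted Damerau-Levenshtein distance needs a more elaborate algorithm and can give a smaller value on some inputs. The function name `osa_distance` is hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                      # delete all i characters of a
    for j in range(cols):
        d[0][j] = j                      # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Transposition of two adjacent characters.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("CODE", "CDOE"))  # 1 (plain Levenshtein distance is 2)
```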
Example: DCODE is at a distance of 2 from DECODER (1- add E and 2- add R)
Example: DECODER is at a distance of 2 from DCODE (1- remove E and 2- remove R)
Use the dCode tool by entering a word and a dictionary with which to compare the word.
All words with a similar spelling will be returned.
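A search of this kind can be sketched in Python as follows. How the dCode tool actually ranks or filters results is not stated here, so the function `find_similar_words`, its `max_distance` threshold and the sample word list are all hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance, keeping only two rows of the table in memory."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def find_similar_words(word, dictionary, max_distance=2):
    """Return (distance, candidate) pairs within max_distance, closest first."""
    scored = ((levenshtein(word, candidate), candidate) for candidate in dictionary)
    return sorted(pair for pair in scored if pair[0] <= max_distance)


words = ["MAKE", "TAKE", "CAKE", "TAKEN", "CLOTHES", "DECODER"]
print(find_similar_words("TAKE", words, max_distance=1))
# [(0, 'TAKE'), (1, 'CAKE'), (1, 'MAKE'), (1, 'TAKEN')]
```

Comparing the word against every dictionary entry is simple but linear in the dictionary size, which is usually acceptable for word lists of moderate length.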
Calculating the distance between 2 character strings gives the number of differences between them and thus also indicates how similar they are. The Levenshtein algorithm can therefore be used to detect typing errors or misspellings, that is, words that are very close to a known word.
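One way to turn the count of differences into a similarity score is to divide it by the length of the longer string. This normalization is a common convention rather than something defined on this page, and the function name `similarity` is hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance, keeping only two rows of the table in memory."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """1.0 for identical strings, 0.0 when every position must be edited."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein(a, b) / longest


print(similarity("TAKE", "MAKE"))  # 0.75
```

A spell checker could, for instance, flag a typed word whose similarity to a dictionary word is high but below 1.0.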
The best known algorithm for calculating the Levenshtein distance was created by Wagner and Fischer in 1974.
// Pseudo-code
function levenshteinDistance(str1, str2) {
  size1 = length(str1)
  size2 = length(str2)
  matrix = [ size1 + 1 ] x [ size2 + 1 ]
  for i from 0 to size1 { matrix[i][0] = i }
  for j from 0 to size2 { matrix[0][j] = j }
  for i from 1 to size1 {
    for j from 1 to size2 {
      if (str1[i - 1] == str2[j - 1]) cost = 0
      else cost = 1
      matrix[i][j] = minimum( matrix[i - 1][j] + 1, matrix[i][j - 1] + 1, matrix[i - 1][j - 1] + cost )
    }
  }
  return matrix[size1][size2]
}
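The pseudo-code above could be translated to Python roughly as follows. This is a direct transcription offered as a sketch, not dCode's source code; the snake_case name `levenshtein_distance` is an adaptation of the pseudo-code's function name.

```python
def levenshtein_distance(str1: str, str2: str) -> int:
    """Levenshtein distance using the full (size1 + 1) x (size2 + 1) matrix."""
    size1, size2 = len(str1), len(str2)
    matrix = [[0] * (size2 + 1) for _ in range(size1 + 1)]

    # Distance from a prefix of str1 to the empty string: delete everything.
    for i in range(size1 + 1):
        matrix[i][0] = i
    # Distance from the empty string to a prefix of str2: insert everything.
    for j in range(size2 + 1):
        matrix[0][j] = j

    for i in range(1, size1 + 1):
        for j in range(1, size2 + 1):
            cost = 0 if str1[i - 1] == str2[j - 1] else 1
            matrix[i][j] = min(
                matrix[i - 1][j] + 1,         # deletion
                matrix[i][j - 1] + 1,         # insertion
                matrix[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return matrix[size1][size2]


print(levenshtein_distance("DCODE", "DECODER"))  # 2
print(levenshtein_distance("DECODER", "DCODE"))  # 2
```

The running time and memory use are both proportional to size1 × size2. Since each row depends only on the previous one, memory can be reduced to two rows when only the final distance is needed.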
dCode retains ownership of the "Levenshtein Distance" source code. Except explicit open source licence (indicated Creative Commons / free), the "Levenshtein Distance" algorithm, the applet or snippet (converter, solver, encryption / decryption, encoding / decoding, ciphering / deciphering, breaker, translator), or the "Levenshtein Distance" functions (calculate, convert, solve, decrypt / encrypt, decipher / cipher, decode / encode, translate) written in any informatic language (Python, Java, PHP, C#, Javascript, Matlab, etc.) and all data download, script, or API access for "Levenshtein Distance" are not public, same for offline use on PC, mobile, tablet, iPhone or Android app!
The copy-paste of the page "Levenshtein Distance" or any of its results, is allowed (even for commercial purposes) as long as you credit dCode!
Cite as source (bibliography):
Levenshtein Distance on dCode.fr [online website], retrieved on 2024-12-04,