Tool to calculate the Levenshtein distance between two words (character strings) and to search for similar words in a dictionary or a list.

Levenshtein Distance - dCode

Tag(s) : Data Processing

The Levenshtein distance is a measure of the difference between two words (more generally, between 2 character strings). Two similar words (words whose spellings differ only slightly, with several letters in common at the same positions) have a small distance, while two very different words have a large distance. This distance is also called the edit distance: it is equal to the minimum number of characters that must be deleted, inserted, or replaced to transform one string into the other.

__Example:__ `TAKE` and `MAKE` are graphically similar (near-homographs) and have a Levenshtein distance of 1.

__Example:__ `CLOSE` and `CLOTHES` sound alike (near-homophones) but are spelled quite differently: they have a Levenshtein distance of 3.

The Levenshtein distance is symmetric: the distance from `STRING1` to `STRING2` is equal to the distance from `STRING2` to `STRING1`.

Levenshtein's algorithm counts the minimum number of differences between the two character strings. A difference can be one of 3 types: a substitution (replacement of one character by another), an insertion (addition of a new character) or a deletion (removal of a character).

$$ \operatorname{Distance}(a,b) = \begin{cases} ||a|| & \text{ if } ||b|| = 0, \\ ||b|| & \text{ if } ||a|| = 0, \\ \operatorname{Distance}(a', b') & \text{ if } a[0] = b[0] \\ 1 + \min \begin{cases} \operatorname{Distance}(a', b) \\ \operatorname{Distance}(a, b') \\ \operatorname{Distance}(a', b') \\ \end{cases} & \text{ otherwise } \end{cases} $$

where $ ||a|| $ is the length of the string $ a $, and $ a' $ is the string $ a $ without its first character (denoted $ a[0] $).
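The recursive definition above can be transcribed almost literally into code. The following Python sketch is an illustration, not dCode's implementation: the function name `levenshtein_recursive` and the use of index offsets in place of the substrings $ a' $ and $ b' $ are choices made for this example, and memoization is added (an assumption beyond the formula) so that the same pair of suffixes is not recomputed.

```python
from functools import lru_cache


def levenshtein_recursive(a: str, b: str) -> int:
    """Memoized transcription of the recursive definition (illustrative)."""

    @lru_cache(maxsize=None)
    def distance(i: int, j: int) -> int:
        # i and j are the offsets of the suffixes a[i:] and b[j:].
        if j == len(b):   # b is exhausted: delete what remains of a
            return len(a) - i
        if i == len(a):   # a is exhausted: insert what remains of b
            return len(b) - j
        if a[i] == b[j]:  # same first character: skip it at no cost
            return distance(i + 1, j + 1)
        return 1 + min(
            distance(i + 1, j),      # delete a[i]
            distance(i, j + 1),      # insert b[j]
            distance(i + 1, j + 1),  # replace a[i] with b[j]
        )

    return distance(0, 0)


print(levenshtein_recursive("TAKE", "MAKE"))      # 1
print(levenshtein_recursive("CLOSE", "CLOTHES"))  # 3
```

This form is mainly useful for checking the definition on short strings; for long inputs, Python's recursion limit makes an iterative table-based version preferable.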

The Damerau-Levenshtein distance is similar to the Levenshtein distance but adds a fourth type of difference: the *transposition of 2 adjacent characters* (a common typing error).
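To make the effect of the transposition concrete, here is a hedged Python sketch of the restricted variant, often called optimal string alignment, in which a given pair of adjacent characters may be swapped once but not edited again afterwards. This is a simplification of the full Damerau-Levenshtein distance, and the function name `osa_distance` is an assumption made for this example.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete the first i characters of a
    for j in range(cols):
        d[0][j] = j  # insert the first j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Extra case: the last two characters are swapped.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("FORM", "FROM"))  # 1
```

With the plain Levenshtein distance, `FORM` and `FROM` would be at distance 2 (two substitutions); counting the swap of `O` and `R` as a single difference brings it down to 1.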

The Levenshtein distance measures how different two character strings are: the smaller the distance, the more similar the strings.

__Example:__ `DCODE` is at a distance of `2` from `DECODER` (1- add `E` and 2- add `R`)

__Example:__ `DECODER` is at a distance of `2` from `DCODE` (1- remove `E` and 2- remove `R`)
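The two examples above list the individual edits, not just their count. As an illustrative sketch (the function name `edit_script` and the wording of the operations are assumptions made for this example), the edits can be recovered by filling the usual distance table and then walking back from its last cell. When several minimal edit sequences exist, this version returns only one of them.

```python
def edit_script(source: str, target: str) -> list[str]:
    """Return one minimal list of edits turning source into target."""
    n, m = len(source), len(target)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)

    # Walk back from the bottom-right cell, recording one edit per unit of cost.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and source[i - 1] == target[j - 1] and d[i][j] == d[i - 1][j - 1]:
            i, j = i - 1, j - 1  # characters match: no edit
        elif i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + 1:
            ops.append(f"replace {source[i - 1]} with {target[j - 1]}")
            i, j = i - 1, j - 1
        elif j > 0 and d[i][j] == d[i][j - 1] + 1:
            ops.append(f"insert {target[j - 1]}")
            j -= 1
        else:
            ops.append(f"delete {source[i - 1]}")
            i -= 1
    ops.reverse()
    return ops


print(edit_script("DCODE", "DECODER"))  # ['insert E', 'insert R']
print(edit_script("DECODER", "DCODE"))  # ['delete E', 'delete R']
```

The length of the returned list is the Levenshtein distance itself, which is why both directions give 2 here.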

Use the dCode tool by entering a word and a dictionary (or a list of words) to compare it with.

All words with a similar spelling are returned.
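Searching a word list for close spellings can be sketched in a few lines of Python. This is not how the dCode tool is implemented: the function names `levenshtein` and `close_words`, the `max_distance` threshold and the sample word list are all assumptions made for this illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance, keeping only two rows of the table in memory."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def close_words(word: str, candidates: list[str], max_distance: int = 1) -> list[tuple[int, str]]:
    """Return (distance, candidate) pairs within max_distance, closest first."""
    scored = ((levenshtein(word, candidate), candidate) for candidate in candidates)
    return sorted(pair for pair in scored if pair[0] <= max_distance)


# Hypothetical word list, used only for this example.
words = ["TAKE", "MAKER", "CAKE", "LAKE", "MOUSE", "MAKE"]
print(close_words("MAKE", words))
# [(0, 'MAKE'), (1, 'CAKE'), (1, 'LAKE'), (1, 'MAKER'), (1, 'TAKE')]
```

Comparing the word against every entry is adequate for short lists; a large dictionary would usually call for an index or some pre-filtering, which is outside the scope of this sketch.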

Calculating the distance between 2 character strings gives the number of differences between them, and therefore also an indication of how similar they are. The Levenshtein algorithm can thus be used to detect typing errors or misspellings, that is, words that are very close to a known word.
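One way to turn the number of differences into a similarity score is to divide it by the length of the longer string. The normalization below is a common convention rather than something defined on this page, and the names `levenshtein` and `similarity` are assumptions made for this Python sketch.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Score between 0.0 (nothing in common) and 1.0 (identical strings)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1 - levenshtein(a, b) / longest


print(similarity("TAKE", "MAKE"))                # 0.75
print(round(similarity("DCODE", "DECODER"), 3))  # 0.714
```

A spell checker could, for instance, flag a word as a probable typo when its best score against the dictionary is high but below 1.0; the threshold to use is application-dependent.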

The best-known algorithm for calculating the Levenshtein distance was published by Wagner and Fischer in 1974.

    // Pseudo-code
    function levenshteinDistance(str1, str2) {
        size1 = length(str1)
        size2 = length(str2)
        matrix = [ size1 + 1 ] x [ size2 + 1 ]
        // Distance to or from the empty string
        for i from 0 to size1 { matrix[i][0] = i }
        for j from 0 to size2 { matrix[0][j] = j }
        for i from 1 to size1 {
            for j from 1 to size2 {
                if (str1[i - 1] == str2[j - 1]) cost = 0
                else cost = 1
                matrix[i][j] = minimum(
                    matrix[i - 1][j] + 1,        // deletion
                    matrix[i][j - 1] + 1,        // insertion
                    matrix[i - 1][j - 1] + cost  // substitution
                )
            }
        }
        return matrix[size1][size2]
    }
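A possible Python translation of the pseudo-code above is sketched below. It is an illustrative version rather than dCode's own source code; the function name `levenshtein_distance` is chosen for this example.

```python
def levenshtein_distance(str1: str, str2: str) -> int:
    """Table-based (Wagner-Fischer style) Levenshtein distance."""
    size1, size2 = len(str1), len(str2)
    # matrix[i][j] holds the distance between str1[:i] and str2[:j].
    matrix = [[0] * (size2 + 1) for _ in range(size1 + 1)]
    for i in range(size1 + 1):
        matrix[i][0] = i  # i deletions to reach the empty string
    for j in range(size2 + 1):
        matrix[0][j] = j  # j insertions starting from the empty string
    for i in range(1, size1 + 1):
        for j in range(1, size2 + 1):
            cost = 0 if str1[i - 1] == str2[j - 1] else 1
            matrix[i][j] = min(
                matrix[i - 1][j] + 1,         # deletion
                matrix[i][j - 1] + 1,         # insertion
                matrix[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return matrix[size1][size2]


print(levenshtein_distance("DCODE", "DECODER"))  # 2
print(levenshtein_distance("CLOSE", "CLOTHES"))  # 3
```

The table needs (size1 + 1) x (size2 + 1) cells. Since each row depends only on the previous one, keeping two rows is enough when only the final distance is needed.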

dCode retains ownership of the "Levenshtein Distance" source code. Unless an explicit open-source licence is indicated (Creative Commons / free), the "Levenshtein Distance" algorithm, the applet or snippet (converter, solver, encryption / decryption, encoding / decoding, ciphering / deciphering, breaker, translator), and the "Levenshtein Distance" functions (calculate, convert, solve, decrypt / encrypt, decipher / cipher, decode / encode, translate) written in any programming language (Python, Java, PHP, C#, Javascript, Matlab, etc.), as well as any data download, script, or API access for "Levenshtein Distance", are not public. The same applies to offline use on PC, mobile, tablet, iPhone or Android app.

Reminder: dCode is free to use.

Copying and pasting the page "Levenshtein Distance" or any of its results is allowed (even for commercial purposes) as long as you credit dCode!

Cite as source (bibliography):

*Levenshtein Distance* on dCode.fr [online website], retrieved on 2024-06-24,

- Distance Measurement
- What is the Levenshtein distance? (Definition)
- How does Levenshtein distance work?
- What is the Damerau-Levenshtein distance? (Definition)
- How to measure the distance between 2 words?
- How to find a word close to another?
- What is Levenshtein distance used for?
- What is the algorithm for programming a Levenshtein distance?

https://www.dcode.fr/levenshtein-distance

© 2024 dCode - The ultimate 'toolkit' for solving all games/riddles/geocaching/CTF.
