Blockland Forums > Modification Help
[Resource] Damerau Levenshtein String Distance
Swollow:
--- Code: ---function damerauLevenshteinDistance(%a,%b)
{
%h = strLen(%a)+1;
%w = strLen(%b)+1;
for(%i=0;%i<%h;%i++)
%a[%i] = getSubStr(%a,%m[%i,0] = %i,1);
for(%i=0;%i<%w;%i++)
%b[%i] = getSubStr(%b,%m[0,%i] = %i,1);
for(%y=1;%y<%h;%y++)
{
for(%x=1;%x<%w;%x++)
{
%c = !(%a[%y-1] $= %b[%x-1]);
%i = %m[%y,%x-1]+1;
%d = %m[%y-1,%x]+1;
%s = %m[%y-1,%x-1]+%c;
if(%s < %d && %s < %i)
%l = %s;
else if(%d < %i)
%l = %d;
else
%l = %i;
if(%x > 1 && %y > 1 && %a[%y-1] $= %b[%x-2] && %a[%y-2] $= %b[%x-1] && (%t = %m[%y-2,%x-2]+%c) < %l)
%l = %t;
%m[%y,%x] = %l;
}
}
return %m[%h-1,%w-1];
}
--- End code ---
find the distance between two strings as a numerical value
includes insertion, deletion, substitution, and transpositions
respectively each method is shown below, each of these strings are considered a distance of '1' from eachother
damerauLevenshteinDistance("abc","abbc");
damerauLevenshteinDistance("abc","ac");
damerauLevenshteinDistance("abc","axc");
damerauLevenshteinDistance("abc","aabc");
this function clearly becomes slower the longer the strings so make sure to include some sort of limiter and cache results
do not allow a user to enter multiple 255 length strings concatenated together in a serverCmd and then parse them through this function, your server will explode
if you expect this function to be called many times for similar strings consider adding a simple global variable cache array
PhantOS:
me: aw yea I can evade the censor list! fkc u dumpazz
damerau-levenshtein distance algorithm: allow us to introduce ourselves
Swollow:
this would actually be a very poor algorithm for censor
demonstrated by the words below which are all a distance of '1' from a word that would be found on a censor list
--- Code: ---fog
fig
hag
gag
tag
rag
sag
jag
mag
bag
lag
nag
shot
shut
tuck
muck
suck
ruck
buck
cuck
duck
huck
luck
puck
yuck
bigger
digger
nagger
rigger
tigger
--- End code ---
this algorithm is more useful when trying to match a string to a specific list ie a user list or a item list brick list etc
further more this algorithm is very computational and is not something you would want to run on every multi word message sent by a user against a censor list
PhantOS:
is there no phonetic comparison algorithm out there
Aide33:
--- Quote from: PhantOS on June 26, 2019, 10:07:00 PM ---is there no phonetic comparison algorithm out there
--- End quote ---
https://en.wikipedia.org/wiki/Phonetic_algorithm
Navigation
[0] Message Index
[#] Next page
Go to full version