Saturday, February 18, 2012

PostgreSQL - fuzzystrmatch: distance between strings

I think this feature could be useful for data cleansing or, more generally, for any task that involves comparing strings.

F.9.2. Levenshtein
This function calculates the Levenshtein distance between two strings:
   levenshtein(text source, text target) returns int
Both source and target can be any non-null string, with a maximum of 255 characters.
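
To get a feel for what that number means, here is a minimal Python sketch of the classic dynamic-programming edit distance, assuming a cost of 1 for each insertion, deletion, and substitution. It is only an illustration of the idea, not the code PostgreSQL runs internally, and the Python function name is simply borrowed from the SQL one.

```python
def levenshtein(source: str, target: str) -> int:
    """Edit distance with unit cost for insert, delete, and substitute."""
    # Row 0: turning "" into the first j characters of target takes j insertions.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Turning the first i characters of source into "" takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


print(levenshtein("john smith", "john schmit"))  # prints 3
```

Only one row of the table is kept at a time, so memory grows with the length of the target string rather than with the product of both lengths.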

test=# Create extension fuzzystrmatch;

test=# select levenshtein('john smith','john schmit');
 levenshtein 
-------------
           3
(1 row)

Calculating the degree of similarity between two words sounds easy... but I'll try that after a long nap. One thing that would be awesome is to somehow implement an efficient auto-complete feature using PostgreSQL...
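
One possible way to turn the distance into a degree of similarity is to divide it by the length of the longer string and subtract the result from 1. The Python sketch below assumes that normalization, which is just one common choice and not something fuzzystrmatch provides. The names edit_distance and levenshtein_similarity are hypothetical helpers made up for this example.

```python
def edit_distance(source: str, target: str) -> int:
    """Unit-cost Levenshtein distance, kept here so the example is self-contained."""
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            current.append(min(
                previous[j - 1] + (s_char != t_char),  # substitute (or match)
                previous[j] + 1,                       # delete from source
                current[j - 1] + 1,                    # insert into source
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(source: str, target: str) -> float:
    """Hypothetical score in [0, 1]: 1.0 for identical strings, 0.0 for nothing in common."""
    longest = max(len(source), len(target))
    if longest == 0:
        return 1.0  # assumption: two empty strings count as identical
    return 1.0 - edit_distance(source, target) / longest


print(round(levenshtein_similarity("john smith", "john schmit"), 3))
```

The distance can never exceed the length of the longer string, so the score always stays between 0 and 1.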
