Data Science

Fuzzy String Matching:

Finding strings that approximately match a pattern in your data using Python.

Fuzzy matching answers the question “How similar are strings A and B?” rather than the Boolean question “Are strings A and B the same?”

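Fuzzywuzzy is a Python library that uses Levenshtein distance to calculate the differences between sequences and patterns. The Levenshtein distance between two strings is the minimum number of single-character edits (insertions, deletions, or substitutions) needed to turn one string into the other.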
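
To make the idea of Levenshtein distance concrete, here is a minimal pure-Python sketch. It is not Fuzzywuzzy's implementation, and the library's own scores may be normalized differently. The names `levenshtein` and `similarity` are hypothetical helpers chosen for this illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn `a` into `b`."""
    # Keep the shorter string as `b` so each row stays as small as possible.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the first i-1 characters of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale the distance to a score between 0.0 (nothing in common)
    and 1.0 (identical). This scaling is an assumption of this sketch."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(similarity("kitten", "sitting"))
```

A score like this is what lets a comparison return "mostly the same" instead of a plain True or False.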

References:

  1. Fuzzy String Matching

  2. Fuzzy String Matching in Python

Record Linking:

Record linking (also called record linkage) and fuzzy matching both describe the process of joining two data sets that do not share a common unique identifier.
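
As a rough sketch of the idea, the example below links two small, made-up data sets on a name field using only Python's standard `difflib` module. The function names `normalize` and `link_records`, the sample records, and the 0.8 cutoff are all assumptions chosen for illustration; dedicated record-linkage tools offer far more than this.

```python
import difflib


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace before comparing."""
    return " ".join(text.lower().split())


def link_records(left, right, key="name", cutoff=0.8):
    """Pair each record in `left` with its closest match in `right`.

    Records are dicts. A record with no candidate scoring at least
    `cutoff` is paired with None.
    """
    # Index the right-hand records by their normalized key.
    right_by_key = {normalize(rec[key]): rec for rec in right}
    candidates = list(right_by_key)
    links = []
    for rec in left:
        # get_close_matches returns the best candidates, best first.
        matches = difflib.get_close_matches(
            normalize(rec[key]), candidates, n=1, cutoff=cutoff
        )
        match = right_by_key[matches[0]] if matches else None
        links.append((rec, match))
    return links


# Made-up sample data: the two sets have no shared ID column.
customers = [
    {"name": "Acme Corp.", "city": "Boston"},
    {"name": "Globex Inc", "city": "Denver"},
]
invoices = [
    {"name": "ACME  Corp", "amount": 1200},
    {"name": "Initech", "amount": 300},
]

if __name__ == "__main__":
    for customer, invoice in link_records(customers, invoices):
        print(customer["name"], "->", invoice["name"] if invoice else None)
```

The cutoff controls the trade-off: a higher value avoids false links but leaves more records unmatched, so it usually needs tuning against a sample of the real data.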

References:

  1. Python Tools for Record Linking and Fuzzy Matching