Fuzzy Matching: Solving Your Customer Data Cleaning and Linking issues once and for all
As marketers in the digital age we understand the importance of analytics, the quality of which is dependent on simply good,clean, data. If you are a marketer and you’re not worried about the quality of your customer data, you should be. According to Forbes Insights the vast majority of CEOs are concerned about the quality of their data. Research suggests that up to 40% of businesses are bleeding value to their bottom line because of improperly linked and misaligned customer data across various systems and platforms.
The more common problems caused by poor quality data include:
Many of these problems are caused by or result in data duplication and record linkage problems which are highly related issues. Not surprisingly, the techniques we can employ to identify matching records for both the cleansing of and removal of duplication / linkage issues are the same.
Technopedia defines data enrichment as “a general term that refers to processes used to enhance, refine or otherwise improve raw data”. The methods employed to perform successful data enrichment typically fall into one of two categories: automated or manual. The tools leveraged within each data enrichment category can be further classified into the following: ETL, Adding / Gap filling / matching, and Big data. Manual data enrichment as you would expect is the oldest method of doing data enrichment and frankly it is still the most trustworthy method of handling the more intricate use cases involving data quality in my opinion. People still (atleast at the moment) are better at spotting the more complicated, anomalous data integrity issues that machines haven’t mastered (yet). Use cases for manual data enrichment are still necessary and critical to the overall data enrichment process, particularly in the Adding / Gap filling / and matching space. Enter the Fuzzy lookup.
Fuzzy Lookup: An underutilized, extremely powerful, and FREE solution
As opposed to the typical and binary “Yes it matched” or “Sorry try again” approach to data matching, this powerful and free excel plug-in identifies the probability that two records represent the same item based on a number of identifying factors. The tool allows you to choose confidence weight or “similarity threshold” which allows you to optimize your matching capabilities and limit false positives, which are really the only issue encountered with fuzzy matching.
Historically this type of capability was considered an extremely complex, very costly (think six figures plus), and it would typically take months or even years to deliver a solution with a reasonable ROI. However,those days are long gone as advancements in data quality software have accelerated over recent years.
The Fuzzy Lookup add-in for Excel performs fuzzy matching of text-based data in Excel.
This functionality can be leveraged to fix difficult problems like weeding out duplicates within a table where the duplicates TRULY ARE duplicates but don’t match exactly or to “fuzzy join” similar data between different tables.
The add-in is extremely powerful, especially for someone who has only utilized VLOOKUP or similar functions in the past.
Other thanthe add-in, all you’re going to need are two or more data-sets that exist in different tabs of an excel file that you would like to compare and find matches within.
After your add-in has been installed you will notice that a “Fuzzy Lookup” item will now appear in the toolbar at the top of your Excel application. Clicking the tab opens this Fuzzy Lookup task pane
The process to start finding matches simply requires you to select one column from each table to create a fuzzy lookup between the values within those columns. Inessence, you’re matching rows by identifying similar items between these columns.
Is it starting to sink in just how powerful this tool can be? None of the values you’re trying to match need to be exactly the same or even very similar for fuzzy lookup to do its thing, the data just needs to be similar enough or relatively similar enough to identify possible matches in the data to be successfully made.
Joining imperfect data has never been easier. To get started with Fuzzy Lookups,install the add-in by following these instructions.
Data Enrichment is a necessary process for any organizations depending on the validity of their data and for the foreseeable future there is going to be room for humans to be in the loop with the assistance of tech like fuzzy lookup.
Although a broad description, this introduction to useful manual data enrichment techniques can be applied very specifically to data within your CRM or other application, but hopefully this helps you understand just some of the free and powerful tools organizations have at their disposal.