
Fuzzy Match

The Fuzzy Match tool identifies non-identical duplicates in a dataset by matching on parameters you specify. Values need not be identical to match; they only need to fall within the user-specified or predefined parameters set in the configuration properties.

The most effective way to configure the Fuzzy Match tool is to apply the match process to multiple fields in the input file. Configure each field individually with either a predefined or a custom Match Style, which you set in the Edit Match Options dialog.

Fuzzy matching works only with Latin character sets, and some match capabilities are compatible only with the English language.

Configuration Properties:

The input data stream MUST include a unique identifier for each record. If the input has no such key field, add a Record ID tool one step upstream.
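The unique-identifier requirement can be illustrated outside the tool. The following is a minimal Python sketch, not the Record ID tool itself: the helper names `add_record_ids` and `duplicate_ids` and the field name `RecordID` are assumptions made for illustration.

```python
from collections import Counter


def add_record_ids(records, start=1):
    """Return copies of the records with a sequential RecordID field (hypothetical helper)."""
    return [{**rec, "RecordID": i} for i, rec in enumerate(records, start=start)]


def duplicate_ids(records, key="RecordID"):
    """Return the sorted list of IDs that occur more than once (hypothetical helper)."""
    counts = Counter(rec[key] for rec in records)
    return sorted(k for k, n in counts.items() if n > 1)


rows = [{"Name": "Jon Smith"}, {"Name": "John Smith"}]
with_ids = add_record_ids(rows)

# An empty list means every record has its own identifier.
print(duplicate_ids(with_ids))
```

When records come from more than one source, a check like `duplicate_ids` on the combined data confirms that the IDs are also unique across sources.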

  1. Choose the mode in which to apply the Fuzzy Match tool. Choices are:

  2. Specify the unique Record ID field. The value must be unique for each record and across all input sources. You can append a record ID to each input with the Record ID tool.

  3. Specify the Match Threshold. The default is 80%. If the match score generated by the Fuzzy Match tool is below the specified threshold, the record does not qualify as a match.

  4. Select the Field Name to match on. Any field already in the input file is available from this drop-down list.

  5. Select the Match Style from the drop-down list. Choices include:

  6. If necessary, click the Edit button to edit the Match Style. The Edit Match Options dialog opens.

  7. Specify additional output fields and settings:

Click Apply to accept the configuration.
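The interplay of match fields and the Match Threshold in the steps above can be sketched in Python. This is an illustration only and does not reproduce the tool's scoring: it assumes string similarity from the standard library's `difflib.SequenceMatcher`, averages the per-field scores, and uses hypothetical names (`field_score`, `fuzzy_pairs`) and made-up sample records.

```python
from difflib import SequenceMatcher
from itertools import combinations


def field_score(a, b):
    """Similarity of two strings as a percentage (0-100), ignoring case."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_pairs(records, fields, threshold=80.0, id_field="RecordID"):
    """Return (id, id, score) for record pairs whose average field score meets the threshold."""
    matches = []
    for left, right in combinations(records, 2):
        scores = [field_score(left[f], right[f]) for f in fields]
        score = sum(scores) / len(scores)  # assumption: plain average across fields
        if score >= threshold:
            matches.append((left[id_field], right[id_field], round(score, 1)))
    return matches


# Made-up sample data; each record carries a unique RecordID.
records = [
    {"RecordID": 1, "Name": "Jon Smith", "City": "Boulder"},
    {"RecordID": 2, "Name": "John Smith", "City": "Boulder"},
    {"RecordID": 3, "Name": "Maria Lopez", "City": "Denver"},
]

pairs = fuzzy_pairs(records, ["Name", "City"])
for left_id, right_id, score in pairs:
    print(left_id, right_id, score)
```

With the default threshold of 80, only records 1 and 2 qualify as a match in this sketch. Raising the threshold makes matching stricter, and lowering it admits weaker matches.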

For information regarding Input, Output, Annotation and Error Properties, see Tool Properties.
