
Robust record linkage blocking using suffix arrays and bloom filters

De Vries, Timothy; Ke, Hui; Chawla, Sanjay; Christen, Peter


Record linkage is an important data integration task that has many practical uses for matching, merging and duplicate removal in large and diverse databases. However, quadratic scalability for the brute-force approach of comparing all possible pairs of records necessitates the design of appropriate indexing or blocking techniques. The aim of these techniques is to cheaply remove candidate record pairs that are unlikely to match. We design and evaluate an efficient and highly scalable blocking...

Collections: ANU Research Publications
Date published: 2011
Type: Journal article
Source: ACM Transactions on Knowledge Discovery from Data
DOI: 10.1145/1921632.1921635


File: 01_De Vries_Robust_record_linkage_blocking_2011.pdf
Size: 529.93 kB
Format: Adobe PDF

Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.
