Clustering-Based Scalable Indexing for Multi-party Privacy

Date

2015

Authors

Ranbaduge, Thilina
Vatsalan, Dinusha
Christen, Peter

Journal Title

Journal ISSN

Volume Title

Publisher

Springer International Publishing AG

Abstract

The identification of common sets of records in multiple databases has become an increasingly important subject in many application areas, including banking, health, and national security. Privacy concerns and regulations often prevent the owners of the databases from sharing any sensitive details of their records with each other or with any other party. The linkage of records in multiple databases while preserving privacy and confidentiality is an emerging research discipline known as privacy-preserving record linkage (PPRL). We propose a novel two-step indexing (blocking) approach for PPRL between multiple (more than two) parties. First, we generate small mini-blocks using a multi-bit Bloom filter splitting method; second, we merge these mini-blocks based on their similarity using a novel hierarchical canopy clustering technique. An empirical study conducted with large datasets of up to one million records shows that our approach is scalable with the size of the datasets and the number of parties, while providing better privacy than previous multi-party indexing approaches.
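The two-step idea in the abstract can be illustrated with a minimal sketch. This is not the authors' protocol: it runs on a single party's data, omits all privacy and multi-party steps, and swaps the hierarchical canopy clustering for a simple greedy canopy-style merge. All names and parameters (`BF_LEN`, `NUM_HASH`, `mini_blocks`, `merge_blocks`, the choice of bit positions, the threshold) are hypothetical and chosen only for illustration.

```python
import hashlib
from collections import defaultdict

BF_LEN = 64    # hypothetical Bloom filter length (assumption)
NUM_HASH = 2   # hypothetical number of hash functions (assumption)


def qgrams(value, q=2):
    """Return the set of character q-grams of a lower-cased string."""
    value = value.lower()
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def bloom_filter(value):
    """Encode a string as the set of bit positions that are set to 1."""
    bits = set()
    for gram in qgrams(value):
        for seed in range(NUM_HASH):
            digest = hashlib.sha256(f"{seed}:{gram}".encode()).hexdigest()
            bits.add(int(digest, 16) % BF_LEN)
    return frozenset(bits)


def mini_blocks(filters, positions):
    """Step 1 (simplified): group record ids by their bit pattern
    at the chosen Bloom filter positions."""
    blocks = defaultdict(list)
    for rec_id, bits in filters.items():
        key = tuple(int(p in bits) for p in positions)
        blocks[key].append(rec_id)
    return dict(blocks)


def key_similarity(key_a, key_b):
    """Fraction of positions on which two block keys agree."""
    return sum(a == b for a, b in zip(key_a, key_b)) / len(key_a)


def merge_blocks(blocks, threshold):
    """Step 2 (simplified): greedy canopy-style merge. Each remaining
    block key in turn becomes a centre and absorbs every other block
    whose key is at least `threshold` similar to it."""
    remaining = sorted(blocks)
    merged = []
    while remaining:
        centre = remaining.pop(0)
        group = list(blocks[centre])
        keep = []
        for key in remaining:
            if key_similarity(centre, key) >= threshold:
                group.extend(blocks[key])
            else:
                keep.append(key)
        remaining = keep
        merged.append(sorted(group))
    return merged
```

In this sketch, a caller would build `filters = {rec_id: bloom_filter(name)}` for each record, pick a few bit positions, call `mini_blocks`, and then pass the result to `merge_blocks`. How the bit positions are selected and how block sizes are bounded are details of the paper that this sketch does not attempt to reproduce.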

Description

Keywords

Citation

Source

Efficient Interactive Training Selection for Large-Scale Entity Resolution

Type

Conference paper

Book Title

Entity type

Access Statement

License Rights

DOI

10.1007/978-3-319-18032-8_43

Restricted until

2037-12-31