Clustering-Based Scalable Indexing for Multi-party Privacy

Authors

Ranbaduge, Thilina
Vatsalan, Dinusha
Christen, Peter

Publisher

Springer International Publishing AG

Abstract

The identification of common sets of records in multiple databases has become an increasingly important subject in many application areas, including banking, health, and national security. Privacy concerns and regulations often prevent the owners of the databases from sharing any sensitive details of their records with each other or with any other party. The linkage of records in multiple databases while preserving privacy and confidentiality is an emerging research discipline known as privacy-preserving record linkage (PPRL). We propose a novel two-step indexing (blocking) approach for PPRL between multiple (more than two) parties. First, we generate small mini-blocks using a multi-bit Bloom filter splitting method; second, we merge these mini-blocks based on their similarity using a novel hierarchical canopy clustering technique. An empirical study conducted with large datasets of up to one million records shows that our approach is scalable with the size of the datasets and the number of parties, while providing better privacy than previous multi-party indexing approaches.
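
The two-step idea in the abstract can be illustrated with a small single-dataset sketch. It is not the authors' algorithm: the bit positions in `split_bits`, the Bloom filter parameters, the Dice threshold, and every function name are assumptions chosen for illustration, and the merge step is a flat, greedy canopy-style pass that stands in for the hierarchical canopy clustering the paper proposes. The multi-party protocol itself is not modelled.

```python
import hashlib
from collections import defaultdict

BF_LEN = 64    # Bloom filter length in bits (illustrative value)
NUM_HASH = 2   # hash functions per q-gram (illustrative value)

def bloom_filter(value, q=2):
    """Encode the q-grams of a string as the set of bit positions set to 1."""
    bits = set()
    for i in range(len(value) - q + 1):
        gram = value[i:i + q]
        for seed in range(NUM_HASH):
            digest = hashlib.sha1(f"{seed}:{gram}".encode()).hexdigest()
            bits.add(int(digest, 16) % BF_LEN)
    return frozenset(bits)

def mini_blocks(filters, split_bits):
    """Step 1: group records by their bit pattern at the chosen positions."""
    blocks = defaultdict(list)
    for rec_id, bits in filters.items():
        key = tuple(int(pos in bits) for pos in split_bits)
        blocks[key].append(rec_id)
    return dict(blocks)

def dice(a, b):
    """Dice coefficient of two sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

def merge_blocks(blocks, filters, threshold):
    """Step 2 (simplified): greedily merge mini-blocks whose representative
    bit sets are similar to a chosen centre block."""
    # Representative of a block: union of the set bits of its records.
    reps = {key: frozenset().union(*(filters[r] for r in recs))
            for key, recs in blocks.items()}
    remaining = sorted(blocks)
    merged = []
    while remaining:
        centre = remaining.pop(0)
        canopy = list(blocks[centre])
        keep = []
        for key in remaining:
            if dice(reps[centre], reps[key]) >= threshold:
                canopy.extend(blocks[key])   # absorb the similar mini-block
            else:
                keep.append(key)
        remaining = keep
        merged.append(sorted(canopy))
    return merged

# Made-up example records (record id -> name).
names = {1: "peter", 2: "petra", 3: "dinusha", 4: "dinisha", 5: "thilina"}
filters = {rid: bloom_filter(name) for rid, name in names.items()}
blocks = mini_blocks(filters, split_bits=[3, 17, 42])
merged = merge_blocks(blocks, filters, threshold=0.6)
```

Each record lands in exactly one mini-block and then in exactly one merged block. Raising `threshold` keeps the blocks small, while lowering it merges more mini-blocks together, which is the trade-off between comparison cost and the chance of missing true matches that blocking schemes of this kind have to balance.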

Source

Efficient Interactive Training Selection for Large-Scale Entity Resolution

Restricted until

2037-12-31