Regression classification for Improved Temporal Record Linkage
dc.contributor.author | Wang, Qing | |
dc.contributor.author | Vatsalan, Dinusha | |
dc.contributor.author | Christen, Peter | |
dc.contributor.author | Hu, Yichen | |
dc.contributor.editor | Estivill-Castro, V. | |
dc.contributor.editor | Simoff, S. | |
dc.coverage.spatial | Canberra, Australia | |
dc.date.accessioned | 2024-02-20T00:21:12Z | |
dc.date.created | December 6-8 2016 | |
dc.date.issued | 2016 | |
dc.date.updated | 2022-10-02T07:20:20Z | |
dc.description.abstract | Temporal record linkage is the process of identifying groups of records that represent the same real-world entities in data collected over long periods of time, such as census databases or voter registration databases. These datasets often contain temporal information for each record, such as the time when a record was created or the time when it was modified. Unlike traditional record linkage, which treats differences between records from the same entity as errors or variations, temporal record linkage aims to capture records from entities whose details change over time. This paper proposes a temporal record linkage approach that learns the probabilities for attribute values of records to change within different periods of time, which extends an existing temporal approach, the decay model. The proposed method uses a regression-based machine learning model to predict decay with sets of attributes, where attribute values in each set could affect the decay of others. Our experimental results show that the proposed approach results in generally better recall than baseline approaches on real-world datasets. | en_AU |
dc.description.sponsorship | This work was partially funded by the Australian Research Council (ARC) under Discovery Project DP160101934. | en_AU |
dc.format.mimetype | application/pdf | en_AU |
dc.identifier.uri | http://hdl.handle.net/1885/313756 | |
dc.language.iso | en_AU | en_AU |
dc.publisher | Australasian Data Mining Conference | en_AU |
dc.relation | http://purl.org/au-research/grants/arc/DP160101934 | en_AU |
dc.relation.ispartofseries | Australasian Data Mining Conference (AusDM 2016) | en_AU |
dc.rights | Copyright © 2016, Australian Computer Society, Inc. | en_AU |
dc.source | Conferences in Research and Practice in Information Technology | en_AU |
dc.subject | Data matching | en_AU |
dc.subject | entity resolution | en_AU |
dc.subject | record linkage | en_AU |
dc.subject | temporal data | en_AU |
dc.title | Regression classification for Improved Temporal Record Linkage | en_AU |
dc.type | Conference paper | en_AU |
local.bibliographicCitation.lastpage | 10 | en_AU |
local.bibliographicCitation.startpage | 1 | en_AU |
local.contributor.affiliation | Wang, Qing, College of Engineering and Computer Science, ANU | en_AU |
local.contributor.affiliation | Vatsalan, Dinusha, College of Engineering and Computer Science, ANU | en_AU |
local.contributor.affiliation | Christen, Peter, College of Engineering and Computer Science, ANU | en_AU |
local.contributor.affiliation | Hu, Yichen, College of Engineering and Computer Science, ANU | en_AU |
local.contributor.authoremail | u4021539@anu.edu.au | en_AU |
local.contributor.authoruid | Wang, Qing, u5170295 | en_AU |
local.contributor.authoruid | Vatsalan, Dinusha, u4908149 | en_AU |
local.contributor.authoruid | Christen, Peter, u4021539 | en_AU |
local.contributor.authoruid | Hu, Yichen, u5986120 | en_AU |
local.description.embargo | 2099-12-31 | |
local.description.notes | Imported from ARIES | en_AU |
local.description.refereed | Yes | |
local.identifier.absfor | 460507 - Information extraction and fusion | en_AU |
local.identifier.absfor | 460504 - Data quality | en_AU |
local.identifier.absfor | 460502 - Data mining and knowledge discovery | en_AU |
local.identifier.ariespublication | u6048437xPUB390 | en_AU |
local.identifier.uidSubmittedBy | u6048437 | en_AU |
local.publisher.url | https://ausdm.org/archive/ausdm16/index.html | en_AU |
local.type.status | Published Version | en_AU |
Downloads
Original bundle
- Name: Regression classifier for Improved Temporal Record Linkage.pdf
- Size: 635.61 KB
- Format: Adobe Portable Document Format