Multiple Instance Learning for Group Record Linkage

dc.contributor.author: Fu, Sally
dc.contributor.author: Zhou, Jun
dc.contributor.author: Christen, Peter
dc.contributor.author: Boot, Hector
dc.date.accessioned: 2015-12-10T22:58:13Z
dc.date.issued: 2012
dc.date.updated: 2016-02-24T11:59:35Z
dc.description.abstract: Record linkage is the process of identifying records from different data sources that refer to the same entities. While most research efforts are concerned with linking individual records, new approaches have recently been proposed to link groups of records across databases. Group record linkage aims to determine whether two groups of records in two databases refer to the same entity. One application where group record linkage is of high importance is the linking of census data that contain household information across time. In this paper we propose a novel approach to group record linkage based on multiple instance learning. Our method treats group links as bags and individual record links as instances. We extend multiple instance learning from bag to instance classification to reconstruct bags from candidate instances. The classified bag and instance samples lead to a significant reduction in multiple group links, thereby improving the overall quality of linked data. We evaluate our method with both synthetic data and real historical census data.
dc.identifier.isbn: 9783642302176
dc.identifier.uri: http://hdl.handle.net/1885/60764
dc.publisher: Springer
dc.relation.ispartof: Advances in Knowledge Discovery and Data Mining: 16th Pacific-Asia Conference, PAKDD 2012: Kuala Lumpur, Malaysia, May 29 - June 1, 2012: Proceedings, Part I
dc.relation.isversionof: 1st Edition
dc.subject: Keywords: Across time; Census data; Data source; Linked datum; Multiple instance learning; Multiple-group; Overall quality; Record linkage; Research efforts; Synthetic data; Data mining; Learning systems; Population statistics; Data handling; entity resolution; historical census data; instance classification
dc.title: Multiple Instance Learning for Group Record Linkage
dc.type: Book chapter
local.bibliographicCitation.lastpage: 182
local.bibliographicCitation.placeofpublication: Berlin Germany
local.bibliographicCitation.startpage: 171
local.contributor.affiliation: Fu, Sally, College of Engineering and Computer Science, ANU
local.contributor.affiliation: Zhou, Jun, College of Engineering and Computer Science, ANU
local.contributor.affiliation: Christen, Peter, College of Engineering and Computer Science, ANU
local.contributor.affiliation: Boot, Hector, College of Arts and Social Sciences, ANU
local.contributor.authoruid: Fu, Sally, u4802791
local.contributor.authoruid: Zhou, Jun, u1818501
local.contributor.authoruid: Christen, Peter, u4021539
local.contributor.authoruid: Boot, Hector, u7000502
local.description.embargo: 2037-12-31
local.description.notes: Imported from ARIES
local.description.refereed: Yes
local.identifier.absfor: 080109 - Pattern Recognition and Data Mining
local.identifier.absfor: 080306 - Open Software
local.identifier.absfor: 160305 - Population Trends and Policies
local.identifier.absseo: 970108 - Expanding Knowledge in the Information and Computing Sciences
local.identifier.absseo: 970116 - Expanding Knowledge through Studies of Human Society
local.identifier.absseo: 970121 - Expanding Knowledge in History and Archaeology
local.identifier.ariespublication: u9406909xPUB561
local.identifier.doi: 10.1007/978-3-642-30217-6_15
local.identifier.scopusID: 2-s2.0-84861452098
local.type.status: Published Version

Downloads

Original bundle

Name: 01_Fu_Multiple_Instance_Learning_for_2012.pdf
Size: 241.32 KB
Format: Adobe Portable Document Format