The potential of automatic word comparison for historical linguistics

dc.contributor.author: List, Johann Mattis
dc.contributor.author: Greenhill, Simon
dc.contributor.author: Gray, Russell D
dc.date.accessioned: 2021-10-01T00:41:04Z
dc.date.available: 2021-10-01T00:41:04Z
dc.date.issued: 2017
dc.date.updated: 2020-11-23T11:19:34Z
dc.description.abstract: The amount of data from languages spoken all over the world is rapidly increasing. Traditional manual methods in historical linguistics need to face the challenges brought by this influx of data. Automatic approaches to word comparison could provide invaluable help to pre-analyze data which can be later enhanced by experts. In this way, computational approaches can take care of the repetitive and schematic tasks, leaving experts to concentrate on answering interesting questions. Here we test the potential of automatic methods to detect etymologically related words (cognates) in cross-linguistic data. Using a newly compiled database of expert cognate judgments across five different language families, we compare how well different automatic approaches distinguish related from unrelated words. Our results show that automatic methods can identify cognates with a very high degree of accuracy, reaching 89% for the best-performing method, Infomap. We identify the specific strengths and weaknesses of these different methods and point to major challenges for future approaches. Current automatic approaches for cognate detection, although not perfect, could become an important component of future research in historical linguistics. [en_AU]
dc.description.sponsorship: As part of the GlottoBank Project, this work was supported by the Max Planck Institute for the Science of Human History and the Royal Society of New Zealand Marsden Fund grant 13-UOA-121. This paper was further supported by the DFG research fellowship grant 261553824 “Vertical and lateral aspects of Chinese dialect history” (JML), and the Australian Research Council’s Discovery Projects funding scheme (project number DE120101954, SJG). [en_AU]
dc.format.mimetype: application/pdf [en_AU]
dc.identifier.issn: 1932-6203 [en_AU]
dc.identifier.uri: http://hdl.handle.net/1885/249103
dc.language.iso: en_AU [en_AU]
dc.provenance: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. [en_AU]
dc.publisher: Public Library of Science [en_AU]
dc.relation: http://purl.org/au-research/grants/arc/DE120101954 [en_AU]
dc.rights: © 2017 List et al. [en_AU]
dc.rights.license: Creative Commons License (Attribution 4.0 International) [en_AU]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_AU]
dc.source: PLOS ONE (Public Library of Science) [en_AU]
dc.title: The potential of automatic word comparison for historical linguistics [en_AU]
dc.type: Journal article [en_AU]
dcterms.accessRights: Open Access [en_AU]
local.bibliographicCitation.issue: 1 [en_AU]
local.bibliographicCitation.lastpage: e0170046 [en_AU]
local.bibliographicCitation.startpage: e0170046 [en_AU]
local.contributor.affiliation: List, Johann Mattis, EHESS [en_AU]
local.contributor.affiliation: Greenhill, Simon, College of Asia and the Pacific, ANU [en_AU]
local.contributor.affiliation: Gray, Russell D, Max Planck Institute for the Science of Human History [en_AU]
local.contributor.authoremail: u5232172@anu.edu.au [en_AU]
local.contributor.authoruid: Greenhill, Simon, u5232172 [en_AU]
local.description.notes: Imported from ARIES [en_AU]
local.identifier.absfor: 200499 - Linguistics not elsewhere classified [en_AU]
local.identifier.ariespublication: a383154xPUB6398 [en_AU]
local.identifier.citationvolume: 12 [en_AU]
local.identifier.doi: 10.1371/journal.pone.0170046 [en_AU]
local.identifier.scopusID: 2-s2.0-85011094636
local.identifier.thomsonID: 000396211400065
local.identifier.uidSubmittedBy: a383154 [en_AU]
local.publisher.url: http://www.plos.org/ [en_AU]
local.type.status: Published Version [en_AU]

Downloads

Original bundle

Name: 01_List_The_Potential_of_Automatic_2017.pdf
Size: 4.54 MB
Format: Adobe Portable Document Format