Automated Dating of the World's Language Families Based on Lexical Similarity

Date

2011

Authors

Holman, Eric W.
Brown, Cecil H.
Wichmann, Søren
Müller, André
Velupillai, Viveka
Hammarström, Harald
Sauppe, Sebastian
Jung, Hagen
Bakker, Dik
Brown, Pamela

Publisher

University of Chicago Press

Abstract

This paper describes a computerized alternative to glottochronology for estimating elapsed time since parent languages diverged into daughter languages. The method, developed by the Automated Similarity Judgment Program (ASJP) consortium, is different from glottochronology in four major respects: (1) it is automated and thus is more objective, (2) it applies a uniform analytical approach to a single database of worldwide languages, (3) it is based on lexical similarity as determined from Levenshtein (edit) distances rather than on cognate percentages, and (4) it provides a formula for date calculation that mathematically recognizes the lexical heterogeneity of individual languages, including parent languages just before their breakup into daughter languages. Automated judgments of lexical similarity for groups of related languages are calibrated with historical, epigraphic, and archaeological divergence dates for 52 language groups. The discrepancies between estimated and calibration dates are found to be on average 29% as large as the estimated dates themselves, a figure that does not differ significantly among language families. As a resource for further research that may require dates of known level of accuracy, we offer a list of ASJP time depths for nearly all the world's recognized language families and for many subfamilies.
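
The Levenshtein (edit) distance mentioned in the abstract can be illustrated with a minimal Python sketch. The function names `levenshtein` and `normalized_distance` and the example words are hypothetical, chosen here for illustration, and are not taken from the ASJP software. Dividing by the length of the longer word is an assumed normalization. The sketch does not reproduce the calibration or the date formula described in the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the length of the longer word
    (assumed normalization; ranges from 0.0 to 1.0)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


if __name__ == "__main__":
    # Hypothetical word pair, used only to exercise the functions.
    print(levenshtein("hand", "hant"))
    print(normalized_distance("hand", "hant"))
```

A lower normalized distance means the two word forms are more similar. Averaging such distances over a list of words with the same meanings would give one possible similarity score for a language pair, though how the paper aggregates and calibrates these scores is not specified in this record.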

Source

Current Anthropology

Type

Journal article

Restricted until

2037-12-31