Crowd-sourced Text Analysis: Reproducible and Agile Production of Political Data
Date
2016
Authors
Benoit, Kenneth
Conway, Drew
Lauderdale, Benjamin E
Laver, Michael
Mikhaylov, Slava
Publisher
Cambridge University Press
Abstract
Empirical social science often relies on data that are not observed in the field, but are
transformed into quantitative variables by expert researchers who analyze and interpret
qualitative raw sources. While generally considered the most valid way to produce data, this
expert-driven process is inherently difficult to replicate or to assess on grounds of reliability.
Using crowd-sourcing to distribute text for reading and interpretation by massive numbers of
non-experts, we generate results comparable to those using experts to read and interpret the same
texts, but do so far more quickly and flexibly. Crucially, the data we collect can be reproduced
and extended transparently, making crowd-sourced datasets intrinsically reproducible. This
focuses researchers’ attention on the fundamental scientific objective of specifying reliable and
replicable methods for collecting the data needed, rather than on the content of any particular
dataset. We also show that our approach works straightforwardly with different types of political
text, written in different languages. While findings reported here concern text analysis, they have
far-reaching implications for expert-generated data in the social sciences.
Source
American Political Science Review
Type
Journal article
Restricted until
2037-12-31