Mining eighteenth century ontologies: Machine learning and knowledge classification in the Encyclopédie

Authors

Horton, Russell
Morrissey, Robert
Olsen, Mark
Roe, Glenn
Voyer, Robert

Publisher

The Alliance of Digital Humanities Organizations

Abstract

The Encyclopédie of Denis Diderot and Jean le Rond d'Alembert was one of the most important and revolutionary intellectual products of the French Enlightenment. Mobilizing many of the great – and the not-so-great – philosophes of the 18th century, the Encyclopédie was a massive reference work for the arts and sciences, which sought to organize and transmit the totality of human knowledge while at the same time serving as a vehicle for critical thinking. In its digital form, it is a highly structured corpus: some 55,000 of its 77,000 articles were labeled with classes of knowledge by the editors, making it a perfect sandbox for experiments with supervised learning algorithms. In this study, we train a Naive Bayesian classifier on the labeled articles and use this model to determine class membership for the remaining articles. The model is then used to make binary comparisons between labeled texts from different classes, in an effort to extract the features that matter most for distinguishing one class from another. Reapplying the model to the originally classified articles leads us to question our previous assumptions about the consistency and coherence of the ontology developed by the Encyclopedists. Finally, by applying this model to another corpus from 18th-century France, the Journal de Trévoux, or Mémoires pour l'Histoire des Sciences & des Beaux-Arts, new light is shed on the domain of Literature as it was understood and defined by 18th-century writers.
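
The workflow summarized in the abstract (train on labeled articles, then assign classes to unlabeled ones) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a multinomial Naive Bayes model with add-one smoothing and whitespace tokenization, and the function names, toy articles, and class labels below are hypothetical examples invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real study would
    # likely normalize and filter the vocabulary more carefully.
    return text.lower().split()


def train(labeled_articles):
    """Count class frequencies and per-class word frequencies."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_articles:
        class_docs[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_docs, word_counts, vocab


def classify(model, text):
    """Return the class with the highest log posterior for the text."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label, n_docs in class_docs.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                # Add-one (Laplace) smoothing avoids zero probabilities.
                count = word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data standing in for labeled articles.
labeled = [
    ("the planet orbits the sun", "Astronomie"),
    ("the star and the comet in the sky", "Astronomie"),
    ("the verb agrees with the noun", "Grammaire"),
    ("the noun and the adjective in a sentence", "Grammaire"),
]
model = train(labeled)
print(classify(model, "a comet near the sun"))
print(classify(model, "the adjective modifies the noun"))
```

Running `classify` over the already-labeled articles and comparing each prediction with the editors' label would be one way to surface the kind of inconsistencies the abstract mentions.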

Source

Digital Humanities Quarterly 3.2 (2009)
