Word Classes in Indonesian: A Linguistic Reality or a Convenient Fallacy in Natural Language Processing?

Date

2012

Authors

Mistica, Meladel
Baldwin, Timothy
Arka, I Wayan

Publisher

ACL Anthology

Abstract

This paper looks at Indonesian (Bahasa Indonesia) and the claim that there is no noun-verb distinction within the language as it is spoken in regions such as Riau and Jakarta. We test this claim for the language as it is written by a variety of Indonesian speakers, using empirical methods traditionally used in part-of-speech induction. In this study we use only morphological patterns that we generate from a pre-existing morphological analyser. We find that once the distribution of the data points in our experiments matches the distribution of the text from which we gather our data, we obtain significant results that show a distinction between the class of nouns and the class of verbs in Indonesian. Furthermore, the results show promise that word classes may be labelled using morphological features alone, an approach that could be applied to out-of-vocabulary items.

Keywords

Data points; Empirical method; Indonesia; Indonesian; Jakarta; Morphological features; Morphological patterns; Natural language processing; Part of speech; Word class; Word class induction; Computational linguistics; Morphology

Source

Proceedings of PACLIC 25

Type

Conference paper

Restricted until

2037-12-31