Word Classes in Indonesian: A Linguistic Reality or a Convenient Fallacy in Natural Language Processing?
Date
2012
Authors
Mistica, Meladel
Baldwin, Timothy
Arka, I Wayan
Publisher
ACL Anthology
Abstract
This paper examines Indonesian (Bahasa Indonesia) and the claim that the language has no noun-verb distinction as it is spoken in regions such as Riau and Jakarta. We test this claim for the language as it is written by a variety of Indonesian speakers, using empirical methods traditionally used in part-of-speech induction. In this study we use only morphological patterns that we generate from a pre-existing morphological analyser. We find that once the distribution of the data points in our experiments matches the distribution of the text from which we gather our data, we obtain significant results that show a distinction between the class of nouns and the class of verbs in Indonesian. Furthermore, the results show promise that word classes may be labelled using morphological features alone, an approach that could be applied to out-of-vocabulary items.
Keywords
Data points; Empirical method; Indonesia; Indonesian; Jakarta; Morphological features; Morphological patterns; Natural language processing; Part of speech; Word class; Word class induction; Computational linguistics; Morphology
Source
Proceedings of PACLIC 25
Type
Conference paper
Restricted until
2037-12-31