Semi-Markov models for sequence segmentation
Authors
Shi, Qinfeng
Altun, Yasemin
Smola, Alexander
Vishwanathan, S
Publisher
OmniPress
Abstract
In this paper, we study the problem of automatically segmenting written text into paragraphs. This is inherently a sequence labeling problem; however, previous approaches ignore this dependency. We propose a novel approach for automatic paragraph segmentation, namely training Semi-Markov models discriminatively using a Max-Margin method. This method allows us to model the sequential nature of the problem and to incorporate features of a whole paragraph, such as paragraph coherence, which cannot be used in previous models. Experimental evaluation on four text corpora shows improvement over the previous state-of-the-art method on this task.
Source
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007)
Restricted until
2037-12-31