Semi-Markov models for sequence segmentation

Authors

Shi, Qinfeng
Altun, Yasemin
Smola, Alexander
Vishwanathan, S

Publisher

OmniPress

Abstract

In this paper, we study the problem of automatically segmenting written text into paragraphs. This is inherently a sequence labeling problem; however, previous approaches ignore the sequential dependency between segmentation decisions. We propose a novel approach to automatic paragraph segmentation: training Semi-Markov models discriminatively using a Max-Margin method. This method allows us to model the sequential nature of the problem and to incorporate features of a whole paragraph, such as paragraph coherence, which cannot be used in previous models. Experimental evaluation on four text corpora shows improvement over the previous state-of-the-art method on this task.
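
To illustrate how a semi-Markov model can score whole segments rather than individual positions, the following is a minimal sketch of segment-level dynamic-programming decoding. It is not the authors' implementation: the function name `segment`, the `max_len` cap, and the toy coherence score are assumptions made for illustration, and the max-margin training of the scoring function is not shown.

```python
from typing import Callable, List, Tuple


def segment(
    n: int,
    score: Callable[[int, int], float],
    max_len: int,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Find the highest-scoring split of items 0..n-1 into contiguous
    segments [start, end), each at most max_len items long.

    score(start, end) may look at the whole candidate segment, which is
    what lets segment-level features such as coherence enter the model.
    """
    best = [float("-inf")] * (n + 1)  # best[i]: best total score for items 0..i-1
    back = [0] * (n + 1)              # back[i]: start of the last segment ending at i
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            candidate = best[start] + score(start, end)
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start

    # Follow the back-pointers from the end to recover the segments.
    segments: List[Tuple[int, int]] = []
    end = n
    while end > 0:
        start = back[end]
        segments.append((start, end))
        end = start
    segments.reverse()
    return best[n], segments


# Toy example (hypothetical data): five sentences, each tagged with a topic.
topics = ["a", "a", "b", "b", "b"]


def toy_score(start: int, end: int) -> float:
    # Assumed coherence feature: reward long single-topic segments,
    # penalise segments that mix topics.
    if len(set(topics[start:end])) == 1:
        return float((end - start) ** 2)
    return -10.0


total, paragraphs = segment(len(topics), toy_score, max_len=5)
print(total, paragraphs)
```

In this toy setting the decoder places a single paragraph boundary between the second and third sentences. In a learned model, `toy_score` would be replaced by a weighted sum of features of the candidate paragraph.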

Source

Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007)

Restricted until

2037-12-31