Semi-Markov models for sequence segmentation

Authors

Shi, Qinfeng
Altun, Yasemin
Smola, Alex
Vishwanathan, S. V.N.

Abstract

In this paper, we study the problem of automatically segmenting written text into paragraphs. This is inherently a sequence labeling problem; however, previous approaches ignore this dependency. We propose a novel approach for automatic paragraph segmentation, namely training semi-Markov models discriminatively using a max-margin method. This method allows us to model the sequential nature of the problem and to incorporate features of a whole paragraph, such as paragraph coherence, which cannot be used in previous models. Experimental evaluation on four text corpora shows improvement over the previous state-of-the-art method on this task.
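
To illustrate why a semi-Markov model can use features of a whole paragraph, here is a minimal sketch of segment-level decoding by dynamic programming. It is not the authors' implementation: the function name `semi_markov_decode`, the `max_len` cap on paragraph length, and the hand-written `toy_score` (a stand-in for the learned, max-margin-trained scoring function) are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def semi_markov_decode(
    n: int,
    score: Callable[[int, int], float],
    max_len: int,
) -> List[Tuple[int, int]]:
    """Return the highest-scoring segmentation of sentences 0..n-1.

    Each segment is a half-open span (start, end). `score(start, end)` rates a
    whole candidate paragraph, so it may depend on every sentence in the span.
    """
    best = [float("-inf")] * (n + 1)  # best[j]: best score for the first j sentences
    back = [0] * (n + 1)              # back[j]: start of the last segment ending at j
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            candidate = best[start] + score(start, end)
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start

    # Follow the back-pointers from the end of the text to recover the segments.
    segments: List[Tuple[int, int]] = []
    end = n
    while end > 0:
        start = back[end]
        segments.append((start, end))
        end = start
    segments.reverse()
    return segments


# Hypothetical toy data: one topic label per sentence.
topics = ["a", "a", "b", "b", "b", "c"]


def toy_score(start: int, end: int) -> float:
    """Assumed coherence score: reward long single-topic spans, penalize mixed ones."""
    span = topics[start:end]
    return float(len(span) ** 2) if len(set(span)) == 1 else -1.0


paragraphs = semi_markov_decode(len(topics), toy_score, max_len=4)
print(paragraphs)
```

Because the score is assigned to a whole span rather than to a single boundary decision, a coherence-style feature can enter the model directly; in the discriminative setting described above, the hand-written `toy_score` would be replaced by a learned weighted sum of segment features.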

Entity type

Publication
