Shi, Qinfeng; Altun, Yasemin; Smola, Alexander; Vishwanathan, S
In this paper, we study the problem of automatically segmenting written text into paragraphs. This is inherently a sequence labeling problem; however, previous approaches ignore this sequential dependency. We propose a novel approach for automatic paragraph segmentation, namely training Semi-Markov models discriminatively using a Max-Margin method. This method allows us to model the sequential nature of the problem and to incorporate features of a whole paragraph, such as paragraph coherence, which cannot...
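To illustrate how a semi-Markov model can score whole paragraphs rather than individual sentence labels, the following is a minimal sketch of segment-level Viterbi decoding. It is not the authors' implementation: the function names `segment` and `toy_score`, the `max_len` bound, and the scoring rule are all assumptions made for illustration, and the max-margin training step that would learn the scoring function is not shown.

```python
from typing import Callable, List, Tuple


def segment(n: int,
            score: Callable[[int, int], float],
            max_len: int) -> List[Tuple[int, int]]:
    """Return the highest-scoring split of sentences 0..n-1 into
    contiguous segments [start, end), each at most max_len long.

    score(start, end) rates a whole candidate paragraph, so it can use
    features of the full span (hypothetical stand-in for a learned model).
    """
    best = [float("-inf")] * (n + 1)  # best[i]: best score for the first i sentences
    back = [0] * (n + 1)              # back[i]: start of the last segment ending at i
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            cand = best[start] + score(start, end)
            if cand > best[end]:
                best[end] = cand
                back[end] = start
    # Follow the back-pointers from the end to recover the segmentation.
    segments: List[Tuple[int, int]] = []
    end = n
    while end > 0:
        segments.append((back[end], end))
        end = back[end]
    segments.reverse()
    return segments


def toy_score(start: int, end: int) -> float:
    """Hypothetical scorer that prefers paragraphs of exactly three sentences."""
    return -float((end - start - 3) ** 2)


if __name__ == "__main__":
    print(segment(6, toy_score, max_len=4))
```

Because the score is attached to an entire span, a paragraph-level property such as coherence could in principle be plugged in as `score` without changing the decoder.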