Proceedings of the Tenth Conference on European Chapter of the Association for Computational Linguistics - EACL '03 2003
DOI: 10.3115/1067807.1067831

Linear text segmentation using a dynamic programming algorithm

Abstract: In this paper we introduce a dynamic programming algorithm to perform linear text segmentation by global minimization of a segmentation cost function which consists of: (a) within-segment word similarity and (b) prior information about segment length. The evaluation of the segmentation accuracy of the algorithm on Choi's text collection showed that the algorithm achieves the best segmentation accuracy so far reported in the literature.
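To make the idea in the abstract concrete, here is a minimal sketch of segmentation by dynamic programming. It is not the authors' implementation: the exact cost function is not given on this page, so the form below (a Gaussian-style length penalty minus a normalised within-segment similarity) and the names `segment`, `mu`, `sigma`, `gamma` and `r` are assumptions chosen for illustration. The input is assumed to be an N x N sentence similarity matrix.

```python
from typing import List, Sequence


def segment(sim: Sequence[Sequence[float]], mu: float, sigma: float,
            gamma: float = 0.5, r: float = 2.0) -> List[int]:
    """Return the exclusive end index of each segment in a minimum-cost segmentation.

    Assumed cost of a segment covering sentences s..t-1 (illustrative only):
        gamma * (length - mu)^2 / (2 * sigma^2)
        - (1 - gamma) * (sum of sim inside the segment) / length^r
    """
    n = len(sim)

    # 2-D prefix sums so the similarity mass of any square block is O(1).
    prefix = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            prefix[i + 1][j + 1] = (sim[i][j] + prefix[i][j + 1]
                                    + prefix[i + 1][j] - prefix[i][j])

    def seg_cost(s: int, t: int) -> float:
        length = t - s
        within = prefix[t][t] - prefix[s][t] - prefix[t][s] + prefix[s][s]
        length_penalty = (length - mu) ** 2 / (2 * sigma ** 2)
        return gamma * length_penalty - (1 - gamma) * within / length ** r

    # best[t] = minimum total cost of segmenting the first t sentences.
    best = [0.0] + [float("inf")] * n
    back = [0] * (n + 1)
    for t in range(1, n + 1):
        for s in range(t):
            cost = best[s] + seg_cost(s, t)
            if cost < best[t]:
                best[t] = cost
                back[t] = s

    # Follow the back-pointers from the end to recover the boundaries.
    bounds = []
    t = n
    while t > 0:
        bounds.append(t)
        t = back[t]
    return bounds[::-1]
```

Because every prefix of an optimal segmentation is itself optimal, the table `best` yields a global minimum of the assumed cost in O(N^2) segment evaluations, rather than a greedy or local solution.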

Cited by 18 publications (22 citation statements)
References 16 publications
“…Other researchers have adopted a variety of other approaches, for example: peak finding in a lexical cohesion curve (Hearst, 1997), minimization of an ad hoc segmentation cost function (Kehagias, Pavlina, & Petridis, 2003), converting the text segmentation problem to one of image segmentation then applying techniques from image processing (Ji & Zha, 2003), and using affinity propagation in factor graphs (Kazantseva & Szpakowicz, 2011). The application of statistical and computational methods to problems in authorship analysis has been the focus of much study. Koppel et al. (Koppel, Schler, & Argamon, 2009) surveyed this line of work, focused on three specific types of problems, and discussed how machine learning methods can be applied to those problems.…”
Section: Discussion (mentioning)
confidence: 99%
“…In this respect, LSA (latent semantic analysis) makes it possible to go beyond the classical vector model (Manning and Schütze, 1999: 539 ff.) frequently used to measure lexical cohesion between sentences (Hearst, 1997; Choi, 2000; Kehagias, Pavlina … LSA is not the only technique proposed to address these problems (see, for example, Morris and Hirst, 1991; Kozima, 1993; Ferret, 2002).…”
unclassified
“…and Petridis, 2003). In the vector model, the similarity between two sentences is based solely on the words they share.…”
unclassified
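The point made in the statement above, that the vector model scores two sentences only by the words they share, can be illustrated with a small sketch. This is a generic bag-of-words cosine similarity, not code from any of the cited papers; the function name `cosine_similarity` and the whitespace tokenisation are assumptions made for brevity.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the word-count vectors of two sentences."""
    va = Counter(a.lower().split())
    vb = Counter(b.lower().split())
    # Only words present in both sentences contribute to the dot product.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Two sentences that describe the same thing with different vocabulary, such as "the car stopped" and "an automobile halted", share no words and therefore score zero under this measure, which is the limitation the quoted passage attributes to the vector model.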
“…An extensive discussion of precisely the same problem addressed here, but with a different approach to its solution, is in [3], [4]. Work by Hubert [10], [11], with applications to meteorology, influenced Kehagias and co-workers [8], [15], [16], [17], [18], [19], who developed a dynamic programming algorithm much like ours, for applications such as text segmentation (see also [9]), where the raw data are provided in the form of a similarity matrix. [22] gives an O(kN^2) dynamic programming algorithm for finding the optimal partition of an interval into k blocks, for a given k. See also [20], [2] for related work.…”
Section: Introduction: the Problem (mentioning)
confidence: 99%
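The O(kN^2) bound mentioned in the statement above can be seen in a textbook formulation of the fixed-k partition problem, sketched below. This is not the algorithm of reference [22], whose details are not given here; the names `partition_k`, `block_cost` and `sse` are hypothetical, and the sketch assumes 1 <= k <= n and an additive cost over blocks.

```python
from typing import Callable, List, Sequence, Tuple


def partition_k(n: int, k: int,
                block_cost: Callable[[int, int], float]) -> Tuple[float, List[int]]:
    """Split positions 0..n-1 into exactly k contiguous blocks of minimum total cost.

    block_cost(s, t) is the cost of the block covering positions s..t-1.
    Returns (total cost, exclusive end index of each block).
    """
    inf = float("inf")
    # best[j][t] = minimum cost of splitting the first t positions into j blocks.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):            # k layers
        for t in range(j, n + 1):        # N end points per layer
            for s in range(j - 1, t):    # up to N start points per end point
                cost = best[j - 1][s] + block_cost(s, t)
                if cost < best[j][t]:
                    best[j][t] = cost
                    back[j][t] = s

    # Walk the back-pointers from the last block to the first.
    bounds = []
    t = n
    for j in range(k, 0, -1):
        bounds.append(t)
        t = back[j][t]
    return best[k][n], bounds[::-1]


def sse(data: Sequence[float]) -> Callable[[int, int], float]:
    """Example block cost: sum of squared deviations from the block mean."""
    def cost(s: int, t: int) -> float:
        block = data[s:t]
        mean = sum(block) / len(block)
        return sum((x - mean) ** 2 for x in block)
    return cost
```

The three nested loops perform at most k * N * N evaluations of `block_cost`, which gives the O(kN^2) count when each evaluation is constant time. The `sse` helper above recomputes each block from scratch for clarity, so a constant-time cost (for example via prefix sums) would be needed to reach that bound in practice.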