Information retrieval must match relevant texts to a given query. Selecting appropriate parts is useful when documents are long and only portions of them interest the user. In this paper, we describe a method that makes extensive use of natural language processing techniques for text segmentation based on topic change detection. The method requires an NLP parser and a semantic representation in Roget-based vectors. We ran the experiment on French documents, for which we have the appropriate tools, but the method could be transposed to any other language that meets the same requirements. The article sketches an overview of the NL understanding environment's functionalities and the algorithms behind our text segmentation method. An experiment in text segmentation is also presented, and its result in an information retrieval task is shown.
This paper presents a method for topical text segmentation based on semantic and syntactic distance and compares it to a well-known text segmentation algorithm, c99. To do so, we ran the two algorithms on a corpus of twenty-two French political speeches and compared their results. Our two conclusions are that the two approaches are complementary and that evaluation methods in this domain should be revised.
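The abstract above does not detail either algorithm, but the family of methods it compares rests on lexical cohesion: a topic boundary is hypothesized where the vocabulary of adjacent units stops overlapping. The following is a minimal illustrative sketch of that idea in Python, using plain word-count cosine similarity between adjacent sentences; it is neither the authors' semantic/syntactic distance measure nor the actual c99 algorithm, and the `threshold` value is an arbitrary assumption for the example.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    num = sum(a[w] * b[w] for w in a)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def lexical_boundaries(sentences, threshold=0.1):
    """Hypothesize a topic boundary after sentence i whenever the lexical
    similarity between sentence i and sentence i+1 drops below threshold."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    return [i for i in range(len(vecs) - 1)
            if cosine(vecs[i], vecs[i + 1]) < threshold]

# Toy example: two sentences about taxes, then two about the army.
speech = [
    "the tax reform lowers tax rates",
    "tax cuts and tax reform help growth",
    "our army needs new army equipment",
    "army funding and army training",
]
print(lexical_boundaries(speech))  # a boundary after sentence index 1
```

Real systems refine both sides of this sketch: c99 applies a rank transform to the full sentence-similarity matrix before clustering, and the paper's method replaces raw word overlap with semantic and syntactic distances.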
The goal of this paper is to demonstrate that the usual evaluation methods for text segmentation are not suited to every task linked to text segmentation. To do so, we differentiated the task of finding text boundaries in a corpus of concatenated texts from the task of finding transitions between topics inside a single text. We worked on a corpus of twenty-two French political speeches, trying to find the boundaries between them when they are concatenated, and to find topic boundaries inside them when they are not. We compared the results of our distance-based method to those of the well-known c99 algorithm.
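The "usual evaluation methods" the abstract questions are window-based error metrics, the best known being Pk (Beeferman et al., 1999): slide a window of fixed width k over the text and count how often the reference and the hypothesis disagree about whether the two window ends fall in the same segment. A minimal sketch, assuming segmentations encoded as strings of '0'/'1' with '1' marking a boundary:

```python
def pk(reference, hypothesis, k=None):
    """Pk segmentation error: the fraction of windows of width k on which
    reference and hypothesis disagree about same-segment membership.
    Both arguments are '0'/'1' strings; '1' marks a boundary at a position."""
    if k is None:
        # Conventional choice: half the mean reference segment length.
        k = max(1, round(len(reference) / (reference.count("1") + 1) / 2))
    n = len(reference) - k
    errors = 0
    for i in range(n):
        ref_same = "1" not in reference[i:i + k]   # same segment in reference?
        hyp_same = "1" not in hypothesis[i:i + k]  # same segment in hypothesis?
        errors += ref_same != hyp_same
    return errors / n

print(pk("0001000100", "0001000100"))  # identical segmentations: 0.0
```

Because every window counts equally, Pk penalizes a missed boundary between two concatenated texts exactly as it penalizes a missed topic shift inside one text, which is the conflation of tasks the paper objects to.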
Abstract. This paper proposes a topical text segmentation method based on intended-boundary detection and compares it to a well-known default-boundary detection method, c99. We ran the two methods on a corpus of twenty-two French political speeches, and the results showed that intended-boundary detection outperforms default-boundary detection on well-structured text.
This paper presents the computational aspects of the SemComp project, a multidisciplinary collaboration aiming to observe how interacting with documents affects knowledge acquisition. It is based on a model for personalized semantic resources inspired by componential linguistics. The paper describes advances in both the computational model's definition and its implementation in a Web-oriented application. Functionalities and technical choices are presented with regard to the planned experiments.