Detecting topics in large volumes of text is an active research area with many applications in the development of computational systems. One approach to topic detection in data mining is to apply clustering methods. This paper presents a new ontology-based methodology for automatic topic detection that requires no prior information and relies on hierarchical clustering algorithms and a multilingual knowledge base. The approach also uses lexical resources to enrich the semantics of the analyzed texts. Its novelty lies in reducing the dimensionality of the terms present in the texts by means of an ontology, and in introducing a method for building a term weight matrix for use in clustering algorithms. With this approach, it is possible to improve automatic topic detection in documents. The proposed methodology was assessed on four datasets, two in English and two in Spanish.
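To make the pipeline described above more concrete, the following is a minimal sketch of its three stages: mapping terms to concepts, building a term weight matrix, and hierarchical clustering. It is an illustration under stated assumptions, not the paper's method. The `TERM_TO_CONCEPT` dictionary is a hypothetical stand-in for the ontology and multilingual knowledge base, the smoothed TF-IDF weighting is a common substitute for the paper's own weighting scheme (which the abstract does not specify), and average-linkage agglomerative clustering over cosine distance is only one of several possible hierarchical algorithms.

```python
import math
from collections import Counter

# Hypothetical toy ontology (an assumption for illustration only): English and
# Spanish surface terms map to one shared concept, so the feature space shrinks
# from 8 terms to 4 concepts. A real system would query a knowledge base.
TERM_TO_CONCEPT = {
    "car": "automobile", "coche": "automobile",
    "engine": "engine", "motor": "engine",
    "goal": "goal", "gol": "goal",
    "team": "team", "equipo": "team",
}


def to_concepts(text):
    """Lowercase, split on whitespace, and keep only terms the ontology knows."""
    return [TERM_TO_CONCEPT[t] for t in text.lower().split() if t in TERM_TO_CONCEPT]


def weight_matrix(docs):
    """Build a documents x concepts matrix of smoothed TF-IDF weights."""
    bags = [Counter(to_concepts(d)) for d in docs]
    vocab = sorted({c for bag in bags for c in bag})
    n = len(docs)
    idf = {
        c: math.log((1 + n) / (1 + sum(1 for bag in bags if c in bag))) + 1
        for c in vocab
    }
    matrix = [[bag[c] * idf[c] for c in vocab] for bag in bags]
    return matrix, vocab


def cosine_distance(u, v):
    """Return 1 - cosine similarity; all-zero vectors are treated as maximally distant."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def agglomerative(matrix, k):
    """Average-linkage agglomerative clustering, merging until k clusters remain."""
    clusters = [[i] for i in range(len(matrix))]
    while len(clusters) > max(k, 1):
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the members of two clusters.
                total = sum(
                    cosine_distance(matrix[i], matrix[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


if __name__ == "__main__":
    docs = [
        "the car engine failed",
        "el motor del coche",
        "the team scored a goal",
        "el equipo marcó un gol",
    ]
    matrix, vocab = weight_matrix(docs)
    print(vocab)
    print(agglomerative(matrix, 2))
```

On this toy input, the English and Spanish documents about cars share the same concept vector, so they fall into one cluster and the two football documents into the other. The point of the sketch is that concept mapping both reduces the number of columns in the weight matrix and lets documents in different languages be compared directly.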