“…Those methods and techniques are both numerous and diverse, but there are some common characteristics. For instance, they typically involve various steps, which, together, form treatment chains [7], [15]: textual data are preprocessed (cleaning, lemmatisation, etc.) and transformed into suitable representations (e.g.…”
Section: Concept Detection (mentioning)
confidence: 99%
“…In other disciplines of social science and humanities, the necessity of grounding analysis in corpora has led researchers to harness text mining and natural language processing to improve their interpretations of textual data. Philosophy, however, has remained untouched by those developments, save for a few projects [3], [15].…”
Abstract. When performing a conceptual analysis of a concept, philosophers are interested in all forms of expression of that concept in a text, be it direct or indirect, explicit or implicit. In this paper, we experiment with topic-based methods of automating the detection of concept expressions in order to facilitate philosophical conceptual analysis. We propose six methods based on LDA, and evaluate them on a new corpus of court decisions that we had annotated by experts and non-experts. Our results indicate that these methods can yield important improvements over the keyword heuristic, which is often used for concept detection in many contexts. While more work remains to be done, this indicates that detecting concepts through topics can serve as a general-purpose method for at least some forms of concept expression that are not captured using naive keyword approaches.
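The contrast the abstract draws between the keyword heuristic and topic-based detection can be illustrated with a minimal Python sketch. It is not one of the six LDA-based methods the paper proposes, which are not described here. The topic-word weights, the `min_weight` threshold and the function names are all hypothetical, and the topics are written by hand rather than learned by LDA.

```python
import re

# Toy illustration only: these topics are hand-written stand-ins for the
# word distributions an LDA model would learn from a real corpus.
TOPICS = {
    "topic_0": {"privacy": 0.30, "surveillance": 0.25, "disclosure": 0.20, "consent": 0.15},
    "topic_1": {"contract": 0.35, "breach": 0.30, "damages": 0.20, "consent": 0.05},
}


def tokenize(text):
    """Lowercase and keep alphabetic runs (a stand-in for real preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_detect(segment, keyword):
    """Keyword heuristic: the concept counts as present only if the keyword occurs."""
    return keyword in tokenize(segment)


def topic_detect(segment, keyword, topics=TOPICS, min_weight=0.1):
    """Flag a segment that shares a prominent word with any topic
    in which the keyword is itself prominent (hypothetical rule)."""
    tokens = set(tokenize(segment))
    for word_weights in topics.values():
        if word_weights.get(keyword, 0.0) < min_weight:
            continue  # the keyword is not prominent in this topic
        prominent = {w for w, p in word_weights.items() if p >= min_weight}
        if tokens & prominent:
            return True
    return False
```

Under these assumptions, a segment such as "The officers kept the suspect under surveillance for months." is missed by `keyword_detect(segment, "privacy")` but flagged by `topic_detect(segment, "privacy")`, which is the kind of indirect expression the abstract says naive keyword approaches fail to capture.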
“…Nevertheless, while the phenomenon has remained marginal, philosophers have been interrogating texts with computers since the 1970s, be it through Meunier et al.'s System for Text and Content Analysis (1976), or McKinnon's statistical profile of Kierkegaard's works (1970). Traditionally, text analysis was conceived mostly as a way to assist reading and interpretation, be it by developing algorithms to discover patterns that close reading would miss (Danis, 2012; Forest & Meunier, 2000; Meunier, Forest, and Biskri, 2005; Sainte‐Marie et al., 2011) or by using computational resources to exploit massive corpora (Chartier et al., 2008; Malaterre & Chartier, 2019). In such studies, researchers might train and apply a topic model and analyze how topics evolve across the years (e.g.…”
Research in experimental philosophy has increasingly been turning to corpus methods to produce evidence for empirical claims, as they open up new possibilities for testing linguistic claims or studying concepts across time and culture. The present article reviews the quasi‐experimental studies that have been done using textual data from corpora in philosophy, with an eye to the modeling and experimental design that enable statistical inference. I find that most studies forgo comparisons that could control for confounds, and that only a little less than half employ statistical testing methods to control for chance results. Furthermore, at least some researchers make modeling decisions that either do not take into account the nature of corpora and of the word‐concept relationship, or undermine the experiment's capacity to answer research questions. I suggest that corpus methods could both provide more powerful evidence and gain more mainstream acceptance by improving their modeling practices.
“…Other voices have suggested that concepts could fruitfully be studied in textual corpora (Meunier, Biskri, and Forest 2005; Bluhm 2013; Andow 2016; Chartrand 2017). They argue that methods based on the distributional hypothesis, and that hail from subfields such as natural language processing, text mining and corpus linguistics, could shed light on at least some of the concepts that are objects of philosophical scrutiny.…”
In the last decades, philosophers have begun using empirical data for conceptual analysis, but corpus-based conceptual analysis has so far failed to develop, in part because of the absence of reliable methods to automatically detect concepts in textual data. Previous attempts have shown that topic models can constitute efficient concept detection heuristics, but while they leverage the syntagmatic relations in a corpus, they fail to exploit paradigmatic relations, and thus probably fail to model concepts accurately. In this article, we show that using a topic model that models concepts on a space of word embeddings (Hu and Tsujii, 2016) can lead to significant increases in concept detection performance, as well as enable the target concept to be expressed in more flexible ways using word vectors.
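The idea of expressing a target concept "using word vectors" can be sketched in a few lines of Python. This is not the Hu and Tsujii (2016) topic model, and the abstract does not say how the authors build their concept vectors. The three-dimensional vectors are made up for illustration, and averaging word vectors is only one common, assumed way of combining them.

```python
import math

# Made-up 3-dimensional vectors for illustration only; real word
# embeddings are learned from a corpus and have many more dimensions.
VECTORS = {
    "privacy": [0.9, 0.1, 0.0],
    "secrecy": [0.8, 0.2, 0.1],
    "confidentiality": [0.85, 0.15, 0.05],
    "contract": [0.1, 0.9, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def concept_vector(words, vectors=VECTORS):
    """Express a target concept as the average of several word vectors."""
    rows = [vectors[w] for w in words]
    return [sum(column) / len(rows) for column in zip(*rows)]


def rank_by_similarity(target, vectors=VECTORS):
    """Order the vocabulary from most to least similar to the target vector."""
    return sorted(vectors, key=lambda w: cosine(target, vectors[w]), reverse=True)
```

With these toy values, a concept built from "privacy" and "secrecy" ranks "confidentiality" close to the target and "contract" far from it, even though neither word was used to define the concept. This illustrates the paradigmatic (substitutable-word) relations that the abstract says purely topic-based heuristics fail to exploit.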