Sociology has been described as a ‘third culture’ between science and literature. The distinctions between different orientations in sociological writing have been studied primarily through their non-textual manifestations (publication genres or venues, methodologies used, scientometric indicators, etc.). Our knowledge of how the science–literature boundary relates to the rhetorical composition of sociological texts therefore remains limited. To examine the relationship between sociology and literature directly, we mixed a bespoke corpus of Czech sociological articles with a corpus of Czech short fiction. Unsupervised classification based on the distribution of the most frequent verbs yielded two clusters of sociological articles. Cluster membership was significantly associated with non-textual variables: articles less similar to literature were associated with higher rates of co-authorship, higher citation counts, and more women as first authors. The two clusters also displayed clear semantic differences. The signal from literary works increased variance in the textual feature space, and subsequent pseudo-experimental validation confirmed that it was indispensable for discovering the association between the rhetorical pattern of verb usage and non-textual variables related to sociological articles.
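The general idea of clustering texts by the distribution of their most frequent verbs can be illustrated with a minimal sketch. It is not the procedure used in the study: the abstract does not name the clustering algorithm, so a simple deterministic 2-means is assumed here, and the documents, the verb lemmas, and the helper names `verb_profiles` and `two_means` are all hypothetical. Each toy document is assumed to be already reduced to a list of verb lemmas.

```python
from collections import Counter
from math import dist


def verb_profiles(docs, top_n):
    """Relative frequency of the corpus-wide top_n verbs in each document."""
    totals = Counter(verb for doc in docs for verb in doc)
    vocab = [verb for verb, _ in totals.most_common(top_n)]
    profiles = []
    for doc in docs:
        counts = Counter(doc)
        n = len(doc) or 1  # avoid division by zero for empty documents
        profiles.append([counts[verb] / n for verb in vocab])
    return vocab, profiles


def two_means(points, iterations=20):
    """Assumed stand-in clustering: 2-means seeded with the two most distant points."""
    pairs = [
        (dist(p, q), i, j)
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i < j
    ]
    _, a, b = max(pairs)
    centers = [points[a], points[b]]
    labels = []
    for _ in range(iterations):
        # Assign each point to the nearer of the two centers.
        labels = [
            0 if dist(p, centers[0]) <= dist(p, centers[1]) else 1
            for p in points
        ]
        # Move each center to the mean of its current members.
        for k in (0, 1):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy corpus: verb lemmas only, invented for illustration.
docs = [
    ["analyze", "show", "measure", "show", "analyze"],  # article-like
    ["show", "measure", "analyze", "measure"],          # article-like
    ["say", "go", "see", "say", "go"],                  # fiction-like
    ["see", "say", "go", "say"],                        # fiction-like
    ["say", "show", "see", "go", "say"],                # article with narrative verbs
]

vocab, profiles = verb_profiles(docs, top_n=6)
labels = two_means(profiles)
print(dict(zip(range(len(docs)), labels)))
```

In this toy setting the fiction-like documents widen the spread of the feature space, so the article with narrative verbs ends up grouped with them rather than with the other two articles. This mirrors, in a very simplified way, the role the abstract attributes to the literary corpus.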