We present the SFU Opinion and Comments Corpus (SOCC), a collection of opinion articles and the comments posted in response to those articles. The articles comprise all the opinion pieces published in the Canadian newspaper The Globe and Mail in the 5-year period between 2012 and 2016, a total of 10,339 articles and 663,173 comments. SOCC is part of a project that investigates the linguistic characteristics of online comments. The corpus can be used to study a host of pragmatic phenomena. Among other aspects, researchers can explore: the connections between articles and comments; the connections of comments to each other; the types of topics discussed in comments; the nice (constructive) or mean (toxic) ways in which commenters respond to each other; how language is used to convey very specific types of evaluation; and how negation affects the interpretation of evaluative meaning in discourse. Our current focus is the study of constructiveness and evaluation in the comments. To that end, we have annotated a subset of the large corpus (1,043 comments) with four layers of annotations: constructiveness, toxicity, negation and Appraisal (Martin and White, The language of evaluation, Palgrave, New York, 2005). This paper details the corpus: the data collection process, the characteristics of the data, and the annotations. While our focus is comments posted in response to opinion news articles, the phenomena in this corpus are likely to be present in many commenting platforms: other news comments, comments and replies in fora such as Reddit, feedback on blogs, or YouTube comments.
We present detailed analyses of the distribution of Appraisal categories (Martin & White, 2005) in a corpus of online news comments. The corpus consists of just over one thousand comments posted in response to a variety of opinion pieces on the website of the Canadian English-language newspaper The Globe and Mail. We annotated all the comments with labels corresponding to different categories of the Appraisal framework. Analyses of the annotations show that comments are overwhelmingly negative and that they favour two of the three subtypes of Attitude: Judgement and Appreciation. The paper contributes a methodology for annotating Appraisal and examines the interaction of Appraisal with negation, the constructive nature of comments, and the level of toxicity found in them. The results show that highly opinionated language is expressed as opinion (Judgement and Appreciation) rather than as an emotional reaction (Affect). This finding, together with the interplay of evaluative language with constructiveness and toxicity in the comments, can be applied to the automatic moderation of online comments.
Previous perceptual studies of English stop voicing focus on Voice-Onset Time (VOT). Aspiration is generally subsumed into VOT, yet [1] complicates this picture, evincing a trading relation between the intensity of aspiration noise and VOT. Our study is the first to examine the roles of VOT and aspiration in English listeners' perception of non-native plain and murmured stops. We recorded Marathi talkers producing /CVsV/ nonce words beginning with /t/, /tʰ/, /d/, /dʰ/ (e.g. /dʰaːsaː/), or their velar counterparts. The following acoustic measures were taken for each token:

- duration of prevoicing;
- After Closure Time (ACT) [2], i.e. the interval between release and periodicity;
- Pre-Vocalic Interval (PVI) [3], which includes ACT and the murmured portion of the vowel;
- intensity of aspiration noise.

Canadian English listeners rated the stops on a 6-point scale from voiced to voiceless. Only results for murmured stops are discussed here. Despite the variability of prevoicing duration in these tokens (range = 200 ms, SD = 43 ms), this factor did not correlate significantly with a token's average rating (p > 0.60). However, ACT (r = 0.53, p < 0.001), PVI (r = 0.39, p < 0.001), and the mean intensity of aspiration noise (r = 0.58, p < 0.02) did. Thus aspiration, not prevoicing, best accounts for perceptual differences between murmured stops. [1] Repp, "Relative amplitude of aspiration noise…," Lang. Speech 22, 1979; [2] Mikuteit & Reetz, "Caught in the ACT…," Lang. Speech 50, 2007; [3] Berkson, "Capturing breathy voice…," Kans. Work. Pap. Ling. 33, 2012.
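The analysis just described correlates each acoustic measure with a token's mean listener rating (Pearson's r). A minimal, stdlib-only sketch of that computation is below; the data values are invented for illustration, since the study's Marathi tokens and ratings are not reproduced here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical ACT values (ms) and mean voicing ratings
# (1 = voiced ... 6 = voiceless), purely for illustration.
act_ms = [12, 18, 25, 33, 41, 55]
ratings = [1.8, 2.1, 2.9, 3.5, 4.2, 5.0]
print(f"r = {pearson_r(act_ms, ratings):.2f}")
```

In the study itself, significance testing (the reported p-values) would accompany each coefficient; a dedicated statistics library handles that as well.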