Proceedings of the 3rd International Workshop on Emotion Awareness in Software Engineering 2018
DOI: 10.1145/3194932.3194938
Sentiment and politeness analysis tools on developer discussions are unreliable, but so are people

Cited by 24 publications (16 citation statements)
References 34 publications
“…Therefore, a possible explanation for the low agreement observed is that the benchmarked tools have been originally validated and tuned on gold standards that include manual annotation following different guidelines. As already pointed out by previous research, sentiment annotation is a subjective task; thus even humans might disagree with each other (Imtiaz et al. 2018) if model-driven annotation is not adopted (Novielli et al. 2018b). Moreover, Islam and Zibran (2018a) showed how tools exhibit their best performance on the dataset on which they were originally tested at the time of their release, whereas a drop in performance is observed when they are assessed on a different dataset.…”
Section: Sentiment Analysis Tools Should Be Retrained If Possible Rather Than Used Off the Shelf
Citation type: mentioning (confidence: 97%)
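The disagreement among human annotators mentioned above is usually quantified with an inter-annotator agreement measure. The following is a minimal sketch, not taken from the cited papers, that computes Cohen's kappa between two hypothetical raters' sentiment labels using scikit-learn.

```python
# Minimal sketch: quantifying inter-annotator agreement on sentiment labels.
# The labels below are hypothetical; in practice they would come from two
# human raters annotating the same developer comments.
from sklearn.metrics import cohen_kappa_score

rater_a = ["pos", "neg", "neu", "neg", "neu", "pos", "neg", "neu"]
rater_b = ["pos", "neu", "neu", "neg", "pos", "pos", "neg", "neg"]

kappa = cohen_kappa_score(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.2f}")  # values well below 1.0 signal substantial disagreement
```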
“…Of the n = 80 papers, 26 mentioned the problem of the lack or scarcity of adaptations of existing sentiment analysis tools to the domain of SE (e.g. [3,8,10,15,27]). However, other problems like sarcasm/irony handling (11) or the subjectivity of manual labeling of data (10) were also mentioned.…”
Section: Difficulties
Citation type: mentioning (confidence: 99%)
“…The authors often stated that existing, domain-independent tools lead to poor results in the SE domain (e.g. [8,27,42]). This is because certain terms are used differently in the SE domain than in a non-technical context, resulting in different sentiments.…”
Section: RQ
Citation type: mentioning (confidence: 99%)
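The domain mismatch described in this excerpt can be illustrated by running a general-purpose, lexicon-based analyzer on developer text. Below is a minimal sketch, not from the cited papers, using the vaderSentiment package on hypothetical SE sentences.

```python
# Minimal sketch: a domain-independent sentiment analyzer applied to SE text.
# Requires the `vaderSentiment` package; the example sentences are hypothetical.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

se_sentences = [
    "Just kill the old process and restart the daemon.",  # neutral instruction in SE
    "The patch resolves the fatal error in the parser.",  # positive outcome in SE
]

for text in se_sentences:
    scores = analyzer.polarity_scores(text)
    # Lexicon entries such as "kill" and "fatal" tend to pull the compound
    # score negative, even though the technical meaning is neutral or positive.
    print(scores["compound"], text)
```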
“…[24,25,27,30]. With regard to sentiment analysis for the software engineering domain, [8,9,11] mainly focus on deriving developers' opinions/emotions along with the context. It is observed that existing tools make dataset-driven predictions, where the predictions conflict with one another [21].…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
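The conflicting predictions noted in this last excerpt are easy to observe by comparing two off-the-shelf tools on the same text. The following sketch is illustrative only: it contrasts TextBlob and VADER polarity scores on hypothetical developer comments and flags where their signs disagree.

```python
# Minimal sketch: surfacing conflicting predictions between two general-purpose
# sentiment tools. Requires `textblob` and `vaderSentiment`; comments are hypothetical.
from textblob import TextBlob
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

vader = SentimentIntensityAnalyzer()

comments = [
    "This workaround is ugly but it does the job.",
    "Nice catch, the null check was missing.",
]

for text in comments:
    tb_polarity = TextBlob(text).sentiment.polarity            # range [-1, 1]
    vader_compound = vader.polarity_scores(text)["compound"]   # range [-1, 1]
    agree = (tb_polarity >= 0) == (vader_compound >= 0)
    print(f"TextBlob={tb_polarity:+.2f}  VADER={vader_compound:+.2f}  agree={agree}  | {text}")
```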