2015 IEEE/ACM 12th Working Conference on Mining Software Repositories 2015
DOI: 10.1109/msr.2015.54

Mining StackOverflow to Filter Out Off-Topic IRC Discussion

Cited by 34 publications
(15 citation statements)
References 7 publications
“…Moreover, we find that the query redundancy of TUV is relatively high. This suggests that it is not appropriate to extract the software repositories via utilizing their social features of “upvotes” or “pageviews” in recent work. In addition, MR has medium query redundancy on 20 000 samples, which may be the changing trend of the curve is not apparent.…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…For example, Ponzanelli et al created the models for retrieving pertinent discussions from StackOverflow to turn the integrated development environment into a self‐confident programming prompter. Chowdhury and Hindle trained a classifier to filter out off‐topic messages of StackOverflow for Internet Relay Chat discussion. Chen et al proposed a programming language independent method for code pattern recognition based on code patterns extracted from Stack Overflow.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…Honsel et al [55] evaluated common myths about SO posts to discover whether the perception of software developers is accurate. Chowdhury and Hindle [56] proposed an approach to classify off-topic SO posts in programming-related internet relay chat channels. Li et al [57] empirically investigated to identify the diverse needs and problems software developers face.…”
Section: Martinez and Lecomte (citation type: mentioning)
confidence: 99%
“…Classifying chat messages into two classes, as in our experiment, is known as binary classification. We compared the performance of two learning algorithms, Naive Bayes Multinomial and Support Vector Machine (SVM) due to their popularity and good performance for text classification [11], [18], [39]. a) Preprocessing: Before training the classifiers, we preprocessed the message text by converting it into tokens and lowercase.…”
Section: A Binary Classification (citation type: mentioning)
confidence: 99%
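
The statement above describes preprocessing (tokenizing and lowercasing) followed by binary classification with Naive Bayes Multinomial. The following is a minimal sketch of that idea in plain Python, not the implementation used by the cited or citing authors. The toy messages, the `on-topic`/`off-topic` label names, the `tokenize` helper, and the small `MultinomialNB` class are all assumptions made for illustration; the SVM comparison mentioned in the quote is omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a chat message and split it into word tokens."""
    return re.findall(r"[a-z0-9_+#]+", text.lower())


class MultinomialNB:
    """Tiny multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, messages, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)  # messages per class (for priors)
        self.token_counts = {c: Counter() for c in self.classes}
        for text, label in zip(messages, labels):
            self.token_counts[label].update(tokenize(text))
        # Vocabulary is every token seen in any training message.
        self.vocab = set()
        for counts in self.token_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no evidence and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[c] / total_docs)
            denom = sum(self.token_counts[c].values()) + self.alpha * len(self.vocab)
            for t in tokens:
                score += math.log((self.token_counts[c][t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Hypothetical toy training data; real work would use mined corpora.
train_messages = [
    "How do I catch a NullPointerException in Java?",
    "Use a list comprehension instead of that for loop",
    "The compiler throws a syntax error on this function",
    "lol that video was so funny",
    "anyone watching the game tonight",
    "I love this song so much",
]
train_labels = ["on-topic"] * 3 + ["off-topic"] * 3

model = MultinomialNB().fit(train_messages, train_labels)
print(model.predict("why does this function throw an error"))
print(model.predict("that song was funny"))
```

With only six training messages the predictions are illustrative at best; the sketch is meant to show the shape of the pipeline (preprocess, count tokens per class, compare smoothed log probabilities), not to suggest realistic accuracy.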
“…In their work, they mined IRC meeting logs to investigate the meeting content, meeting participants, their contribution and communication styles. Chowdhury and Hindle [11] implemented machine learning approaches to filter out off-topic discussions in programming IRC channels by exploiting StackOverflow programming discussions and YouTube video comments. Yu et al [45] investigated the use of synchronous (IRC) and asynchronous (mailing list) communication mechanisms in global software development projects.…”
Section: B Mining Developers' Communication Artifacts (citation type: mentioning)
confidence: 99%