Finding better active learners for faster literature reviews

Yu, Zhe; Kraft, Nicholas A.; Menzies, Tim

doi:10.1007/s10664-017-9587-0

Cited by 65 publications

(105 citation statements)

References 53 publications


“…All those aforementioned tactics are built into EMBLEM [114], [116], the active learner used for this work. When reading commits, EMBLEM initially uses uncertainty sampling to fast build a classification model (for bug-fixing or non bug-fixing commit message), then switches to certainty sampling to greedily find bug-fixing commits.…”

Section: Framework (mentioning)

confidence: 99%
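
The two-phase query strategy described in the statement above can be sketched as follows. This is a minimal illustration and not EMBLEM's actual code: the function name `pick_next`, the parameter `switch_at`, and the rule of switching once a given number of relevant items has been found are all assumptions made for this example.

```python
from typing import Sequence, Set


def pick_next(probs: Sequence[float], labeled: Set[int],
              n_relevant_found: int, switch_at: int) -> int:
    """Return the index of the next unlabeled item to show a human.

    probs[i] is the current model's predicted probability that item i
    is relevant (e.g. a bug-fixing commit). `switch_at` is a hypothetical
    threshold, not a value taken from the cited papers.
    """
    candidates = [i for i in range(len(probs)) if i not in labeled]
    if not candidates:
        raise ValueError("no unlabeled items left")
    if n_relevant_found < switch_at:
        # Uncertainty sampling: query the item the model is least sure
        # about, i.e. the one with probability closest to 0.5.
        return min(candidates, key=lambda i: abs(probs[i] - 0.5))
    # Certainty sampling: greedily query the item most likely to be relevant.
    return max(candidates, key=lambda i: probs[i])


probs = [0.1, 0.48, 0.9, 0.6]
early = pick_next(probs, labeled={1}, n_relevant_found=0, switch_at=5)
late = pick_next(probs, labeled={1}, n_relevant_found=5, switch_at=5)
```

In the early call the model is still being built, so the most ambiguous unlabeled item is chosen; in the late call the highest-scoring unlabeled item is chosen.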

“…values {N1 = 4000, N2 = 1, N3 = 30, N4 = 95%}). Those decisions where made by Yu et al [114] after exploring 32 different kinds of active learners. They report that, using the above requirements, EMBLEM found more relevant items faster than the previously reported state-of-the-art in incremental text mining retrieval [22], [102].…”

Section: Framework (mentioning)

confidence: 99%

“…Our pre-experimental belief was that EMBLEM would require extensive tuning before it could be used for labelling Github commits. However, the effectiveness of EMBLEM was obtained using Yu et al's original decisions [114], [116] without extensive tuning. Future improvements can be achieved by tuning different settings.…”

Section: Framework (mentioning)

confidence: 99%

“…Also, we could explore other control parameters for EMBLEM. All the above results were obtained using Yu et al's [114], [116] original requirements (e.g. the values of {N1 = 4000, N2 = 1, N3 = 30, N4 = 95%}) within the EMBLEM method.…”

Section: Future Work (mentioning)

confidence: 99%


Better Data Labelling With EMBLEM (and how that Impacts Defect Prediction)

2022

IEEE Trans. Software Eng.

Self Cite


Standard automatic methods for recognizing problematic development commits can be greatly improved via the incremental application of human+artificial expertise. In this approach, called EMBLEM, an AI tool first explores the software development process to label commits that are most problematic. Humans then apply their expertise to check those labels (perhaps resulting in the AI updating the support vectors within their SVM learner). We recommend this human+AI partnership for several reasons. When a new domain is encountered, EMBLEM can learn better ways to label which comments refer to real problems. Also, in studies with 9 open source software projects, labelling via EMBLEM's incremental application of human+AI is at least an order of magnitude cheaper than existing methods (≈ eight times). Further, EMBLEM is very effective. For the data sets explored here, EMBLEM's better labelling methods significantly improved Popt20 and G-score performance in nearly all the projects studied.

Table 1: This paper argues against using keywords like these as a method for labelling a commit as "buggy".


“…Given the information available in automated UI testing, we extract three types of features: • Text feature: the same text mining feature extraction used in the total recall approaches [56,57] Using the foregoing types of features, the proposed framework is described in Algorithm 1 with engineering choices of N1, N2. N1 is the batch size of the process.…”

Section: Terminator (mentioning)

confidence: 99%

TERMINATOR: better automated UI test case prioritization

Fahid, Rothermel, et al.

2019

Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of

Self Cite


Automated UI testing is an important component of the continuous integration process of software development. A modern web-based UI is an amalgam of reports from dozens of microservices written by multiple teams. Queries on a page that opens up another will fail if any of that page's microservices fails. As a result, the overall cost for automated UI testing is high since the UI elements cannot be tested in isolation. For example, the entire automated UI testing suite at LexisNexis takes around 30 hours (3-5 hours on the cloud) to execute, which slows down the continuous integration process. To mitigate this problem and give developers faster feedback on their code, test case prioritization techniques are used to reorder the automated UI test cases so that more failures can be detected earlier. Given that much of the automated UI testing is "black box" in nature, very little information (only the test case descriptions and testing results) can be utilized to prioritize these automated UI test cases. Hence, this paper evaluates 17 "black box" test case prioritization approaches that do not rely on source code information. Among these, we propose a novel TCP approach, that dynamically re-prioritizes the test cases when new failures are detected, by applying and adapting a state of the art framework from the total recall problem. Experimental results on LexisNexis automated UI testing data show that our new approach (which we call TERMINATOR), outperformed prior state of the art approaches in terms of failure detection rates with negligible CPU overhead.

CCS CONCEPTS: • Software and its engineering → Software testing and debugging; • Information systems → Learning to rank.

A Semi-automatic Document Screening System for Computer Science Systematic Reviews

Hannousse, Yahiouche

2022

Pattern Recognition and Artificial Intelligence
