Who Judges the Judge: An Empirical Study on Online Judge Tests

Liu, K.; Yan, Han; Zhang, Jie M.; Chen, Zhenpeng; Sarro, Federica; Harman, Mark; Huang, Gang; Ma, Yun

doi:10.1145/3597926.3598060

Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis 2023

Abstract: Online Judge platforms play a pivotal role in education, competitive programming, recruitment, career training, and large language model training. They rely on predefined test suites to judge the correctness of submitted solutions. It is therefore important that the solution judgement is reliable and free from potentially misleading false positives (i.e., incorrect solutions that are judged as correct). In this paper, we conduct an empirical study of 939 coding problems with 541,552 solutions, all of which are…

Cited by 2 publications

References 57 publications

Large Language Models for Software Engineering: Survey and Open Problems

Fan, Gokkaya, Harman et al. 2023

2023 IEEE/ACM International Conference on Software Engineering: Future of Software Engineering (ICSE-FoSE)

TrickyBugs: A Dataset of Corner-case Bugs in Plausible Programs

Liu, Han, Liu et al. 2024

Proceedings of the 21st International Conference on Mining Software Repositories
