Proceedings of the 14th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) 2020
DOI: 10.1145/3382494.3410680

A large-scale comparative analysis of Coding Standard conformance in Open-Source Data Science projects

Abstract: Background: Meeting the growing industry demand for Data Science requires cross-disciplinary teams that can translate machine learning research into production-ready code. Software engineering teams value adherence to coding standards as an indication of code readability, maintainability, and developer expertise. However, there are no large-scale empirical studies of coding standards focused specifically on Data Science projects. Aims: This study investigates the extent to which Data Science projects follow co…

Cited by 19 publications (25 citation statements)
References 23 publications
“…This step is also performed by our analysis tool and concerns running the static code analysis tool Pylint (version 2.6.0) in its default configuration on all pure Python files in each project (but not on any of the dependencies). We choose Pylint for static code analysis as it is widely used and widely accepted in the Python community, as well as being highly configurable [6,10]. It is also well integrated into IDEs such as PyCharm and VS Code.…”
Section: Static Analysis (mentioning)
confidence: 99%
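
The step described in the statement above, running Pylint over a project's own Python files while leaving dependencies out, could be sketched roughly as follows. This is an illustrative sketch and not the cited authors' analysis tool: the directory names in `EXCLUDED_DIRS` and the helper names `find_python_files`, `build_pylint_command`, and `run_pylint` are assumptions made for this example, and the sketch does not pin Pylint to version 2.6.0. It passes no rcfile or check overrides, but Pylint may still pick up a configuration file from the working directory if one exists.

```python
import subprocess
import sys
from pathlib import Path

# Assumed names of directories that hold dependencies; the quoted study
# does not say how dependencies were identified.
EXCLUDED_DIRS = {"venv", ".venv", "site-packages", "node_modules"}


def find_python_files(project_root):
    """Return all .py files under project_root, skipping assumed dependency dirs."""
    root = Path(project_root)
    return sorted(
        path
        for path in root.rglob("*.py")
        if not EXCLUDED_DIRS.intersection(path.relative_to(root).parts)
    )


def build_pylint_command(files):
    """Build a Pylint call that enables or disables no checks explicitly.

    --output-format=json only changes how messages are printed,
    not which checks are run.
    """
    return [sys.executable, "-m", "pylint", "--output-format=json",
            *(str(f) for f in files)]


def run_pylint(project_root):
    """Run Pylint on the project's own Python files and return its raw output."""
    files = find_python_files(project_root)
    if not files:
        return ""
    # Pylint exits with a non-zero status whenever it reports messages,
    # so a non-zero return code is not treated as a failure here.
    result = subprocess.run(
        build_pylint_command(files),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling `run_pylint("path/to/project")` requires Pylint to be installed in the active Python environment; the returned string can then be parsed with `json.loads` to count messages per project.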
“…Our study differs from [6] in that we do not compare against non-DS projects and in that we do not solely focus on the adherence to coding standards as [6] does. Our primary focus lies more on investigating obstructions to the maintainability and reproducibility of ML projects, which includes coding standards violations, but also entails recognising refactoring opportunities and other code smells [7].…”
Section: Introduction (mentioning)
confidence: 99%