Our current knowledge of scholarly plagiarism is largely based on the similarity between full-text research articles. In this paper, we propose a novel conceptualization of scholarly plagiarism in the form of reuse of explicit citation sentences in scientific research articles. Note that while full-text plagiarism is an indicator of a gross-level behavior, copying of citation sentences is a more nuanced micro-scale phenomenon observed even for well-known researchers. The current work poses several interesting questions and attempts to answer them by empirically investigating a large bibliographic text dataset from computer science containing millions of lines of citation sentences. In particular, we report evidence of massive copying behavior. We also present several striking real examples throughout the paper to showcase widespread adoption of this undesirable practice. In contrast to the popular perception, we find that copying tendency increases as an author matures. The copying behavior is found to exist in all fields of computer science; however, the theoretical fields exhibit more copying than the applied fields.
CCS CONCEPTS
• Information systems → Near-duplicate and plagiarism detection;
KEYWORDS
Citation context, plagiarism, text reuse
ACM Reference format:
Mayank Singh, Abhishek Niranjan, Divyansh Gupta, Nikhil Angad Bakshi, Animesh Mukherjee, and Pawan Goyal. 2017. Citation sentence reuse behavior of scientists: A case study on massive bibliographic text dataset of computer science.