2017
DOI: 10.1007/s00799-017-0211-0

Reuse and plagiarism in Speech and Natural Language Processing publications

Abstract: The goal of this paper is to propose measures of innovation through the study of publications in the field of speech and language processing. It is based on the NLP4NLP corpus, which contains the articles published in major conferences and journals related to speech and language processing over 50 years. It represents 65,003 documents from 34 different sources, conferences and journals, published by 48,894 different authors in 558 events, for a total of more than 270 million words and 324,422 bibliographical …

Citations: cited by 8 publications (12 citation statements)
References: 17 publications
“…Here we study the reuse of NLP4NLP papers in other NLP4NLP papers (Mariani et al., 2016, 2017a).…”
Section: Text Reuse and Plagiarism
Citation type: mentioning
confidence: 99%
“…We then studied when and who introduced new terms, as a mark of the innovative ability of various authors, which may also provide an estimate of their contribution to the advances of the scientific domain (Mariani et al, 2018a). We make the hypothesis that an innovation is induced by the introduction of a term which was previously unused in the community and then became popular.…”
Section: Innovation New Terms Introduced By the Authors
Citation type: mentioning
confidence: 99%
“…This special issue resulting from the BIRNDL workshop includes 14 papers: four extended papers presented at the first BIRNDL workshop and the BIR workshop at ECIR 2016 [2,8,14,18], three extended system reports of the CL-SciSumm Shared Task 2016 [1,13,16] and one overview paper [11] and six original research papers submitted via the open call for papers [6,7,9,10,12,17]. […] "Reuse and plagiarism in Speech and Natural Language Processing publications" [14] to detect extrinsic instances of self-reuse, self-plagiarism, reuse and plagiarism in NLP and speech processing articles. By comparing word sequences in publications from the NLP4NLP corpus, consisting of more than 65,000 documents, the authors found that self-reuse (i.e., the reuse of text from another document on which there is a common author and that is cited) is relatively common, but reuse of content from others papers that have been cited and plagiarism, where there is no attribution, were quite rare and remained within ethical limits.…”
Section: Special Issue Papers
Citation type: mentioning
confidence: 99%
“…Alzahrani et al, [53] utilized citation evidences along with structural detection for detecting plagiarism cases. Four types of plagiarism, viz., self-reuse, self-plagiarism, reuse and plagiarism is detected using text based detection with citation analysis to detect copy-paste plagiarism in scientific articles of NLP domain by Mariani et al [57]. They used papers from different websites such as ACL Anthology, ISCA archive and IEEE in NLP and speech processing.…”
Section: Citation Based Detection Technique
Citation type: mentioning
confidence: 99%