2020
DOI: 10.48550/arxiv.2008.00774
Preprint

Elsevier OA CC-By Corpus

Abstract: We introduce the Elsevier OA CC-BY corpus, the first open corpus of scientific research papers with a representative sample from across scientific disciplines. The corpus includes not only the full text of each article but also the document metadata and the bibliographic information for each reference.

Citations: cited by 1 publication (1 citation statement)
References: references 9 publications (13 reference statements)
“…Using Dialogizer, we generate four ConvQA datasets for use in experiments. These datasets are developed by leveraging four source-text datasets from diverse domains: Wikipedia, PubMed, CC-News (Hamborg et al, 2017), and Elsevier OA CC-By (Kershaw and Koeling, 2020). Each dataset is named after its corresponding source dataset, namely WikiDialog2, PubmedDialog, CC-newsDialog, and ElsevierDialog.…”
Section: Generated Datasets
Citation type: mentioning
confidence: 99%