Proceedings of the ACM Symposium on Document Engineering 2020
DOI: 10.1145/3395027.3419591

COVID-19 Kaggle Literature Organization

Abstract: The world has faced the devastating outbreak of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), or COVID-19, in 2020. Research in the subject matter was fast-tracked to such a point that scientists were struggling to keep up with new findings. With this increase in the scientific literature, there arose a need for organizing those documents. We describe an approach to organize and visualize the scientific literature on or related to COVID-19 using machine learning techniques so that papers on sim…

Citations: cited by 10 publications (19 citation statements)
References: 22 publications

“…In recent work, multiple groups such as [14] [15] and [16] have evaluated their models over the COVID-19 Open Research Dataset (CORD-19) [9] to help bridge the gap between researchers and the rapid growth of journal publications. In [14], authors test the efficacy of a graph-based clustering model and Bio-BERT [15] word embeddings approach for information retrieval through a Question-Answer bot related to clinical queries.…”
Section: Related Work, A. Automation of Literature Review (mentioning)
confidence: 99%
“…The efforts from the Kaggle COVID-19 Literature Organization [9] to develop the CORD-19 dataset aim to resolve the limitations that scientists face when handling the ever-growing COVID-19 literature. Kaggle has coordinated with prestigious institutions and research companies to facilitate data analytics challenges.…”
Section: Expertise Sharing and Crowdsourcing (mentioning)
confidence: 99%
“…CORD-19 is a collection of over 400,000 scholarly articles about COVID-19 and related diseases. In our previous work on this dataset, we showed that investigation of the CORD-19 corpus can be simplified through clustering and dimensionality reduction using T-SNE, PCA, and k-Means [7]. The Kaggle notebook from our prior research has attracted great interest in the data science community.…”
Section: Introduction (mentioning)
confidence: 98%
“…Most of the pre-processing steps presented in Section 3.1 improve upon the data cleaning methods from our prior work on the same dataset [7]. Then, Section 3.2 describes the details of tensor construction and analysis of the latent factors.…”
Section: Introduction (mentioning)
confidence: 99%