Graphical abstract
COVIDSum (COVID-19 scientific paper Summarization) consists of four major modules: (1) Dataset Preprocessing, (2) Heuristic Sentence Extraction, (3) Word Co-occurrence Graph Construction, and (4) Linguistically Enriched Abstractive Summarization.
The Dataset Preprocessing module retrieves the abstract and body text of each paper and removes papers that lack an abstract or are not written in English.
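The filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout (a dict with hypothetical `abstract`, `body_text`, and `language` keys) is an assumption, and the language is assumed to have been detected beforehand.

```python
def preprocess(papers):
    """Keep the abstract and body text of English papers that have an abstract.

    Assumption: each paper is a dict with hypothetical keys
    'abstract', 'body_text', and 'language' (an ISO 639-1 code).
    """
    kept = []
    for paper in papers:
        abstract = (paper.get("abstract") or "").strip()
        body = (paper.get("body_text") or "").strip()
        if not abstract:
            continue  # drop papers with a missing or empty abstract
        if paper.get("language") != "en":
            continue  # drop papers not written in English
        kept.append({"abstract": abstract, "body_text": body})
    return kept


papers = [
    {"abstract": "We study viral spread.", "body_text": "Full text.", "language": "en"},
    {"abstract": "", "body_text": "No abstract here.", "language": "en"},
    {"abstract": "Resumen del estudio.", "body_text": "Texto.", "language": "es"},
]
clean = preprocess(papers)
```

Only the first record survives, since the second has no abstract and the third is not in English.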
The Heuristic Sentence Extraction module applies three heuristic methods to extract sentences from each paper.
The Word Co-occurrence Graph Construction module extracts word co-occurrence relationships to construct an unweighted directed word co-occurrence graph.
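One way to build an unweighted directed word co-occurrence graph is sketched below. The text does not state the co-occurrence window, the edge direction, or the tokenization, so the choices here are assumptions: an edge runs from a word to each word that follows it within `window` tokens of the same sentence, and tokens are lowercased and split on whitespace. The function name `build_cooccurrence_graph` is hypothetical.

```python
def build_cooccurrence_graph(sentences, window=2):
    """Return a directed, unweighted graph as {word: set of successor words}.

    Assumption: an edge src -> dst exists when dst appears within
    `window` tokens after src in the same sentence.
    """
    graph = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, src in enumerate(tokens):
            graph.setdefault(src, set())  # every word becomes a node
            for dst in tokens[i + 1 : i + 1 + window]:
                if dst != src:  # skip self-loops
                    graph[src].add(dst)
    return graph


graph = build_cooccurrence_graph(
    ["the virus spreads fast", "the virus mutates"], window=1
)
```

Storing successors in sets keeps the graph unweighted: a pair that co-occurs repeatedly still yields a single edge.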
The Linguistically Enriched Abstractive Summarization module proposes a hybrid summarization approach: it uses SciBERT and a GAT-based graph encoder to encode the word sequences and word co-occurrence graphs, respectively; adopts highway networks to fuse the two encodings into context vectors of sentences; and applies a Transformer decoder to generate summaries.
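The fusion step can be illustrated with a highway-style gate. The exact formulation used in COVIDSum is not given here, so this is a sketch under assumptions: a sigmoid gate computed from the concatenated sequence and graph encodings decides, per dimension, how much of the graph encoding to carry versus the sequence encoding. The names `highway_fuse`, `gate_weights`, and `gate_bias` are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def highway_fuse(seq_vec, graph_vec, gate_weights, gate_bias):
    """Fuse two d-dimensional encodings with a highway-style gate.

    t = sigmoid(W [seq; graph] + b)
    fused = t * graph + (1 - t) * seq
    Assumption: gate_weights has d rows of length 2*d; gate_bias has length d.
    """
    concat = seq_vec + graph_vec  # list concatenation, length 2*d
    fused = []
    for i, (s, g) in enumerate(zip(seq_vec, graph_vec)):
        pre = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        t = sigmoid(pre)  # transform gate in (0, 1)
        fused.append(t * g + (1.0 - t) * s)
    return fused


seq_vec = [1.0, 2.0]      # stand-in for a SciBERT sentence encoding
graph_vec = [3.0, 4.0]    # stand-in for a graph-encoder output
zero_w = [[0.0] * 4, [0.0] * 4]
fused = highway_fuse(seq_vec, graph_vec, zero_w, [0.0, 0.0])
```

With zero weights and bias the gate sits at 0.5, so the result is the elementwise mean of the two encodings; a strongly negative bias closes the gate and passes the sequence encoding through almost unchanged.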