Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence 2022
DOI: 10.24963/ijcai.2022/591

Generating a Structured Summary of Numerous Academic Papers: Dataset and Method

Abstract: As time goes by, language evolves with word semantics changing. Unfortunately, traditional word embedding methods neglect the evolution of language and assume that word representations are static. Although contextualized word embedding models can capture the diverse representations of polysemous words, they ignore temporal information as well. To tackle the aforementioned challenges, we propose a graph-based dynamic word embedding (GDWE) model, which focuses on capturing the semantic drift of words continually…

Cited by 7 publications (6 citation statements); References 2 publications

“…Dataset Multi-Xscience is proposed by Lu et al. (2020), which focuses on writing the related work section of a paper based on its abstract, with 4.4 articles cited on average. Dataset BigSurvey-MDS is the first large-scale multi-document scientific summarization dataset that uses review papers' introduction sections as targets (Liu et al., 2022), whereas previous work usually takes the related work section as the target. Both BigSurvey and our HiCat-GLR task have more than 70 references, resulting in over 10,000 words of input, while their output is still on the scale of a standard text paragraph, similar to Multi-Xscience.…”
Section: Dataset Statistics and Analysis
confidence: 99%
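To make the input/output scales described in this excerpt concrete, the sketch below shows one way such a multi-document example could be represented in code. The class and field names, and the word-count helper, are illustrative assumptions, not the official BigSurvey-MDS or Multi-Xscience schema.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MultiDocSummExample:
    """One multi-document summarization example.

    Field names are illustrative only, not the official
    BigSurvey-MDS or Multi-Xscience schema.
    """
    paper_title: str
    reference_abstracts: List[str]  # BigSurvey-style inputs: often 70+ reference abstracts
    target_summary: str             # e.g. a survey's introduction section

def input_word_count(example: MultiDocSummExample) -> int:
    """Rough whitespace-tokenized length of the concatenated input documents."""
    return sum(len(abstract.split()) for abstract in example.reference_abstracts)
```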
“…It allows users to compose a template tree that consists of two types of nodes: dimension nodes and topic nodes. Shuaiqi et al. (2022) train classifiers based on BERT (Devlin et al., 2018) to conduct category-based alignment, where each sentence from the academic papers is annotated with one of five categories: background, objective, method, result, and other. Next, each research topic's sentences are summarized and concatenated together.…”
Section: Related Work
confidence: 99%
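As a rough illustration of the category-based alignment step described in this excerpt, the sketch below groups input sentences by the category predicted by a BERT-style sentence classifier (background, objective, method, result, other). The checkpoint path is a placeholder and the pipeline setup is an assumption; the cited work's actual classifier and training details are not reproduced here.

```python
from collections import defaultdict
from transformers import pipeline

# "path/to/category-classifier" is a placeholder for a BERT model fine-tuned to
# label sentences as background / objective / method / result / other; the
# original paper's checkpoint and training setup are not reproduced here.
classifier = pipeline("text-classification", model="path/to/category-classifier")

def align_by_category(sentences):
    """Group paper sentences by their predicted rhetorical category."""
    grouped = defaultdict(list)
    for sentence in sentences:
        label = classifier(sentence, truncation=True)[0]["label"]
        grouped[label].append(sentence)
    return grouped
```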
“…Besides the widely studied news summarization (Grusky et al., 2018; Fabbri et al., 2019), summarizing long documents has received more attention in recent years. There are datasets collected from different domains, including scientific literature (Cohan et al., 2018; Liu et al., 2022a), government reports (Huang et al., 2021), and books (Kryściński et al., 2021). The Financial Narrative Summarisation shared task in 2020 (El-Haj et al., 2020) delivered an annual report dataset from firms listed on the London Stock Exchange.…”
Section: Automatic Document Summarization
confidence: 99%
“…To model longer input sequences with limited GPU memory, Huang et al. (2021) compare various efficient attention mechanisms for the encoder and propose an encoder-decoder attention named Hepos. Liu et al. (2022a) identify and encode salient content in different aspects from diverse and long inputs via category-based alignment and sparse attention mechanisms. Zhang et al. (2022) divide the summarization process into multiple stages, repeatedly segmenting, summarizing, and concatenating long inputs until they are compressed to a fixed length.…”
Section: Automatic Document Summarization
confidence: 99%
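The multi-stage procedure attributed to Zhang et al. (2022) in this excerpt can be pictured as a segment-summarize-concatenate loop. The sketch below assumes a generic single-document summarizer callable and illustrative word budgets; it is not the authors' implementation.

```python
def multi_stage_summarize(text, summarize, segment_words=1024, target_words=512):
    """Segment, summarize, and concatenate until the text fits a word budget.

    `summarize` is any single-document summarizer callable; the segment and
    target sizes are illustrative, not the settings used by Zhang et al. (2022).
    """
    words = text.split()
    while len(words) > target_words:
        # Split the current text into fixed-size word windows.
        segments = [" ".join(words[i:i + segment_words])
                    for i in range(0, len(words), segment_words)]
        # Summarize each segment independently, then concatenate the summaries.
        text = " ".join(summarize(segment) for segment in segments)
        new_words = text.split()
        if len(new_words) >= len(words):  # guard: stop if no compression happened
            break
        words = new_words
    return text
```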