Proceedings of the 40th Annual Meeting on Association for Computational Linguistics - ACL '02 2001
DOI: 10.3115/1073083.1073159

A noisy-channel model for document compression

Abstract: We present a document compression system that uses a hierarchical noisy-channel model of text production. Our compression system first automatically derives the syntactic structure of each sentence and the overall discourse structure of the text given as input. The system then uses a statistical hierarchical model of text production in order to drop non-important syntactic and discourse constituents so as to generate coherent, grammatical document compressions of arbitrary length. The system outperforms both a…

Cited by 47 publications (31 citation statements)
References 12 publications
“…Teufel and Moens (1997) and Lin and Hovy (2002) describe representative examples. Recent work in sentence compression (Knight and Marcu 2002; McDonald 2006) and document compression (Daumé III and Marcu 2002) attempts to take small steps beyond sentence extraction. Compression models can be seen as techniques for extracting sentences then dropping extraneous information.…”
Section: Automatic Document Summarization (mentioning)
confidence: 99%
“…This theory guided the annotation of the RST Discourse Treebank (RST-DT) for English, from which several text-level discourse parsers have been induced (Hernault et al, 2010; Joty et al, 2012; Feng and Hirst, 2014; Li et al, 2014; Ji and Eisenstein, 2014). Such parsers have proven to be useful for various downstream applications (Daumé III and Marcu, 2009; Burstein et al, 2003; Higgins et al, 2004; Thione et al, 2004; Sporleder and Lapata, 2005; Taboada and Mann, 2006; Louis et al, 2010; Bhatia et al, 2015).…”
Section: Introduction (mentioning)
confidence: 99%
“…Many studies that have utilized RST have simply adopted EDUs as textual units (Mann and Thompson, 1988; Daumé III and Marcu, 2002; Hirao et al, 2013; Knight and Marcu, 2000). While EDUs are textual units for RST, they are too fine grained as textual units for methods of extractive summarization.…”
Section: Fragmentation Of Information (mentioning)
confidence: 99%
“…None of these methods use the discourse structures of documents. Daumé III and Marcu (2002) proposed a noisy-channel model that used RST. Although their method generated a well-organized summary, no optimality of information coverage was guaranteed and their method could not accept large texts because of the high computational cost.…”
Section: Related Work (mentioning)
confidence: 99%