Vibhu O. Mittal scite author profile

Headline generation based on statistical translation

Extractive summarization techniques cannot generate document summaries shorter than a single sentence, something that is often required. An ideal summarization system would understand each document and generate an appropriate summary directly from the results of that understanding. A more practical approach to this problem results in the use of an approximation: viewing summarization as a problem analogous to statistical machine translation. The issue then becomes one of generating a target document in a more concise language from a source document in a more verbose language. This paper presents results on experiments using this approach, in which statistical models of the term selection and term ordering are jointly applied to produce summaries in a style learned from a training corpus.

Multi-document summarization by sentence extraction

Goldstein

Mittal

Carbonell

et al. 2000

This paper discusses a text extraction approach to multidocument summarization that builds on single-document summarization methods by using additional, available information about the document set as a whole and the relationships between the documents. Multi-document summarization differs from single in that the issues of compression, speed, redundancy and passage selection are critical in the formation of useful summaries. Our approach addresses these issues by using domain-independent techniques based mainly on fast, statistical processing, a metric for reducing redundancy and maximizing diversity in the selected passages, and a modular framework to allow easy parameterization for different genres, corpora characteristics and user requirements.

Creating and evaluating multi-document sentence extract summaries

Goldstein

Mittal

Carbonell

et al. 2000

This paper discusses passage extraction approaches to multidocument summarization that use available information about the document set as a whole and the relationships between the documents to build on single document summarization methodology. Multi-document summarization differs from single in that the issues of compression, speed, redundancy and passage selection are critical in the formation of useful summaries, as well as the user's goals in creating the summary. Our approach addresses these issues by using domain-independent techniques based mainly on fast, statistical processing, a metric for reducing redundancy and maximizing diversity in the selected passages, and a modular framework to allow easy parameterization for different genres, corpora characteristics and user requirements. We examined how humans create multi-document summaries as well as the characteristics of such summaries and use these summaries to evaluate the performance of various multidocument summarization algorithms.

Applying Machine Learning for High‐Performance Named‐Entity Extraction

Baluja

Mittal

Sukthankar

2000

Computational Intelligence

This paper describes a machine learning approach to building an efficient and accurate name spotting system. Finding names in free text is an important task in many text-based applications. Most previous approaches were based on hand-crafted modules encoding language and genre-specific knowledge. These approaches had at least two shortcomings: they required large amounts of time and expertise to develop and were not easily portable to new languages and genres. This paper describes an extensible system that automatically combines weak evidence from different, easily available sources: parts-of-speech tags, dictionaries, and surface-level syntactic information such as capitalization and punctuation. Individually, each piece of evidence is insufficient for robust name detection. However, the combination of evidence, through standard machine learning techniques, yields a system that achieves performance equivalent to the best existing hand-crafted approaches.
