2006
DOI: 10.2172/894745

QCS: a system for querying, clustering and summarizing documents.

Abstract: Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system-the Query, Cluster, Summarize (QCS) system-which is portable, modular, and permits experimentation with different instantiations of each of the constituent text …
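The modular query, cluster, summarize pipeline described in the abstract can be sketched as below. This is a toy illustration of the three-stage architecture only; every function name and the trivial stage implementations are hypothetical, not the actual QCS components.

```python
# Hypothetical sketch of a modular query -> cluster -> summarize pipeline,
# in the spirit of the QCS design. Names are illustrative, not the QCS API,
# and each stage is a deliberately crude stand-in that could be swapped out.

def query(documents, terms):
    """Retrieval stage: return documents containing any query term."""
    return [d for d in documents if any(t in d.lower() for t in terms)]

def cluster(documents):
    """Clustering stage: group retrieved documents by a crude key
    (their first word) as a stand-in for a real clustering component."""
    groups = {}
    for d in documents:
        key = d.split()[0].lower()
        groups.setdefault(key, []).append(d)
    return groups

def summarize(group):
    """Summarization stage: pick the shortest document in a cluster
    as a placeholder 'summary'."""
    return min(group, key=len)

def qcs_pipeline(documents, terms):
    """Chain the three interchangeable stages end to end."""
    retrieved = query(documents, terms)
    return {key: summarize(group) for key, group in cluster(retrieved).items()}
```

Because each stage only consumes the previous stage's output, any one of them can be replaced with a different instantiation without touching the others, which is the portability/modularity property the abstract claims.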


Cited by 12 publications (23 citation statements)
References 4 publications
“…In this section, we compare the summary performance of MOGAs with that of five other methods widely used in automatic document summarization: CRF (Shen et al, 2007), Manifold-Ranking (Wan et al, 2007), NetSum (Svore et al, 2007), QCS (Dunlavy et al, 2007), and SVM (Yeh et al, 2004). Table 1 and Table 2 show the results of all the methods in terms of F-measure, ROUGE-1, and ROUGE-2 metrics on the DUC01 and DUC02 datasets, respectively.…”
Section: Performance and Discussion
confidence: 99%
“…In this section, the performance of our method is compared with other well‐known or recently proposed methods. Comparison of the proposed method was made against the following methods: (a) DPSO‐EDASum (optimization approach based on discrete PSO and EDA; Alguliev, Aliguliyev, & Mehdiyev, ), (b) LexRank (graph‐based approach; Erkan & Radev, ), (c) CollabSum (clustering and graph‐ranking based approach; Wan et al, ), (d) UnifiedRank (graph‐based approach; Wan, ), (e) 0–1 non‐linear (binary optimization based on discrete PSO approach; Alguliev, Aliguliyev, & Isazade, ), (f) QCS (machine learning approach based on hidden Markov model; Dunlavy et al, ), (g) SVM (algebraic approach; Yeh et al, ), (h) FEOM (fuzzy evolutionary approach; Song et al, ), (i) CRF (machine learning approach based on CRF; Shen et al, ), (j) MA‐SingleDocSum (metaheuristic approach based on genetic operators and guided local search; Mendoza et al, ), (k) NetSum (machine learning approach based on neural nets; Svore et al, ), (l) manifold ranking (probabilistic approach using greedy algorithm; Wan et al, ), (m) ESDS‐GHS‐GLO (binary optimization based on the global‐best harmony search heuristic, a greedy local search algorithm; Mendoza et al, ), and (n) DE (clustering and metaheuristic based approach; Aliguliyev, ). These methods have been chosen for comparison because they have achieved the best results on the DUC2001 and DUC2002 data sets.…”
Section: Methods
confidence: 99%
“…Dunlavy et al. [7] rely on a Hidden Markov Model (HMM) to create the summary of a document, which consists of the top-N sentences with the highest probability values of features computed using the HMM. The features used in the HMM include (i) the number of signature terms in a sentence, i.e., terms that are more likely to occur in a given document rather than in the collection to which the document belongs, (ii) the number of subject terms, i.e., signature terms that occur in headline or subject leading sentences, and (iii) the position of the sentence in the document.…”
Section: Comparing the Performance of CorSum(-SF)
confidence: 99%
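The three sentence features this citing paper attributes to the QCS summarizer (signature-term count, subject-term count, sentence position) can be sketched as follows. This is a simplified stand-in under assumed definitions, not the actual HMM-based scorer from Dunlavy et al.; the frequency-ratio threshold for signature terms is an illustrative choice.

```python
# Illustrative feature extraction for HMM-style extractive summarization.
# A "signature term" is approximated here as a term whose in-document
# frequency exceeds its collection frequency by a chosen ratio; this
# threshold and the data structures are assumptions, not the QCS method.

def signature_terms(doc_tokens, doc_freq, collection_freq, threshold=2.0):
    """Terms notably more frequent in this document than in the collection."""
    sig = set()
    for term in set(doc_tokens):
        ratio = doc_freq.get(term, 0) / max(collection_freq.get(term, 1), 1)
        if ratio >= threshold:
            sig.add(term)
    return sig

def sentence_features(sentence_tokens, position, sig_terms, subject_terms):
    """Return the feature triple for one sentence:
    (# signature terms, # subject terms, position in document)."""
    n_sig = sum(1 for t in sentence_tokens if t in sig_terms)
    n_subj = sum(1 for t in sentence_tokens if t in subject_terms)
    return (n_sig, n_subj, position)
```

In the described approach, an HMM would map such feature triples to state probabilities, and the top-N sentences by probability would form the summary.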
“…The features used in the HMM include (i) the number of signature terms in a sentence, i.e., terms that are more likely to occur in a given document rather than in the collection to which the document belongs, (ii) the number of subject terms, i.e., signature terms that occur in headline or subject leading sentences, and (iii) the position of the sentence in the document. Since the HMM tends to select longer sentences to be included in a summary [7], sentences are trimmed by removing lead adverbs and conjunctions, gerund phrases, and restricted relative-clause noun phrases.…”
Section: Comparing the Performance of CorSum(-SF)
confidence: 99%