This paper presents a method to improve the performance of an Information Retrieval System (IRS) by increasing the number of relevant documents retrieved. Several types of uncertainty and fuzziness are associated with an IRS, such as search term uncertainty and relevance uncertainty, and these lead to the retrieval of irrelevant documents. The aim of this paper is to eliminate these types of uncertainty and to increase the chance of retrieving relevant documents. The proposed framework first calculates the similarity between the query and each document cluster, so that it retrieves not only the documents that match the query terms but also documents similar to those retrieved. This helps to reduce search term uncertainty. The framework then tries to reduce the fuzziness associated with document relevance in two steps. First, the general term frequency-inverse document frequency (tf-idf) scoring mechanism is modified to give weight to the informativeness of a document's contents. Second, the overlap between the query and the document summary is calculated. All of this information is used to compute the document relevance score. Finally, the retrieved documents are filtered by the Pearson correlation coefficient between the query vector and the document vector, so that only documents correlated with the query are kept. The experiments used the standard NPL test collection prepared by Vaswani and Cameron at the National Physical Laboratory in England. After full implementation of this methodology, the proposed approach was found to perform better than existing methods.
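The final filtering step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain smoothed tf-idf weights rather than the modified scoring described above, whitespace tokenization, and a hypothetical cut-off `threshold`. The function names `tfidf_vectors`, `pearson`, and `filter_by_correlation` are introduced here for illustration only.

```python
import math
from collections import Counter


def tfidf_vectors(docs, query):
    """Build tf-idf vectors for the documents and the query over a shared vocabulary.

    Assumption: standard smoothed tf-idf, not the paper's modified scoring.
    """
    tokenized = [d.lower().split() for d in docs]
    q_tokens = query.lower().split()
    vocab = sorted({t for toks in tokenized for t in toks} | set(q_tokens))
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed idf, so terms that occur only in the query do not divide by zero.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}

    def vec(tokens):
        tf = Counter(tokens)
        return [tf[t] * idf[t] for t in vocab]

    return [vec(toks) for toks in tokenized], vec(q_tokens)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / (sx * sy)


def filter_by_correlation(docs, query, threshold=0.0):
    """Return (doc_index, r) pairs with r above the hypothetical threshold, best first."""
    doc_vecs, q_vec = tfidf_vectors(docs, query)
    scored = [(pearson(q_vec, dv), i) for i, dv in enumerate(doc_vecs)]
    return [(i, r) for r, i in sorted(scored, reverse=True) if r > threshold]


if __name__ == "__main__":
    docs = [
        "fuzzy retrieval of documents",
        "cooking pasta recipes",
        "document retrieval with fuzzy query",
    ]
    for index, r in filter_by_correlation(docs, "fuzzy retrieval"):
        print(index, round(r, 3))
```

In this toy example the document that shares no terms with the query gets a negative correlation and is dropped, while the two documents that share query terms are kept. The threshold of zero is an arbitrary choice for the sketch; the abstract does not state the cut-off used in the paper.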