Abstract. The importance of text summarization grows rapidly as the amount of information increases exponentially. In this paper, we present a new method for Persian text summarization based on fractal theory. The main goal of this method is to exploit the hierarchical structure of a document and to adapt that approach to the Persian language. The results show that our method improves the performance of extractive summarization.
The importance of detecting similar documents grows rapidly as the amount of information increases exponentially. This paper presents a new technique for identifying similar documents. It combines statistical properties of documents with Persian linguistic features. The proposed technique is best suited to detecting similar documents in specific fields. The method builds lexical chains of important words and relies on the term co-occurrence property of the text. This prevents irrelevant documents from being identified as similar because of word polysemy. It also takes word order into account when identifying similar documents. If a document covers more than one subject, this can also be detected, and similar documents can be found for each topic of the text. Our results show improved performance compared to existing word-based methods such as LSI and VSM.
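For context, the word-based VSM baseline mentioned above can be sketched as cosine similarity over bag-of-words term-frequency vectors. This is a minimal illustration of the baseline only, not the proposed lexical-chain method; the function names `term_vector` and `cosine_similarity` and the whitespace tokenization are assumptions made for the example.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a term-frequency vector (hypothetical helper).

    Whitespace tokenization is a simplifying assumption; real Persian text
    would need proper tokenization and normalization.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


# Polysemy: the two texts share only the ambiguous word "bank",
# yet a plain VSM still reports a non-zero similarity.
river = term_vector("river bank")
loan = term_vector("bank loan")
polysemy_score = cosine_similarity(river, loan)

# Word order: a bag-of-words vector cannot tell these two texts apart.
order_score = cosine_similarity(term_vector("dog bites man"),
                                term_vector("man bites dog"))
```

In this sketch the two "bank" texts score about 0.5 although they concern different topics, and the reordered sentences score about 1.0. These are the two weaknesses of word-based methods, polysemy and insensitivity to word order, that the abstract says the proposed technique addresses.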