Text mining has become an important research area. Text mining is the discovery by computer of new, previously unknown information by automatically extracting information from different written resources. In this paper, a survey of text mining techniques and applications is presented.
Text summarization is the condensing of a source text into a shorter version that preserves its information content and overall meaning. Manually summarizing large text documents is very difficult for human beings. Text summarization methods can be classified into extractive and abstractive summarization. An extractive summarization method consists of selecting important sentences, paragraphs, etc. from the original document and concatenating them into a shorter form. The importance of sentences is decided based on statistical and linguistic features of the sentences. An abstractive summarization method consists of understanding the original text and retelling it in fewer words. It uses linguistic methods to examine and interpret the text, and then finds new concepts and expressions to best describe it, generating a new, shorter text that conveys the most important information from the original document. In this paper, a survey of extractive text summarization techniques is presented.
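As an illustration of the extractive approach described above, the following is a minimal sketch that scores each sentence by the average corpus frequency of its words (one possible statistical feature) and keeps the top-scoring sentences in their original order. The function names, the tiny stop-word list and the regex-based sentence splitter are assumptions made for this example; they are not taken from the surveyed paper.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in", "it", "are", "on"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def summarize(text, k=2):
    """Return the k highest-scoring sentences, concatenated in document order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole document act as the statistical feature.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)[:k]
    # Restore the original order so the summary reads naturally.
    return " ".join(sentences[i] for i in sorted(ranked))


if __name__ == "__main__":
    doc = ("Text mining finds patterns in text. "
           "The weather was nice. "
           "Mining text needs text data.")
    print(summarize(doc, k=2))
```

Linguistic features such as sentence position or cue phrases could be added as extra terms in `score`; the frequency-only version is kept here for brevity.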
Question Answering (QA) is a specific type of information retrieval. Given a set of documents, a question answering system attempts to find the correct answer to a question posed in natural language. Question answering is multidisciplinary. It involves information technology, artificial intelligence, natural language processing, knowledge and database management, and cognitive science. From the technological perspective, question answering uses natural or statistical language processing, information retrieval, and knowledge representation and reasoning as potential building blocks. It involves text classification, information extraction and summarization technologies. In general, a question answering system (QAS) has three components: question classification, information retrieval, and answer extraction. These components play an essential role in a QAS. Question classification plays the primary role in a QA system by categorizing the question based on the type of its entity. Information retrieval determines the system's success by extracting the applicable answers to the questions posed to the question answering system. Finally, answer extraction is an emerging topic in QAS, as these systems often require ranking and validating candidate answers.
Most question answering systems consist of three main modules: question processing, document processing and answer processing. The question processing module plays an important part in QA systems. If this module does not work correctly, it causes problems for the other modules. Moreover, the answer processing module is an emerging topic in question answering, in which systems are often required to rank and validate candidate answers. These techniques, which aim at discovering short and precise answers, are often based on semantic classification.
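The first two components named above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the rule table `RULES`, the answer-type labels and the word-overlap retrieval in `retrieve` are hypothetical choices for this example, not the method of any surveyed system, and the answer extraction step is omitted.

```python
import re

# Hypothetical rules mapping a leading question cue to an expected answer type.
RULES = [
    (r"^who\b", "PERSON"),
    (r"^where\b", "LOCATION"),
    (r"^when\b", "DATE"),
    (r"^how (many|much)\b", "NUMBER"),
    (r"^why\b", "REASON"),
]


def classify_question(question):
    """Question classification: return the expected answer type, or 'OTHER'."""
    q = question.strip().lower()
    for pattern, label in RULES:
        if re.search(pattern, q):
            return label
    return "OTHER"


def words(text):
    """Set of lowercase word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question, documents):
    """Information retrieval: return the document sharing the most words with the question."""
    q_words = words(question)
    return max(documents, key=lambda d: len(q_words & words(d)))


if __name__ == "__main__":
    docs = [
        "Paris is the capital of France.",
        "The Nile is a river in Africa.",
    ]
    question = "Where is the capital of France?"
    print(classify_question(question))
    print(retrieve(question, docs))
```

An answer extraction step would then search the retrieved passage for a span matching the predicted type and rank the candidates, which is where the validation issues mentioned above arise.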
QA systems give the ability to answer questions posed in natural language by extracting, from a repository of documents, fragments of documents that contain material relevant to the answer.
Sentiment Analysis (SA), an application of Natural Language Processing (NLP), has witnessed blooming interest over the past decade. It is also known as opinion mining, mood extraction and emotion analysis. The basic task in opinion mining is classifying the polarity of text as positive (good), negative (bad) or neutral (surprise). Mood extraction automates the decision making performed by humans. It is an important aspect of capturing public opinion about product preferences, marketing campaigns, political movements, social events and company strategies. In addition to sentiment analysis for English and other European languages, this task is applied to various Indian languages such as Bengali, Hindi, Telugu and Malayalam. This paper surveys the main approaches for performing sentiment extraction.
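The polarity classification task described above can be illustrated with a minimal lexicon-based sketch. The word lists, the negation handling (a negation word flips only the word immediately after it) and the function name `polarity` are assumptions made for this example; the surveyed approaches are typically far richer.

```python
import re

# Tiny illustrative lexicons (assumptions; real lexicons contain thousands of entries).
POSITIVE = {"good", "great", "excellent", "love", "happy"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "sad"}
NEGATIONS = {"not", "no", "never"}


def polarity(text):
    """Classify text as 'positive', 'negative' or 'neutral' by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            # Remember the negation so it can flip the next token only.
            negate = True
            continue
        value = (tok in POSITIVE) - (tok in NEGATIVE)  # +1, -1 or 0
        if value:
            score += -value if negate else value
        negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for review in ("The camera is great and I love it",
                   "This is not good",
                   "It arrived on Monday"):
        print(review, "->", polarity(review))
```

Applying the same idea to another language would mainly require swapping in lexicons and a tokenizer suited to that language.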