Abstract: Rapid progress in digital data acquisition techniques has led to a huge volume of data. More than 80 percent of today's data is unstructured or semi-structured. Discovering appropriate patterns and trends in text documents within such a massive volume of data is a major challenge. Text mining is the process of extracting interesting and nontrivial patterns from large collections of text documents. Different techniques and tools exist to mine text and discover valuable information for prediction and decision making. Selecting an appropriate text mining technique reduces the time and effort required to extract valuable information. This paper briefly discusses and analyzes text mining techniques and their applications in diverse fields. Moreover, it identifies the issues in text mining that affect the accuracy and relevance of results.
Blockchain is an emerging field based on the concept of a digitally distributed ledger and a consensus algorithm, which together remove the threats posed by intermediaries. Its early applications were in the finance sector, but the concept has since been extended to almost all major areas of research, including education, IoT, banking, supply chain, defense, governance, and healthcare. In healthcare, stakeholders (providers, patients, payers, research organizations, and supply chain bearers) demand interoperability, security, authenticity, transparency, and streamlined transactions. Blockchain technology, built over the internet, has the potential to make current healthcare data available in a peer-to-peer, interoperable manner through a patient-centric approach that eliminates third parties. Using this technology, applications can be built to manage and share secure, transparent, and immutable audit trails with reduced systematic fraud. This study reviews existing literature in order to identify the major issues facing various healthcare stakeholders and to explore the features of blockchain technology that could resolve them. However, the technology has some challenges and limitations that future research needs to address.
This research reveals that, of the total 37 epidemiological weeks, the maximum impact was observed between weeks 22 and 27. The geographical flow and hotspots associated with dengue are shown through thematic maps. A positive correlation between the risk of dengue and age was observed. The findings of this research can help health officials and decision-makers alert the public about future outbreaks and take preventive measures to considerably reduce the mortality and morbidity associated with the disease.
In natural language processing, text summarization is an important application used to extract desired information by reducing large text. Existing studies use keyword-based algorithms for grouping text, which do not capture the documents' actual theme. Our proposed dynamic corpus creation mechanism combines metadata with summarized extracted text. The proposed approach analyzes the mesh of multiple unstructured documents and generates a linked set of multiple weighted nodes by applying multistage clustering. We generated adjacency graphs to link the clusters of various collections of documents. This approach comprises ten steps: pre-processing, making multiple corpuses, first-stage clustering, creating sub-corpuses, interlinking sub-corpuses, creating a PageRank keyword dictionary for each sub-corpus, second-stage clustering, path creation among clusters of sub-corpuses, and text processing by forward and backward propagation for results generation. The outcome of this technique consists of sub-corpuses interlinked through clusters. We applied our approach to a news dataset; this interlinked corpus processing follows step-by-step clustering to search the most relevant parts of the corpus at lower cost and in less time, and to improve content detection. During our experimentation, we applied six different metadata processing combinations over multiple text queries to compare results. The comparison of text satisfaction shows that PageRank keywords give 38% related text, single-stage clustering gives 46%, two-stage clustering gives 54%, and the proposed technique gives 67% associated text. Furthermore, this approach searches the relevant data across a range from most to least relevant content. It provides a systematic, query-relevant corpus processing mechanism that automatically selects the most relevant sub-corpus through dynamic path selection.
We used the SHAP model to evaluate the proposed technique, and our evaluation results showed that the proposed mechanism improved text processing. Moreover, combining text summarization features showed satisfactory results compared with the summaries generated by general abstractive and extractive summarization models.
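The abstract above mentions building a PageRank keyword dictionary for each sub-corpus but does not say how. As a rough illustration only, the sketch below assumes a TextRank-style approach: words that co-occur within a small window are linked in an undirected graph, and PageRank scores over that graph rank the keywords. The function names, the stopword list, the window size, and the damping factor are all hypothetical choices made for this example, not details taken from the paper.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list for illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cooccurrence_graph(docs, window=2):
    """Link each word to the words that follow it within `window` positions."""
    graph = defaultdict(set)
    for doc in docs:
        words = tokenize(doc)
        for i, word in enumerate(words):
            for other in words[i + 1 : i + 1 + window]:
                if other != word:
                    graph[word].add(other)
                    graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank on an undirected graph.

    Every node in `graph` has at least one neighbour by construction,
    so there are no dangling nodes to handle.
    """
    nodes = list(graph)
    n = len(nodes)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            incoming = sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            new_rank[node] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


def keyword_dictionary(docs, top_k=5):
    """Return the `top_k` highest-ranked words of one sub-corpus (assumed design)."""
    rank = pagerank(cooccurrence_graph(docs))
    return sorted(rank, key=rank.get, reverse=True)[:top_k]
```

For a toy sub-corpus such as `["market news", "market prices", "market report", "market trends"]`, the word `market` is linked to every other word and therefore receives the highest score. In a pipeline like the one described, such a dictionary might be computed once per sub-corpus and then matched against query terms, though the paper's actual procedure may differ.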