Association rule mining is an important technique for finding significant relationships in large datasets. Several frequent itemset mining techniques have been proposed that use the FP-tree, a prefix-tree structure that provides a compressed representation of the database. The DIFFset data structure has also been shown to significantly reduce the run time and memory utilization of some data mining algorithms. Experimental results have demonstrated the efficiency of both data structures in frequent itemset mining. This work proposes FDM, a new algorithm based on the FP-tree and DIFFset data structures for efficiently discovering frequent patterns in data. FDM can adapt its characteristics to efficiently mine long and short patterns from both dense and sparse datasets. Several optimization techniques are also outlined to increase the efficiency of FDM. FDM was evaluated against three frequent itemset mining algorithms, dEclat, FP-growth, and FDM* (FDM without optimization), using datasets with both long and short frequent patterns. The experimental results show a significant performance improvement over the FP-growth, dEclat, and FDM* algorithms.
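To illustrate the DIFFset idea mentioned in this abstract, the following is a minimal sketch and not the FDM algorithm itself. It assumes the usual dEclat formulation, in which the support of an extended itemset is obtained by subtracting the size of a set difference instead of intersecting full transaction-ID sets. The function names and the toy transactions are hypothetical.

```python
from collections import defaultdict

def vertical(transactions):
    """Build a vertical layout: item -> set of transaction IDs containing it."""
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)

def pair_support_via_diffset(tidsets, x, y):
    """Support of {x, y} using the diffset d(xy) = t(x) - t(y)."""
    d_xy = tidsets[x] - tidsets[y]
    return len(tidsets[x]) - len(d_xy)

def triple_support_via_diffsets(tidsets, x, y, z):
    """Support of {x, y, z} using d(xyz) = d(xz) - d(xy)."""
    d_xy = tidsets[x] - tidsets[y]
    d_xz = tidsets[x] - tidsets[z]
    d_xyz = d_xz - d_xy
    support_xy = len(tidsets[x]) - len(d_xy)
    return support_xy - len(d_xyz)

# Toy data, invented for illustration only.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
tidsets = vertical(transactions)
print(pair_support_via_diffset(tidsets, "a", "b"))
print(triple_support_via_diffsets(tidsets, "a", "b", "c"))
```

On dense data the diffsets tend to be much smaller than the corresponding transaction-ID sets, which is where the reported memory and run-time savings would be expected to come from.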
Mobile phone technology initiatives are revolutionizing healthcare delivery in Africa and other developing regions. M-health services have transformed maternal health, the management of communicable diseases such as Ebola, and the prevention of chronic diseases. Technological innovations in m-health have improved healthcare efficiency and effectiveness and extended health services to remote locations in rural African communities. This paper describes a ubiquitous m-health system that is based on the user-centric paradigm of Mobile Cloud Computing (MCC) and Android medical data mining techniques. The development of ultra-fast 4G mobile networks and sophisticated smartphones and tablets has brought the cloud computing paradigm to the mobile domain. The system comprises an Android client for breast bio-data collection, a data mining component based on the Naïve Bayes probabilistic classifier (NBC) algorithm for predicting malignancy in breast tissue, and server-side MCC data storage. Experimental results indicate that the Android Naïve Bayes classifier achieves 96.4% accuracy on Wisconsin Breast Cancer (WBC) data from the UCI Machine Learning Repository.
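The Naïve Bayes classification step described in this abstract can be sketched roughly as follows. This is a generic categorical Naïve Bayes with Laplace smoothing, written under the assumption that each feature takes a small set of discrete values; it is not the authors' Android implementation, and the class name, parameters, and toy rows are hypothetical rather than taken from the WBC data.

```python
import math
from collections import Counter, defaultdict

class CategoricalNaiveBayes:
    """Naive Bayes over discrete features with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        # value_counts[(label, j)][v] = rows of class `label` with feature j == v
        self.value_counts = defaultdict(Counter)
        # Distinct values observed for each feature, used for smoothing.
        self.values = [set() for _ in rows[0]]
        for row, label in zip(rows, labels):
            for j, v in enumerate(row):
                self.value_counts[(label, j)][v] += 1
                self.values[j].add(v)
        return self

    def log_scores(self, row):
        """Unnormalized log posterior for each class."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            for j, v in enumerate(row):
                num = self.value_counts[(label, j)][v] + self.alpha
                den = count + self.alpha * len(self.values[j])
                score += math.log(num / den)  # log likelihood of feature j
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)

# Toy rows invented for illustration; two discrete features per sample.
rows = [(1, 1), (2, 1), (1, 2), (8, 7), (9, 8), (10, 7)]
labels = ["benign"] * 3 + ["malignant"] * 3
model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict((1, 1)), model.predict((9, 7)))
```

Working in log space avoids numerical underflow when many per-feature probabilities are multiplied together, and the smoothing term keeps unseen feature values from producing a zero probability.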
Health data collection is one of the major components of public health systems. Decision makers, policy makers, and medical service providers need accurate and timely data in order to improve the quality of health services. The rapid growth and use of mobile technologies has increased demand for mobile-based data collection solutions to bridge the information gaps in the health sector. We propose a prototype built on open-source data collection frameworks and test its feasibility for improving vaccination data collection in Kenya. KenVACS, the proposed prototype, supports collecting vaccination data through mobile phones and visualizes the collected data in a web application; the system also sends short message service (SMS) reminders to parents about the date of the next vaccination. Early evaluation demonstrates the benefits of such a system in supporting and improving the vaccination of children. Finally, we conducted a qualitative study to assess challenges in remote health data collection and evaluated the usability and functionality of KenVACS.
Frequent pattern mining (FPM) is a very important technique in data mining and has a wide range of practical applications. Equivalence Class Clustering (Eclat) has been identified as one of the most efficient FPM algorithms. We present P-Eclat, a novel parallel FPM algorithm that improves on the Eclat algorithm by employing a partial breadth-first search to achieve maximum parallelism. Our approach uses a TIDset representation of the vertical transaction lists across multiple threads on a CPU. Current parallelization techniques for mining frequent patterns do not fully exploit the benefits of multi-core shared-memory machines. Our parallel mining approach reduces synchronization requirements, maximizes data independence, and enhances scalability. We also introduce several optimization techniques to improve the algorithm's performance. Experimental results show that the P-Eclat algorithm outperforms both the Eclat and dEclat algorithms.
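The general idea of a TIDset-based Eclat with a breadth-first first level and independent per-class mining can be sketched as below. This is an illustrative decomposition only, not the P-Eclat algorithm: the function names, the task split by single-item prefix, and the toy transactions are assumptions. In CPython the global interpreter lock prevents the set intersections from running truly in parallel on threads, so the sketch shows how the work becomes independent rather than demonstrating a speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def eclat_class(prefix, members, min_support, out):
    """Depth-first mining of one equivalence class.

    members: list of (item, tidset) pairs; each tidset is already the
    TIDset of prefix + (item,).
    """
    for i, (item, tids) in enumerate(members):
        itemset = prefix + (item,)
        out[itemset] = len(tids)
        extensions = []
        for other, other_tids in members[i + 1:]:
            common = tids & other_tids  # TIDset intersection
            if len(common) >= min_support:
                extensions.append((other, common))
        if extensions:
            eclat_class(itemset, extensions, min_support, out)

def mine_class(task):
    """Worker: mine one class into a private dict, so no locking is needed."""
    prefix, members, min_support = task
    out = {}
    eclat_class(prefix, members, min_support, out)
    return out

def parallel_eclat(transactions, min_support, workers=4):
    # Vertical layout: item -> set of transaction IDs.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    frequent = sorted(
        ((i, t) for i, t in tidsets.items() if len(t) >= min_support),
        key=lambda pair: pair[0],
    )
    # Breadth-first step: frequent 1-itemsets, then one task per prefix item.
    result = {(item,): len(tids) for item, tids in frequent}
    tasks = []
    for i, (item, tids) in enumerate(frequent):
        members = []
        for other, other_tids in frequent[i + 1:]:
            common = tids & other_tids
            if len(common) >= min_support:
                members.append((other, common))
        if members:
            tasks.append(((item,), members, min_support))
    # Each class is mined independently; results are merged at the end.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(mine_class, tasks):
            result.update(partial)
    return result

# Toy data, invented for illustration only.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
patterns = parallel_eclat(transactions, min_support=3)
print(sorted(patterns.items()))
```

Because every worker writes only to its own dictionary and the merge happens after the workers finish, the only synchronization point is the final join, which mirrors the abstract's emphasis on reduced synchronization and data independence.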