This paper surveys a variety of techniques, approaches and research areas that are important to the field of data mining technologies. Many multinational companies and large organizations operate in different locations across different countries, and each location may generate large volumes of data. Corporate decision makers require access to all of these sources in order to take strategic decisions. The data warehouse delivers significant business value by improving the effectiveness of managerial decision-making. In an uncertain and highly competitive business environment, the value of strategic information systems such as these is easily recognized; however, in today's business environment, efficiency or speed is not the only key to competitiveness. Data is now available at the scale of terabytes to petabytes, which has drastically changed the areas of science and engineering. To analyze, manage and make decisions from such huge amounts of data, we need data mining techniques, which are transforming many fields. This paper presents a number of applications of data mining and also outlines the scope of data mining, which will be helpful for further research.
Healthcare diagnosis systems are playing a prominent role as technology advances. Decision support systems may generate fruitful outcomes for better diagnosis of breast cancer. This paper describes BCD-NFIS, which reduces the use of dataset features by means of fuzzy networks and achieves an accuracy of 98.24%, a better result than older approaches. BCD-NFIS uses the methodology of inference systems, neural fuzzy logic and BCD to address these problems. This would be very helpful for rising physicians, simplifying diagnosis patterns for breast cancer through the use of information technology.
Many techniques have been proposed to implement the Apriori algorithm on the MapReduce framework, but only a few have focused on performance improvement. The FPC (Fixed Passes Combined-counting) and DPC (Dynamic Passes Combined-counting) algorithms combine multiple passes of Apriori in a single MapReduce phase to reduce the execution time. In this paper, we propose improved MapReduce-based Apriori algorithms, VFPC (Variable Size based Fixed Passes Combined-counting) and ETDPC (Elapsed Time based Dynamic Passes Combined-counting), over FPC and DPC. Further, we optimize the multi-pass phases of these algorithms by skipping the pruning step in some passes, and propose the Optimized-VFPC and Optimized-ETDPC algorithms. Quantitative analysis reveals that the cost of counting the additional un-pruned candidates produced by skipped pruning is less significant than the reduction in computation cost that skipping achieves. Experimental results show that VFPC and ETDPC are more robust and flexible than FPC and DPC, whereas their optimized versions are more efficient in terms of execution time.
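The idea of combining several Apriori passes into one counting phase can be sketched on a single machine as follows. This is only an illustration under stated assumptions, not the FPC, DPC, VFPC or ETDPC algorithms themselves: the function names (`join`, `prune`, `combined_candidates`, `map_phase`, `reduce_phase`) and the `skip_pruning` flag are hypothetical, and the map and reduce steps are plain Python functions rather than Hadoop jobs.

```python
from collections import Counter
from itertools import combinations


def join(itemsets):
    """Self-join sorted k-itemsets that share a (k-1)-prefix into (k+1)-candidates."""
    prev = sorted(itemsets)
    out = set()
    for a, b in combinations(prev, 2):
        if a[:-1] == b[:-1]:
            out.add(a + (b[-1],))
    return out


def prune(candidates, previous):
    """Keep candidates whose every subset one item shorter is in `previous`."""
    return {
        c for c in candidates
        if all(s in previous for s in combinations(c, len(c) - 1))
    }


def combined_candidates(frequent_k, passes, skip_pruning=False):
    """Generate candidates for several consecutive passes without counting in between.

    Later levels are built from the previous level's candidates, because their
    true supports are not known until the single combined counting phase runs.
    """
    levels = []
    current = set(frequent_k)
    for _ in range(passes):
        cands = join(current)
        if not skip_pruning:  # hypothetical switch illustrating the skipped-pruning idea
            cands = prune(cands, current)
        if not cands:
            break
        levels.append(cands)
        current = cands
    return levels


def map_phase(split, candidates):
    """Mapper: emit (candidate, 1) for each candidate contained in a transaction."""
    for transaction in split:
        items = set(transaction)
        for c in candidates:
            if items.issuperset(c):
                yield c, 1


def reduce_phase(pairs, min_support):
    """Reducer: sum the counts per candidate and keep the frequent ones."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return {k: v for k, v in totals.items() if v >= min_support}


# Usage: two input splits, passes 2 and 3 counted in one simulated phase.
splits = [
    [["a", "b", "c"], ["a", "b"], ["a", "c"]],
    [["b", "c"], ["a", "b", "c"]],
]
frequent_1 = {("a",), ("b",), ("c",)}
levels = combined_candidates(frequent_1, passes=2)
all_candidates = set().union(*levels)
pairs = [p for split in splits for p in map_phase(split, all_candidates)]
frequent = reduce_phase(pairs, min_support=3)
```

With `skip_pruning=True` the candidate sets can only grow, so more itemsets are counted in the scan; the trade-off described in the abstract is that this extra counting may cost less than the pruning work it avoids.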
In the modern age, recommendation methods play an important role in many prominent applications. Recommendation systems have brought applications together, created a global village and provided ample information for development. This paper presents an overview of the approaches and techniques developed for collaborative filtering within the recommendation framework. The main recommendation methods are collaborative filtering, content-based methods and hybrid methods. Collaborative filtering is particularly effective at producing personalised recommendations. Many algorithms have been proposed over ten years of study, but the various strategies have not been systematically compared. Indeed, there is not yet a widely agreed way to evaluate a collaborative filtering algorithm. In this work we compare various techniques from the literature and review each one’s characteristics to emphasise their key strengths and weaknesses.
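As a concrete illustration of collaborative filtering, the following minimal sketch predicts a rating from the ratings of similar users. It assumes one common variant (user-based filtering with cosine similarity), which is not necessarily among the techniques the paper compares; the names `cosine` and `predict` and the sample ratings are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two users' rating dictionaries (item -> rating)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user with a non-zero similarity has rated the item.
    """
    numerator = 0.0
    denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        similarity = cosine(ratings[user], other_ratings)
        numerator += similarity * other_ratings[item]
        denominator += abs(similarity)
    return numerator / denominator if denominator else None


# Usage: estimate how user "u1" would rate item "z".
ratings = {
    "u1": {"x": 5, "y": 4},
    "u2": {"x": 5, "y": 4, "z": 5},
    "u3": {"x": 1, "y": 2, "z": 1},
}
estimate = predict(ratings, "u1", "z")
```

The prediction always lies between the lowest and highest rating that the neighbours gave the item, since it is a weighted average with non-negative weights here.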
Mining frequent itemsets from massive datasets has always been one of the most important problems in data mining. Apriori is the most popular and simplest algorithm for frequent itemset mining. To enhance the efficiency and scalability of Apriori, a number of algorithms have been proposed that address the design of efficient data structures, the minimization of database scans, and parallel and distributed processing. MapReduce is an emerging parallel and distributed technology for processing big datasets on a Hadoop cluster. To mine big datasets, it is essential to re-design data mining algorithms for this new paradigm. In this paper, we implement three variations of the Apriori algorithm on the MapReduce paradigm, using the data structures hash tree, trie and hash table trie (i.e. a trie with a hash technique). We emphasize and investigate the significance of these three data structures for the Apriori algorithm on a Hadoop cluster, which has not yet been given attention. Experiments carried out on both real-life and synthetic datasets show that the hash table trie performs far better than the trie and the hash tree in terms of execution time. Moreover, the hash tree gives the worst performance.
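To make the role of the candidate data structure concrete, here is a minimal single-machine sketch of Apriori that stores candidates in a trie and counts supports by walking each transaction through it. It is an assumption-laden illustration, not the paper's implementation: each node keeps its children in a Python dictionary, which loosely resembles a trie with hashed child lookup, and all names (`TrieNode`, `insert`, `count_subsets`, `collect`, `apriori`) are hypothetical.

```python
from itertools import combinations


class TrieNode:
    def __init__(self):
        self.children = {}  # item -> TrieNode, looked up by hashing the item
        self.count = 0      # support count, used only at depth k


def insert(root, itemset):
    """Store one sorted candidate itemset as a root-to-leaf path."""
    node = root
    for item in itemset:
        node = node.children.setdefault(item, TrieNode())


def count_subsets(node, transaction, start, depth, k):
    """Increment the count of every stored k-candidate contained in the transaction."""
    if depth == k:
        node.count += 1
        return
    for i in range(start, len(transaction)):
        child = node.children.get(transaction[i])
        if child is not None:
            count_subsets(child, transaction, i + 1, depth + 1, k)


def collect(node, prefix, k, out):
    """Read the (itemset, count) pairs back out of the trie."""
    if len(prefix) == k:
        out[tuple(prefix)] = node.count
        return
    for item, child in node.children.items():
        collect(child, prefix + [item], k, out)


def apriori(transactions, min_support):
    """Return a dict mapping each frequent itemset (sorted tuple) to its support."""
    transactions = [sorted(set(t)) for t in transactions]
    counts = {}
    for t in transactions:
        for item in t:
            counts[(item,)] = counts.get((item,), 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        # Join step: merge (k-1)-itemsets sharing a prefix; prune step: check subsets.
        candidates = []
        for a, b in combinations(sorted(frequent), 2):
            if a[:-1] == b[:-1]:
                cand = a + (b[-1],)
                if all(s in frequent for s in combinations(cand, k - 1)):
                    candidates.append(cand)
        if not candidates:
            break
        root = TrieNode()
        for cand in candidates:
            insert(root, cand)
        for t in transactions:
            count_subsets(root, t, 0, 0, k)
        counted = {}
        collect(root, [], k, counted)
        frequent = {s: c for s, c in counted.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Usage: five small transactions with a minimum support count of 3.
data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
supports = apriori(data, min_support=3)
```

In a MapReduce setting, the counting loop over transactions would run inside each mapper on its own input split, with the reducers summing the per-split counts.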