Abstract: The term “big data” refers to a large amount of data, and big data management refers to the efficient handling, organization, or use of large volumes of structured and unstructured data belonging to an organization. Because raw data keep accumulating, extracting knowledge from big data is very difficult for most classical data mining and machine learning tools. In a previous paper, the correlative naive Bayes (CNB) classifier was developed for big data classification. This work combines fuzzy theory with the CNB classifier to develop the fuzzy CNB (FCNB) classifier. The proposed FCNB classifier addresses the big data classification problem within the MapReduce framework and thereby achieves improved classification results. Initially, the database is converted to a probabilistic index table, in which data and attributes are presented in rows and columns, respectively. Then, the membership degree of each unique symbol present in each attribute of the data is computed. Finally, the proposed FCNB classifier determines the class of the data based on the training information. The FCNB classifier is evaluated on the localization and skin segmentation datasets. The results are analyzed using sensitivity, specificity, and accuracy and compared with those of various existing works.
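The three steps described in the abstract (building a count table, deriving a per-symbol degree for each attribute, and classifying from the training information) can be illustrated with a minimal sketch. Everything specific here is an assumption: the function names (`map_counts`, `reduce_counts`, `train`, `membership`, `classify`) and the toy data are hypothetical, the map and reduce steps are plain Python functions standing in for a MapReduce framework, and the "membership degree" is assumed to be the Laplace-smoothed relative frequency of a symbol within a class, which makes the score coincide with an ordinary categorical naive Bayes. The abstract gives neither the actual fuzzy membership function nor the correlation term of CNB, so neither is reproduced.

```python
import math
from collections import Counter, defaultdict
from functools import reduce


def map_counts(partition):
    """Map step: count class labels and (attribute, symbol, class) triples in one partition."""
    class_counts = Counter()
    symbol_counts = Counter()
    for row, label in partition:
        class_counts[label] += 1
        for attr, symbol in enumerate(row):
            symbol_counts[(attr, symbol, label)] += 1
    return class_counts, symbol_counts


def reduce_counts(left, right):
    """Reduce step: merge the count tables produced by two partitions."""
    return left[0] + right[0], left[1] + right[1]


def train(partitions):
    """Run map and reduce over all partitions and collect the unique symbols per attribute."""
    class_counts, symbol_counts = reduce(reduce_counts, map(map_counts, partitions))
    symbols = defaultdict(set)
    for attr, symbol, _label in symbol_counts:
        symbols[attr].add(symbol)
    return class_counts, symbol_counts, symbols


def membership(model, attr, symbol, label):
    """Assumed membership degree: smoothed share of class `label` rows showing `symbol`."""
    class_counts, symbol_counts, symbols = model
    count = symbol_counts[(attr, symbol, label)]
    return (count + 1) / (class_counts[label] + len(symbols[attr]))


def classify(model, row):
    """Return the class with the highest log prior plus summed log membership degrees."""
    class_counts = model[0]
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = math.log(count / total)
        for attr, symbol in enumerate(row):
            score += math.log(membership(model, attr, symbol, label))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data split into two partitions, as if stored on two nodes.
partitions = [
    [(("red", "small"), "skin"), (("red", "large"), "skin")],
    [(("blue", "small"), "non-skin"), (("blue", "large"), "non-skin"),
     (("red", "small"), "skin")],
]
model = train(partitions)
print(classify(model, ("red", "small")))
print(classify(model, ("blue", "large")))
```

Because the map step only produces counts and the reduce step only adds them, the partitions can be processed independently and merged in any order, which is the property that makes this kind of classifier a natural fit for MapReduce.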
The process of big data handling refers to the efficient management of the storage and processing of a very large volume of data. Data in structured and unstructured formats require a specific approach for overall handling. The classifiers analyzed in this paper are the correlative naïve Bayes classifier (CNB), Cuckoo Grey wolf CNB (CGCNB), Fuzzy CNB (FCNB), and Holoentropy CNB (HCNB). These classifiers are based on the Bayesian principle and work accordingly. The CNB is developed by extending the standard naïve Bayes classifier with correlation among the attributes, so that it becomes a dependent hypothesis. The cuckoo search and grey wolf optimization algorithms are integrated with the CNB classifier, and a significant performance improvement is achieved. The resulting classifier is called the cuckoo grey wolf correlative naïve Bayes classifier (CGCNB). In addition, the performance of the FCNB and HCNB classifiers is compared with that of CNB and CGCNB in terms of accuracy, sensitivity, specificity, memory, and execution time.
Modern systems such as the Internet of Things, cloud computing, and sensor networks generate huge data archives. Extracting knowledge from these archives requires modified approaches to algorithm design. The field that studies the analysis of such data is called big data analytics; it helps to optimize performance at reduced cost and to retrieve information efficiently. Traditional data analytics must be enhanced to suit big data analytics because it may not be able to manage such huge amounts of data. The real question is how to design data mining algorithms suitable for big data analysis. This paper discusses data analytics at an introductory level, beginning with insights into the analysis process for big data. Big data analytics is a current research frontier in the knowledge extraction field. This paper highlights the challenges and problems associated with big data analysis and provides insights into several of the techniques and methods used.
In recent years, big data has played a vital role in information and knowledge analysis, prediction, and manipulation. Moreover, big data is well known for the systematic extraction and analysis of large or complex databases. Furthermore, it is more widely useful in data management than conventional data processing approaches. Big data is growing steadily, so that traditional software tools face various issues when handling it. However, data imbalance in huge databases is a main limitation in this research area. Scaling up to huge databases is a very challenging task in the big data era. In this paper, a Grey wolf Shuffled Shepherd Optimization Algorithm (GWSSOA)-based Deep Recurrent Neural Network (DRNN) algorithm is devised for big data classification. In this technique, a hybrid classifier, combining the Holoentropy-based Correlative Naive Bayes (HCNB) classifier and the DRNN classifier, is introduced for the classification of big data.
The process of big data handling refers to the efficient management of the storage and processing of very large volumes of data. Data in structured and unstructured formats require a specific approach for overall handling. The classifiers analyzed in this paper are the correlative naïve Bayes classifier (CNB), Cuckoo Grey wolf CNB (CGCNB), Fuzzy CNB (FCNB), and Holoentropy CNB (HCNB). These classifiers are based on the Bayesian principle and work accordingly. The CNB is developed by extending the standard naïve Bayes classifier with correlation among the attributes, so that it becomes a dependent hypothesis. The cuckoo search and grey wolf optimization algorithms are integrated with the CNB classifier, and a significant performance improvement is achieved. The resulting classifier is called the cuckoo grey wolf correlative naïve Bayes classifier (CGCNB). Further performance improvements are achieved by incorporating fuzzy theory, giving the fuzzy correlative naïve Bayes classifier (FCNB), and holoentropy theory, giving the holoentropy correlative naïve Bayes classifier (HCNB). The FCNB and HCNB classifiers are compared with CNB and CGCNB and achieve noticeable performance in terms of accuracy, sensitivity, and specificity.
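The accuracy, sensitivity, and specificity used to compare these classifiers follow the standard confusion-matrix definitions for a binary task. The sketch below shows how they might be computed; the function name `binary_metrics` and the example labels are hypothetical and are not results from any of the papers.

```python
import math


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, and specificity from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN rather than raising.
        return numerator / denominator if denominator else math.nan

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when classes are imbalanced, since a classifier can reach high accuracy while missing most of the minority class.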