Recommendation systems suggest relevant items to a user based on the similarity between users or between items. In a collaborative filtering approach for generating recommendations, there is a symmetry between the users: if user A has similar interests to user B, then an item liked by B can be recommended to A, and vice versa. To provide fast, high-quality recommendations, a recommender system may generate and maintain clusters of existing users or items. In this research work, a hybrid sparrow clustered (HSC) recommender system is developed and applied to the MovieLens dataset to demonstrate its effectiveness and efficiency. The proposed HSC method is also compared with other methods. Precision, recall, accuracy, and mean absolute error were used to evaluate the HSC collaborative movie recommender system. The experimental results on the MovieLens dataset show that the proposed method is promising in terms of scalability, performance, and personalized movie recommendations.
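The user symmetry described in this abstract can be illustrated with a minimal sketch of user-based collaborative filtering. This is not the HSC method, whose clustering step is not described here. The toy ratings, the `like_threshold` parameter, and the choice of cosine similarity with a single nearest neighbour are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not taken from MovieLens.
ratings = {
    "A": {"m1": 5, "m2": 4, "m5": 4},
    "B": {"m1": 5, "m2": 4, "m3": 5},
    "C": {"m1": 1, "m4": 5},
}

def cosine_similarity(u, v):
    """Cosine similarity between two rating dicts; unrated items count as 0."""
    dot = sum(u[i] * v[i] for i in set(u) & set(v))
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(user, like_threshold=4):
    """Recommend items liked by the most similar other user and unseen by `user`."""
    others = [u for u in ratings if u != user]
    neighbor = max(others, key=lambda u: cosine_similarity(ratings[user], ratings[u]))
    return sorted(
        item
        for item, rating in ratings[neighbor].items()
        if rating >= like_threshold and item not in ratings[user]
    )

# A and B are each other's nearest neighbour, so recommendations flow both ways.
print(recommend("A"))
print(recommend("B"))
```

Because the similarity function is symmetric in its arguments, an item liked by B can reach A and an item liked by A can reach B, which is the property the abstract points to. A clustered recommender would restrict the neighbour search to the user's cluster instead of scanning all users.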
In this study, the authors propose an optimized density-based algorithm for anomaly detection, with a focus on high-dimensional datasets. The optimization is achieved by tuning the input parameters of the algorithm with the firefly meta-heuristic. The performance of different similarity measures, including both the L1 and L2 norms, is compared to identify the most efficient similarity measure for high-dimensional datasets. The algorithm is further optimized for speed and scalability by running it on the Apache Spark big data platform. The experiments were conducted on publicly available datasets, and the results were evaluated using performance metrics such as execution time, accuracy, sensitivity, and specificity.
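To make the comparison of similarity measures concrete, the following sketch shows how the choice between the L1 and L2 norms can change which points a simple density test flags. The abstract does not name the density-based algorithm, so the neighbour-count rule below, the sample points, and the parameters `eps` and `min_neighbors` are hypothetical stand-ins, not the authors' method.

```python
from math import sqrt

def l1_distance(a, b):
    """Manhattan (L1) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length points."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def low_density_points(points, eps, min_neighbors, distance):
    """Indices of points with fewer than `min_neighbors` other points within `eps`."""
    flagged = []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points) if j != i and distance(p, q) <= eps
        )
        if neighbors < min_neighbors:
            flagged.append(i)
    return flagged

# Hypothetical data: a tight square of four points plus one distant point.
points = [(0.0, 0.0), (0.5, 0.5), (0.5, 0.0), (0.0, 0.5), (5.0, 5.0)]

# Same eps and min_neighbors, different norm.
print(low_density_points(points, eps=0.8, min_neighbors=3, distance=l2_distance))
print(low_density_points(points, eps=0.8, min_neighbors=3, distance=l1_distance))
```

With these settings the L2 norm flags only the distant point, while the L1 norm flags every point, because the diagonal of the square is about 0.71 under L2 but 1.0 under L1. Parameters such as `eps` and `min_neighbors` are the kind of inputs a meta-heuristic search could tune, although how the firefly algorithm is applied in the study is not specified in the abstract.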
We are now in the era of big data, and there is a growing demand for tools that can process and analyze it. Big data analytics deals with extracting valuable information from complex data that cannot be handled by traditional data mining tools. This paper surveys the available tools that can handle large volumes of data as well as evolving data streams. Data mining tools and algorithms that can handle big data are also summarized, and one of the tools is used to mine large datasets with distributed algorithms.