Graph-Based Data Mining (GBDM) is a research area concerned with mining significant information, in the form of frequent subgraphs, from graph data. One well-known algorithm for extracting frequent patterns is the GASTON algorithm. Retrieving interesting webpages from log files is valuable for a variety of applications. This work proposes a webpage recommendation system that introduces the Chronological Cuckoo Search (Chronological-CS) algorithm and the Laplace correction based k-Nearest Neighbor (LKNN) algorithm to retrieve useful webpages from among the interesting webpages. First, the W-Gaston algorithm extracts interesting subgraphs from the log files and passes them to the proposed recommendation system. The interesting subgraphs are then clustered with the proposed Chronological-CS algorithm, which integrates the chronological concept into the Cuckoo Search (CS) algorithm, yielding a set of cluster groups. Next, the proposed LKNN algorithm recommends webpages from the clusters. The proposed recommendation algorithm is simulated on data from the MSNBC and weblog databases. The results are compared with existing webpage recommendation models in terms of precision, recall, and F-measure. The proposed model outperforms the existing models, achieving precision, recall, and F-measure values of 0.9194, 0.8947, and 0.86736, respectively.
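The abstract does not spell out how LKNN is formulated. As a minimal sketch, assuming that "Laplace correction" means additive smoothing of the neighbor vote counts, the recommendation step could look like the following. The function name `laplace_knn`, the parameter `alpha`, the Euclidean distance, and the feature vectors are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def laplace_knn(query, points, labels, k=3, alpha=1.0):
    """Pick a cluster label for `query` from its k nearest neighbors.

    Assumption: votes are smoothed as (count + alpha) / (n + alpha * C),
    where n is the number of neighbors used and C the number of classes.
    """
    classes = sorted(set(labels))
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    neighbors = order[:k]
    votes = Counter(labels[i] for i in neighbors)
    denom = len(neighbors) + alpha * len(classes)
    # Smoothed class probabilities; classes with no votes still get mass.
    probs = {c: (votes[c] + alpha) / denom for c in classes}
    best = max(classes, key=lambda c: probs[c])
    return best, probs


# Hypothetical 2-D features for pages already assigned to clusters "A" and "B".
points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5)]
labels = ["A", "A", "A", "B", "B"]
best, probs = laplace_knn((0.2, 0.2), points, labels, k=3)
print(best, probs)
```

With smoothing, a cluster that receives no votes among the k neighbors still gets a small nonzero probability, which avoids hard zeros when k is small.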
Graph-Based Data Mining (GBDM) is an emerging research topic concerned with retrieving essential information from graph databases. Many algorithms exist for finding frequent patterns in a given graph database. One such algorithm, GASTON, uses frequency-based support to discover frequent patterns. However, the discovery phase of the Gaston algorithm is time-consuming, and the algorithm ignores the pages that captured the users' interest. This paper proposes the Weighted-Gaston (W-Gaston) algorithm, a modification of the existing Gaston algorithm. Four interestingness measures, based on frequency, entropy, and page duration, are developed for retrieving interesting subgraphs. They comprise four types of support: (1) support based on the page duration (W-Support), (2) support based on the entropy (E-Support), (3) support based on the page duration and the entropy (WE-Support), and (4) support based on the frequency, page duration, and the entropy (FWE-Support). The proposed work is simulated using the MSNBC and weblog databases. The experimental results show that the proposed algorithm performs well compared with the existing algorithms.
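The abstract names the support measures but does not define them. The sketch below only illustrates the contrast between frequency-based support and a duration-weighted support under simplifying assumptions: a session is a list of hypothetical `(page, seconds)` pairs, a pattern is a set of pages, and set containment stands in for subgraph matching. The formulas are illustrative and are not the paper's definitions; the entropy-based measures are omitted because the abstract gives no basis for them.

```python
def contains(pattern, session):
    """True if every page of the pattern occurs in the session."""
    return pattern <= {page for page, _ in session}


def frequency_support(pattern, sessions):
    """Fraction of sessions that contain the pattern."""
    hits = sum(1 for s in sessions if contains(pattern, s))
    return hits / len(sessions)


def duration_support(pattern, sessions):
    """Assumed W-Support: share of total viewing time spent on the
    pattern's pages within sessions that contain the pattern."""
    total = sum(sec for s in sessions for _, sec in s)
    matched = sum(
        sec
        for s in sessions
        if contains(pattern, s)
        for page, sec in s
        if page in pattern
    )
    return matched / total


# Hypothetical log: three sessions with per-page durations in seconds.
sessions = [
    [("home", 10), ("news", 30)],
    [("home", 5), ("sports", 15)],
    [("news", 20), ("home", 20)],
]
pattern = {"home", "news"}
print(frequency_support(pattern, sessions))
print(duration_support(pattern, sessions))
```

Under these assumptions, a pattern that appears in few sessions can still score highly on the duration-weighted measure when users spend a long time on its pages, which is the kind of interest a purely frequency-based support would miss.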