Web Usage mining is a technique used to identify the user needs from the web log. Discovering hidden patterns from the logs is an upcoming research area. Association rules play an important role in many web mining applications to detect interesting patterns. However, it generates enormous rules that cause researchers to spend ample time and expertise to discover the really interesting ones. This paper works on the server logs from the MSNBC dataset for the month of September 1999. This research aims at predicting the probable subsequent page in the usage of web pages listed in this data based on their navigating behaviour by using Apriori prefix tree (PT) algorithm. The generated rules were ranked based on the support, confidence and lift evaluation measures. The final predictions revealed that the interestingness of pages mainly depended on the support and lift measure whereas confidence assumed a uniform value among all the pages. It proved that the system guaranteed 100% confidence with the support of 1.3E−05. It revealed that the pages such as Front page, On-air, News, Sports and BBS attracted more interested subsequent users compared to Travel, MSN-News and MSN-Sports which were of less interest.
Machine Learning techniques have been useful in almost every field of concern. Data Mining, a branch of Machine Learning is one of the most extensively used techniques. The ever-increasing demands in the field of medicine are being addressed by computational approaches in which Big Data analysis, image processing and data mining are on top priority. These techniques have been exploited in the domain of ophthalmology for better retinal fundus image analysis. Blood vessels, one of the most significant retinal anatomical structures are analysed for diagnosis of many diseases like retinopathy, occlusion and many other vision threatening diseases. Vessel segmentation can also be a pre-processing step for segmentation of other retinal structures like optic disc, fovea, microneurysms, etc. In this paper, blood vessel segmentation is attempted through image processing and data mining techniques. The retinal blood vessels were segmented through color space conversion and color channel extraction, image pre-processing, Gabor filtering, image postprocessing, feature construction through application of principal component analysis, k-means clustering and first level classification using Naïve-Bayes classification algorithm and second level classification using C4.5 enhanced with bagging techniques. Association of every pixel against the feature vector necessitates Big Data analysis. The proposed methodology was evaluated on a publicly available database, STARE. The results reported 95.05% accuracy on entire dataset; however the accuracy was 95.20% on normal images and 94.89% on pathological images. A comparison of these results with the existing methodologies is also reported. This methodology can help ophthalmologists in better and faster analysis and hence early treatment to the patients.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.