This paper provides an updated evaluation of web mining research and techniques. Recent advances are reviewed in each of the three types of web mining: web content mining, web usage mining, and web structure mining. For each tabulated research work, we examine key issues such as the web mining process, methods and techniques, applications, data sources, and software used. Unlike previous investigators, we divide the web mining process into five subtasks: (1) resource finding and retrieving, (2) information selection and preprocessing, (3) pattern analysis and recognition, (4) validation and interpretation, and (5) visualization. The paper also compares and summarizes selected web mining software: SPSS Clementine, Megaputer PolyAnalyst, ClickTracks by web analytics, and QL2 by QL2 Software Inc. Applications of these software packages to available data sets are discussed and illustrated with numerous screenshots, followed by conclusions and future research directions.
This chapter discusses what a supercomputer really is and surveys both the few fastest supercomputers in the world and the overall top 500. It also covers cognitive science research that uses supercomputers for artificial intelligence, the architectural classes of supercomputers, and global supercomputing comparisons across countries, presented with tables and graphs. Supercomputing applications are discussed as well. The chapter serves as an introduction to the entire book and concludes with a summary of the topics of the remaining chapters.
This study demonstrates two visual methodologies to support analysts who use artificial neural networks (ANNs) in data mining operations. The first part of the paper illustrates the differences and similarities among various learning rules that ANN data miners might employ. Because different learning rules lead to different connection weights and stability coefficients, a graphical representation of the data is demonstrated that provides a novel visual means of discerning these similarities and differences. The second part demonstrates a methodology for interpreting ANN model variables that uses network connection weights. An ANN is optimized on empirical marketing data, and a response elasticity graph is built for each model variable by plotting the derivative of the network output with respect to that variable while changing the network input in equal increments across the variable's range. The paper concludes that this approach to ANN model interpretation can give data miners a rich interpretation of variable importance.
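The derivative sweep described in the second part of this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's implementation: the one-hidden-layer tanh network, the random weights standing in for a trained model, the zero baseline for the inputs held fixed, and the names mlp_output and response_curve are all hypothetical, and the derivative is estimated by a central finite difference rather than computed from the connection weights.

```python
import numpy as np


def mlp_output(x, W1, b1, W2, b2):
    """One-hidden-layer network: tanh hidden units, linear scalar output."""
    hidden = np.tanh(W1 @ x + b1)
    return float(W2 @ hidden + b2)


def response_curve(var_index, baseline, lo, hi, steps, params, eps=1e-5):
    """Sweep one input over [lo, hi] in equal increments, holding the other
    inputs at `baseline`, and estimate d(output)/d(input) at each grid point
    with a central finite difference."""
    grid = np.linspace(lo, hi, steps)
    derivs = np.empty(steps)
    for i, value in enumerate(grid):
        x_plus = baseline.copy()
        x_minus = baseline.copy()
        x_plus[var_index] = value + eps
        x_minus[var_index] = value - eps
        derivs[i] = (mlp_output(x_plus, *params)
                     - mlp_output(x_minus, *params)) / (2 * eps)
    return grid, derivs


# Hypothetical stand-in for a trained network: 3 inputs, 4 hidden units.
rng = np.random.default_rng(0)
params = (rng.normal(size=(4, 3)), rng.normal(size=4),
          rng.normal(size=4), 0.0)
baseline = np.zeros(3)  # assumed reference point, e.g. standardized means

for j in range(3):
    grid, derivs = response_curve(j, baseline, -2.0, 2.0, 9, params)
    print(f"input {j}:", np.round(derivs, 3))
```

Plotting derivs against grid for each input would give one response graph per variable. The sketch reports the raw derivative; an elasticity in the strict sense would additionally scale it by the ratio of input to output.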