Collaborative filtering recommender systems traditionally recommend products to users based solely on the given user-item rating matrix. Two issues, data sparsity and scalability, have long been concerns. In our previous work, we proposed an approach that addresses the scalability issue by clustering the products using the content of the user-item rating matrix. However, that approach still suffers from both issues. In this paper, we improve the approach by employing user comments to address data sparsity and scalability. Word2Vec is applied to the comments that users made on previously purchased goods to produce one item vector for each product. User vectors for all customers are then derived through the user-item rating matrix. By clustering, products and users are partitioned into item groups and user groups, respectively, and recommendations to a user are made based on these groups. Experimental results show that both the inaccuracy caused by a sparse user-item rating matrix and the inefficiency caused by an enormous amount of data can be substantially alleviated.
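The grouping steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the item vectors are hand-written stand-ins for Word2Vec output, the user vector is assumed to be a rating-weighted average of item vectors (the abstract does not state the exact formula), and plain k-means is assumed as the clustering method. All names and data below are hypothetical.

```python
import random

def user_vectors(ratings, item_vecs):
    """Assumed construction: rating-weighted average of the rated items' vectors."""
    dim = len(next(iter(item_vecs.values())))
    result = {}
    for user, rated in ratings.items():
        total = sum(rated.values())  # assumes positive ratings
        vec = [0.0] * dim
        for item, rating in rated.items():
            for d in range(dim):
                vec[d] += rating * item_vecs[item][d]
        result[user] = [v / total for v in vec]
    return result

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (an assumed choice); returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centers[c])))
            for pt in points
        ]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical item vectors standing in for Word2Vec output on user comments.
item_vecs = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0], "D": [0.1, 0.9]}
# Hypothetical sparse user-item ratings.
ratings = {"u1": {"A": 5, "B": 4}, "u2": {"C": 5, "D": 3}, "u3": {"A": 4, "C": 1}}

items = list(item_vecs)
item_labels = kmeans([item_vecs[i] for i in items], k=2)
uvecs = user_vectors(ratings, item_vecs)
users = list(uvecs)
user_labels = kmeans([uvecs[u] for u in users], k=2)
print(dict(zip(items, item_labels)), dict(zip(users, user_labels)))
```

Once items and users are grouped, a recommender only has to compare a user group against item groups rather than every user against every product, which is where the scalability gain claimed in the abstract would come from.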
In this paper, we tackle air quality forecasting by using deep learning approaches to predict the hourly concentration of air pollutants (e.g., ozone, particulate matter (PM2.5), and sulfur dioxide). Deep learning (DL), one of the most popular techniques, can efficiently train a scalable model on big data using optimization algorithms. The model is trained for air quality prediction on time series data. Our method uses a deep convolutional neural network (CNN) as the sequence module and feeds the time series data into the CNN model in turn for training. A CNN is composed of many functional layers, such as convolution, pooling, and ReLU. The convolution layer can effectively extract the sequential features of time series data, which work better than general features of time series data, and the pooling layer performs down-sampling. Experimental results show that the CNN performs well for air quality prediction.
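The three layer types named in this abstract can be shown on a toy series. The sketch below is only an illustration under stated assumptions: the hourly readings are invented, the kernel is a hand-picked difference filter rather than a learned one, and a real model would stack many such layers and train the weights.

```python
def conv1d(series, kernel):
    """Valid 1-D convolution (cross-correlation form, as used in CNN layers)."""
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]

def relu(values):
    """Rectified linear unit: negative activations become zero."""
    return [max(0.0, v) for v in values]

def max_pool(values, size=2):
    """Non-overlapping max pooling; down-samples by a factor of `size`."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

# Hypothetical hourly PM2.5 readings (not data from the paper).
pm25 = [12.0, 15.0, 14.0, 20.0, 30.0, 28.0, 22.0, 18.0, 17.0]
# Hand-picked filter that responds to rising concentration over two hours.
kernel = [-1.0, 0.0, 1.0]

features = max_pool(relu(conv1d(pm25, kernel)))
print(features)  # [5.0, 16.0, 0.0]
```

The convolution responds to local temporal patterns (here, a rise between hours), ReLU discards the negative responses, and pooling halves the sequence length while keeping the strongest response in each window.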