We consider the problem of learning distributed representations for documents in data streams. The documents are represented as low-dimensional vectors and are jointly learned with distributed vector representations of word tokens using a hierarchical framework with two embedded neural language models. In particular, we exploit the context of documents in streams and use one of the language models to model the document sequences, and the other to model word sequences within them. The models learn continuous vector representations for both word tokens and documents such that semantically similar documents and words are close in a common vector space. We discuss extensions to our model, which can be applied to personalized recommendation and social relationship mining by adding further user layers to the hierarchy, thus learning user-specific vectors to represent individual preferences. We validated the learned representations on a public movie rating data set from MovieLens, as well as on a large-scale Yahoo News data comprising three months of user activity logs collected on Yahoo servers. The results indicate that the proposed model can learn useful representations of both documents and word tokens, outperforming the current state-of-the-art by a large margin.
Modeling trajectory data is a building block for many smart-mobility initiatives. Existing approaches apply shallow models such as Markov chain and inverse reinforcement learning to model trajectories, which cannot capture the long-term dependencies. On the other hand, deep models such as Recurrent Neural Network (RNN) have demonstrated their strength of modeling variable length sequences. However, directly adopting RNN to model trajectories is not appropriate because of the unique topological constraints faced by trajectories. Motivated by these findings, we design two RNNbased models which can make full advantage of the strength of RNN to capture variable length sequence and meanwhile to address the constraints of topological structure on trajectory modeling. Our experimental study based on real taxi trajectory datasets shows that both of our approaches largely outperform the existing approaches.
Estimating the travel time of a path is of great importance to smart urban mobility. Existing approaches are either based on estimating the time cost of each road segment which are not able to capture many cross-segment complex factors, or designed heuristically in a non-learning-based way which fail to utilize the existing abundant temporal labels of the data, i.e., the time stamp of each trajectory point. In this paper, we leverage on new development of deep neural networks and propose a novel auxiliary supervision model, namely DEEPTRAVEL, that can automatically and effectively extract different features, as well as make full use of the temporal labels of the trajectory data. We have conducted comprehensive experiments on real datasets to demonstrate the out-performance of DEEPTRAVEL over existing approaches.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.