Zhanxing Zhu scite author profile

Timely accurate traffic forecast is crucial for urban traffic control and guidance. Due to the high nonlinearity and complexity of traffic flow, traditional methods cannot satisfy the requirements of mid-and-long term prediction tasks and often neglect spatial and temporal dependencies. In this paper, we propose a novel deep learning framework, Spatio-Temporal Graph Convolutional Networks (STGCN), to tackle the time series prediction problem in traffic domain. Instead of applying regular convolutional and recurrent units, we formulate the problem on graphs and build the model with complete convolutional structures, which enable much faster training speed with fewer parameters. Experiments show that our model STGCN effectively captures comprehensive spatio-temporal correlations through modeling multi-scale traffic networks and consistently outperforms state-of-the-art baselines on various real-world traffic datasets.

show abstract

Multi-Stage Self-Supervised Learning for Graph Convolutional Networks on Graphs with Few Labeled Nodes

Sun

Lin

Zhu

2020

AAAI

186

103

View full text Add to dashboard Cite

Graph Convolutional Networks (GCNs) play a crucial role in graph learning tasks, however, learning graph embedding with few supervised signals is still a difficult problem. In this paper, we propose a novel training algorithm for Graph Convolutional Network, called Multi-Stage Self-Supervised (M3S) Training Algorithm, combined with self-supervised learning approach, focusing on improving the generalization performance of GCNs on graphs with few labeled nodes. Firstly, a Multi-Stage Training Framework is provided as the basis of M3S training method. Then we leverage DeepCluster technique, a popular form of self-supervised learning, and design corresponding aligning mechanism on the embedding space to refine the Multi-Stage Training Framework, resulting in M3S Training Algorithm. Finally, extensive experimental results verify the superior performance of our algorithm on graphs with few labeled nodes under different label rates compared with other state-of-the-art approaches.

show abstract

Spatial-Temporal Fusion Graph Neural Networks for Traffic Flow Forecasting

Zhu

2021

AAAI

310

View full text Add to dashboard Cite

Spatial-temporal data forecasting of traffic flow is a challenging task because of complicated spatial dependencies and dynamical trends of temporal pattern between different roads. Existing frameworks usually utilize given spatial adjacency graph and sophisticated mechanisms for modeling spatial and temporal correlations. However, limited representations of given spatial graph structure with incomplete adjacent connections may restrict effective spatial-temporal dependencies learning of those models. Furthermore, existing methods were out at elbows when solving complicated spatial-temporal data: they usually utilize separate modules for spatial and temporal correlations, or they only use independent components capturing localized or global heterogeneous dependencies. To overcome those limitations, our paper proposes a novel Spatial-Temporal Fusion Graph Neural Networks (STFGNN) for traffic flow forecasting. First, a data-driven method of generating “temporal graph” is proposed to compensate several genuine correlations that spatial graph may not reflect. STFGNN could effectively learn hidden spatial-temporal dependencies by a novel fusion operation of various spatial and temporal graphs, treated for different time periods in parallel. Meanwhile, by integrating this fusion graph module and a novel gated convolution module into a unified layer parallelly, STFGNN could handle long sequences by learning more spatial-temporal dependencies with layers stacked. Experimental results on several public traffic datasets demonstrate that our method achieves state-of-the-art performance consistently than other baselines.

show abstract

Spatio-Temporal Manifold Learning for Human Motions via Long-Horizon Modeling

Wang

Shum

et al. 2021

IEEE Trans. Visual. Comput. Graphics

View full text Add to dashboard Cite

Data-driven modeling of human motions is ubiquitous in computer graphics and computer vision applications, such as synthesizing realistic motions or recognizing actions. Recent research has shown that such problems can be approached by learning a natural motion manifold using deep learning on a large amount data, to address the shortcomings of traditional data-driven approaches. However, previous deep learning methods can be sub-optimal for two reasons. First, the skeletal information has not been fully utilized for feature extraction. Unlike images, it is difficult to define spatial proximity in skeletal motions in the way that deep networks can be applied for feature extraction. Second, motion is time-series data with strong multi-modal temporal correlations between frames. On the one hand, a frame could be followed by several candidate frames leading to different motions; on the other hand, long-range dependencies exist where a number of frames in the beginning correlate to a number of frames later. Ineffective temporal modeling would either under-estimate the multi-modality and variance, resulting in featureless mean motion or over-estimate them resulting in jittery motions, which is a major source of visual artifacts. In this paper, we propose a new deep network to tackle these challenges by creating a natural motion manifold that is versatile for many applications. The network has a new spatial component for feature extraction. It is also equipped with a new batch prediction model that predicts a large number of frames at once, such that long-term temporally-based objective functions can be employed to correctly learn the motion multi-modality and variances. With our system, long-duration motions can be predicted/synthesized using an open-loop setup where the motion retains the dynamics accurately. It can also be used for denoising corrupted motions and synthesizing new motions with given control signals. We demonstrate that our system can create superior results comparing to existing work in multiple applications.

show abstract

Learning with Noise: Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix

Luo¹,

Feng²,

Wang³

et al. 2017

View full text Add to dashboard Cite

Distant supervision significantly reduces human efforts in building training data for many classification tasks. While promising, this technique often introduces noise to the generated training data, which can severely affect the model performance. In this paper, we take a deep look at the application of distant supervision in relation extraction. We show that the dynamic transition matrix can effectively characterize the noise in the training data built by distant supervision. The transition matrix can be effectively trained using a novel curriculum learning based method without any direct supervision about the noise. We thoroughly evaluate our approach under a wide range of extraction scenarios. Experimental results show that our approach consistently improves the extraction results and outperforms the state-of-the-art in various evaluation scenarios.

show abstract

The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects

Zhu¹,

Wu²,

Yu³

et al. 2018

Preprint

View full text Add to dashboard Cite

Understanding the behavior of stochastic gradient descent (SGD) in the context of deep neural networks has raised lots of concerns recently. Along this line, we study a general form of gradient based optimization dynamics with unbiased noise, which unifies SGD and standard Langevin dynamics. Through investigating this general optimization dynamics, we analyze the behavior of SGD on escaping from minima and its regularization effects. A novel indicator is derived to characterize the efficiency of escaping from minima through measuring the alignment of noise covariance and the curvature of loss function. Based on this indicator, two conditions are established to show which type of noise structure is superior to isotropic noise in term of escaping efficiency. We further show that the anisotropic noise in SGD satisfies the two conditions, and thus helps to escape from sharp and poor minima effectively, towards more stable and flat minima that typically generalize well. We systematically design various experiments to verify the benefits of the anisotropic noise, compared with full gradient descent plus isotropic diffusion (i.e. Langevin dynamics).

show abstract

You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle

Zhang¹,

Zhang²,

Lu³

et al. 2019

Preprint

View full text Add to dashboard Cite

Deep learning achieves state-of-the-art results in many tasks in computer vision and natural language processing. However, recent works have shown that deep networks can be vulnerable to adversarial perturbations, which raised a serious robustness issue of deep networks. Adversarial training, typically formulated as a robust optimization problem, is an effective way of improving the robustness of deep networks. A major drawback of existing adversarial training algorithms is the computational overhead of the generation of adversarial examples, typically far greater than that of the network training. This leads to the unbearable overall computational cost of adversarial training. In this paper, we show that adversarial training can be cast as a discrete time differential game. Through analyzing the Pontryagin's Maximum Principle (PMP) of the problem, we observe that the adversary update is only coupled with the parameters of the first layer of the network. This inspires us to restrict most of the forward and back propagation within the first layer of the network during adversary updates. This effectively reduces the total number of full forward and backward propagation to only one for each group of adversary updates. Therefore, we refer to this algorithm YOPO (You Only Propagate Once). Numerical experiments demonstrate that YOPO can achieve comparable defense accuracy with approximately 1/5 ∼ 1/4 GPU time of the projected gradient descent (PGD) algorithm [15] . 2 * Equal Contribution 2 Our codes are available at https://github.com/a1600012888/YOPO-You-Only-Propagate-Once Preprint. Under review.

show abstract

Numerical study of dc arc plasma and molten bath in dc electric arc furnace

Wang¹,

Jin²,

Zhu³

2006

Ironmaking & Steelmaking

View full text Add to dashboard Cite

A numerical study of the arc plasma and molten bath in a dc electric arc furnace (EAF) is useful for understanding and improving the production process of the dc EAF. In the present paper, a mathematical model based on conservation equations of mass, momentum and energy along with Maxwell's equations is developed to describe the flow field and heat transfer in the arc and molten bath regions of a dc EAF simultaneously. The governing equations are solved using the PHOENICS software package. The calculated results show good agreement with those of previous studies, giving a useful insight into the factors of arc heat transfer and bath circulation.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Zhanxing Zhu

Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting

Multi-Stage Self-Supervised Learning for Graph Convolutional Networks on Graphs with Few Labeled Nodes

Spatial-Temporal Fusion Graph Neural Networks for Traffic Flow Forecasting

Spatio-Temporal Manifold Learning for Human Motions via Long-Horizon Modeling

Learning with Noise: Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix

The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects

You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle

Numerical study of dc arc plasma and molten bath in dc electric arc furnace

Contact Info

Product

Resources

About