Traffic forecasting is a vital part of intelligent transportation systems. It is particularly challenging because of both short-term (e.g., accidents, construction) and long-term (e.g., peak-hour, seasonal, weather) traffic patterns. While most previously proposed techniques focus on normal-condition forecasting, no single framework exists for extreme-condition traffic forecasting. To address this need, we take a deep learning approach and build a deep neural network based on long short-term memory (LSTM) units. We apply Deep LSTM to forecast peak-hour traffic and identify unique characteristics of the traffic data. We further improve the model for post-accident forecasting with a Mixture Deep LSTM model, which jointly models normal-condition traffic and the pattern of accidents. We evaluate our model on a large-scale real-world traffic dataset from Los Angeles. When trained end-to-end with suitable regularization, our approach achieves a 30%-50% improvement over baselines. We also demonstrate a novel technique for interpreting the model with signal stimulation, and we note several interesting observations from the trained neural network.