2016 7th International Conference on Cloud Computing and Big Data (CCBD)
DOI: 10.1109/ccbd.2016.029

Benchmarking State-of-the-Art Deep Learning Software Tools

Abstract: Deep learning has been shown as a successful machine learning method for a variety of tasks, and its popularity results in numerous open-source deep learning software tools. Training a deep network is usually a very time-consuming process. To address the computational challenge in deep learning, many tools exploit hardware features such as multi-core CPUs and many-core GPUs to shorten the training time. However, different tools exhibit different features and running performance when training different types of…
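The abstract's central measurement — per-iteration training time across tools and hardware — can be illustrated with a minimal, framework-agnostic timing harness. This is a sketch, not the paper's benchmark; the `benchmark` and `toy_step` names and the workload size are illustrative assumptions:

```python
import time

def benchmark(step_fn, warmup=2, iters=10):
    """Time a training-step callable: run warm-up iterations first
    (to exclude one-off setup cost), then average the timed runs."""
    for _ in range(warmup):
        step_fn()
    start = time.perf_counter()
    for _ in range(iters):
        step_fn()
    return (time.perf_counter() - start) / iters

# Illustrative stand-in workload: a dense layer's forward pass in
# pure Python (a real benchmark would call a framework's train step).
def toy_step(n=64):
    w = [[0.01] * n for _ in range(n)]
    x = [1.0] * n
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

avg = benchmark(toy_step)
print(f"avg step time: {avg * 1e3:.3f} ms")
```

Comparing tools then reduces to running the same `step_fn` (same network, same mini-batch size) under each framework and hardware configuration.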

Cited by 254 publications (166 citation statements)
References 23 publications (40 reference statements)
“…For readers with interest in deep learning software tools, in [29] we can find an interesting experimental analysis on CPU and GPU platforms for some well-known GPU-accelerated tools, including Caffe (developed by the Berkeley Vision and Learning Center), CNTK (developed by Microsoft Research), TensorFlow (developed by Google), and Torch.…”
Section: Deep Learning Software
confidence: 99%
“…The selection of the Caffe framework and the training scheme follows the common acknowledgment in the deep learning community [36,49]. Although there are plenty of deep learning frameworks to choose from (such as TensorFlow, Torch), the accuracy-wise performance has only very limited variation [49]. The choice of training scheme has also undergone extensive investigation [50] and we follow [36] because of the similar network architecture.…”
Section: Methods
confidence: 99%
“…Workers frequently fetch the up-to-date model from the PS, make computations over the data they host, and then return gradient updates to the PS. Since DNN models are large (from thousands to billions of parameters [9]), placing those worker tasks on edge devices implies significant update transfers over the Internet. The PS being in a central location (typically at a cloud provider), the question of inbound traffic is also crucial for pricing our proposal.…”
Section: Introduction
confidence: 99%
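The parameter-server (PS) pattern described above can be sketched in a few lines: a central PS holds the model, each worker computes a gradient on its own data shard and sends it back. The class and function names and the least-squares toy objective are illustrative assumptions, not the cited system:

```python
def worker_gradient(weights, shard):
    """Gradient of mean squared error 0.5*(w.x - y)^2 over one data shard."""
    n = len(weights)
    grad = [0.0] * n
    for x, y in shard:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i in range(n):
            grad[i] += err * x[i] / len(shard)
    return grad

class ParameterServer:
    """Central store: workers pull weights and push gradient updates."""
    def __init__(self, n, lr=0.1):
        self.weights = [0.0] * n
        self.lr = lr

    def apply(self, grad):
        # SGD step on a gradient received from a worker
        for i, g in enumerate(grad):
            self.weights[i] -= self.lr * g

# Two workers, each hosting one shard; the data fits target w = (1, 2).
shards = [[([1.0, 0.0], 1.0)], [([0.0, 1.0], 2.0)]]
ps = ParameterServer(n=2)
for _ in range(100):                                # synchronous rounds
    for shard in shards:
        grad = worker_gradient(ps.weights, shard)   # worker pulls model
        ps.apply(grad)                              # worker pushes gradient
print([round(w, 2) for w in ps.weights])            # prints [1.0, 2.0]
```

The traffic concern in the excerpt is visible here: every round moves the full `weights` vector to each worker and a same-sized `grad` back, so cost scales with model size times the number of workers per round.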