Predicting the Computational Cost of Deep Learning Models

Justus, Daniel; Brennan, John; Bonner, Stephen; McGough, Andrew Stephen

doi:10.1109/bigdata.2018.8622396

Cited by 198 publications

(130 citation statements)

References 12 publications

Supporting

Mentioning

129

Contrasting

Order By: Relevance

“… The learning capacity of CNN is generally enhanced by increasing the size of the network and it can be done in reasonable time with the help of the current advanced hardware technology such as Nvidia DGX-2 supercomputer. However, the training of deep and high capacity architectures is still a significant overhead on memory usage and computational resources [244]- [246]. Consequently, we still require a lot of improvements in hardware technology that can accelerate research in CNNs.…”

Section: Future Directionsmentioning

confidence: 99%

A survey of the recent architectures of deep convolutional neural networks

et al. 2020

View full text Add to dashboard Cite

Deep Convolutional Neural Networks (CNNs) are a special type of Neural Networks, which have shown exemplary performance on several competitions related to Computer Vision and Image Processing. Interesting application areas of CNN include Image Classification and Segmentation, Object Detection, Video Processing, Natural Language Processing, Speech Recognition, etc. The powerful learning ability of deep CNN is largely due to the use of multiple feature extraction stages that can automatically learn representations from the data. Availability of a large amount of data and improvements in the hardware technology have accelerated the research in CNNs, and recently very interesting deep CNN architectures have been reported. In fact, several interesting ideas to bring advancements in CNNs have been explored such as the use of different activation and loss functions, parameter optimization, regularization, and architectural innovations. However, the major improvement in representational capacity of the deep CNN is achieved through architectural innovations. Especially, the idea of exploiting spatial and channel information, depth and width of architecture, and multi-path information processing has gained substantial attention. Similarly, the idea of using a block of layers as a structural unit is also gaining popularity. This survey thus focuses on the intrinsic taxonomy present in the recently reported deep CNN architectures and consequently, classifies the recent innovations in CNN architectures into seven different categories. These seven categories are based on spatial exploitation, depth, multi-path, width, feature-map exploitation, channel boosting, and attention. Additionally, the elementary understanding of CNN components, current challenges and applications of CNN are also provided. CNNs are the best among learning algorithms in understanding images content, and have shown exemplary results in segmentation, classification, detection, and retrieval related tasks [8], [9]. The success of CNNs has captured attention beyond academia. In industry, companies such as Google, Microsoft, AT&T, NEC, and Facebook have developed active research groups for exploring new architectures of CNN [10]. At present, most of the frontrunners of image processing and computer vision competitions are employing deep CNN based models.The attractive feature of CNN is its ability to exploit spatial or time correlation of the data. The topology of CNN is divided into multiple learning stages composed of a combination of the convolutional layers, non-linear processing units, and subsampling layers [11]. CNNs are feedforward multilayered hierarchical networks that are similar to fully connected neural network where each layer, using a bank of convolutional kernels, performs multiple transformations [12]. Convolution operation extracts useful features from locally correlated data points. Output of the convolutional kernels is assigned to non-linear processing unit (activation function), which not only helps in learning abstractions but also emb...

show abstract

Section: Future Directionsmentioning

confidence: 99%

A survey of the recent architectures of deep convolutional neural networks

et al. 2020

View full text Add to dashboard Cite

show abstract

“…Assessing the computational and time complexity of training DL models is a challenging task due to various reasons, such as the several existing model architectures, training strategies, and hardware environments. However, in order to better understand what methods can be used to reduce these values, recent research is carried out to find a method to predict them [117]. Although regarding monolithic models, it is fairly simple to see that as a result of the reuse of the modules, there are parts which do not have to be trained again, and this results in less computations and reduced training time.…”

Section: Discussionmentioning

confidence: 99%

Deep Learning in Robotics: Survey on Model Structures and Training Strategies

Karoly

Galambos

Kuti

et al. 2021

IEEE Trans. Syst. Man Cybern, Syst.

View full text Add to dashboard Cite

The ever-increasing complexity of robot applications induces the need for methods to approach problems with no (viable) analytical solution. Deep learning (DL) provides a set of tools to address this kind of problems. This survey presents a categorization of the major challenges in robotics that leverage DL technologies and introduces representative examples of successful solutions for the described problems. We also consider the question when and whether to use modular, monolithic models or end-to-end DL, in order to provide a guideline for the selection of the correct model structure and training strategy. By doing so, the current role and adaptability of different techniques at different hierarchical levels of a robot-application can be highlighted, thus providing a well-structured basis to assist future approaches. Index Terms-Deep learning (DL), machine learning (ML), manipulators, mobile robots, neural networks, robot control, robot learning. I. INTRODUCTION C OMPUTERS can easily solve formal problems that are demanding for humans. However, the increasing need for adaptive systems requires the solution of tasks that are hard to formulate, but can be easily solved by humans, such as the recognition and manipulation of objects. In order to perform such tasks, a certain complex knowledge of the environment is inevitable. The automatic extraction of the required knowledge is called machine learning (ML). The way the data is presented to the ML system, heavily influences how well the extracted knowledge represents the given problem. The ML approaches that also perform feature extraction, using multiple hierarchical artificial neural network layers, are referred to as deep learning (DL) [1], [2]. The theoretical background of DL had been introduced for a long time, when it finally gained widespread popularity, i.a., thanks to the winner entry of the ImageNet Challenge Manuscript

show abstract

“…It is evident from the past research work [75] that in any deep learning architecture, as the number of hidden layers increases, so does the number of hyperparameters, thus making the model very complex and requiring much more computational power and execution time to train the model. Compared to this, the proposed model is much more efficient in terms of both computational power and time.…”

Section: Discussionmentioning

confidence: 99%

Hybrid Feature Selection Method Based on Harmony Search and Naked Mole-Rat Algorithms for Spoken Language Identification From Audio Signals

et al. 2020

View full text Add to dashboard Cite

This era is dominated by artificial intelligence and its various applications-one of which is Spoken Language Identification (S-LID) which has always been a challenging issue and an important research area in the domain of speech signal processing. This paper deals with SLID to be used for Human-Computer Interaction (HCI) based applications by attempting to classify various languages from three multilingual databases namely CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages, VoxForge and Indian Institute of Technology, Madras (IIT-Madras) speech corpus database by extracting their Mel-Spectrogram features and Relative Spectral Transform-Perceptual Linear Prediction (RASTA-PLP) features. A new hybrid Feature Selection (FS) algorithm have been developed using the versatile Harmony Search (HS) algorithm and a new nature-inspired algorithm called Naked Mole-Rat (NMR) algorithm to select the best subset of features and reduce the model complexity to help it train faster. This selected feature set is fed to five classifiers namely Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Multi-layer Perceptron (MLP), Naïve Bayes (NB) and Random Forest (RF). The evaluation measures used in this paper are precision, recall, f1-score, classification accuracy and number of selected features. An accuracy of 99.89% on CSS10, 98.22% on VoxForge and 99.75% on IIT-Madras speech corpus databases is achieved using RF. Furthermore, the proposed algorithm is found to outperform 15 standard meta-heuristic FS algorithms.

show abstract

Predicting the Computational Cost of Deep Learning Models

Cited by 198 publications

References 12 publications

A survey of the recent architectures of deep convolutional neural networks

A survey of the recent architectures of deep convolutional neural networks

Deep Learning in Robotics: Survey on Model Structures and Training Strategies

Hybrid Feature Selection Method Based on Harmony Search and Naked Mole-Rat Algorithms for Spoken Language Identification From Audio Signals

Contact Info

Product

Resources

About