2018 IEEE International Conference on Big Data (Big Data)
DOI: 10.1109/bigdata.2018.8622396

Predicting the Computational Cost of Deep Learning Models

Abstract: Deep learning is rapidly becoming a go-to tool for many artificial intelligence problems due to its ability to outperform other approaches, and even humans, at many problems. Despite its popularity, we are still unable to accurately predict the time it will take to train a deep learning network to solve a given problem. This training time can be seen as the product of the training time per epoch and the number of epochs which need to be performed to reach the desired level of accuracy. Some work has been carried …
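The decomposition in the abstract (total training time = time per epoch × number of epochs to reach the target accuracy) can be made concrete. Below is a minimal Python sketch, assuming a hypothetical `train_one_epoch` callable and a predicted epoch budget; it illustrates the decomposition only, not the paper's actual prediction method.

```python
import time

def estimate_training_time(train_one_epoch, epochs_to_target):
    """Estimate total training time as (time per epoch) x (epochs needed).

    train_one_epoch: hypothetical callable that runs one training epoch.
    epochs_to_target: predicted number of epochs to reach the desired accuracy.
    """
    start = time.perf_counter()
    train_one_epoch()                    # time one representative epoch
    per_epoch = time.perf_counter() - start
    return per_epoch * epochs_to_target  # predicted total time in seconds
```

In practice one would average over several epochs, since the first epoch often includes one-off costs such as data caching and kernel compilation.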

Cited by 198 publications (130 citation statements); citing works were published between 2019 and 2024. References 12 publications.
“… The learning capacity of a CNN is generally enhanced by increasing the size of the network, and this can be done in reasonable time with the help of current advanced hardware technology such as the Nvidia DGX-2 supercomputer. However, the training of deep, high-capacity architectures still imposes a significant overhead on memory usage and computational resources [244]–[246]. Consequently, we still require substantial improvements in hardware technology to accelerate research in CNNs.…”
Section: Future Directions
confidence: 99%
“…Assessing the computational and time complexity of training DL models is challenging for several reasons, such as the variety of existing model architectures, training strategies, and hardware environments. However, in order to better understand which methods can reduce these costs, recent research has sought ways to predict them [117]. In contrast to monolithic models, it is fairly simple to see that reusing modules means some parts do not have to be trained again, which results in fewer computations and reduced training time.…”
Section: Discussion
confidence: 99%
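The module-reuse argument in the excerpt above can be illustrated by freezing the reused parts of a network so they are excluded from training. A minimal PyTorch sketch, assuming a hypothetical two-part model (a reused module plus a new head); this is not code from the cited work.

```python
import torch.nn as nn

# Hypothetical model: a reused (already trained) module plus a new head.
reused = nn.Sequential(nn.Linear(784, 256), nn.ReLU())
head = nn.Linear(256, 10)
model = nn.Sequential(reused, head)

# Freeze the reused module so its parameters are not trained again.
for p in reused.parameters():
    p.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable} of {total} parameters")  # only the head trains
```

Only gradients for the head are computed and updated, which is where the reduction in computation and training time comes from.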
“…It is evident from past research [75] that in any deep learning architecture, as the number of hidden layers increases, so does the number of hyperparameters, making the model more complex and requiring much more computational power and execution time to train. Compared to this, the proposed model is much more efficient in terms of both computational power and time.…”
Section: Discussion
confidence: 99%
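The scaling claim in the excerpt above (deeper networks are more expensive to train) can be checked by counting parameters; note the excerpt says hyperparameters, but it is the growth in trainable parameters that drives the extra computation. A minimal sketch for a hypothetical fully connected network of uniform width:

```python
def mlp_param_count(input_dim, hidden_dim, output_dim, n_hidden):
    """Count weights and biases of a fully connected network with
    n_hidden hidden layers of uniform width (hypothetical example)."""
    dims = [input_dim] + [hidden_dim] * n_hidden + [output_dim]
    return sum(d_in * d_out + d_out for d_in, d_out in zip(dims, dims[1:]))

# At fixed width, the parameter count grows roughly linearly with depth:
for depth in (1, 2, 4, 8):
    print(depth, mlp_param_count(784, 256, 10, depth))
```

Each added hidden layer contributes another hidden_dim × hidden_dim weight matrix, so both memory use and per-epoch compute grow with depth.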