2019
DOI: 10.1007/978-3-030-11479-4_13
Why Does Layer-by-Layer Pre-training Improve Deep Neural Networks Learning?

Cited by 2 publications (3 citation statements) | References 27 publications
“…Specifically, by adding a sigmoid layer on top of a DBN and reusing the generatively trained weights as the initial weights, we can discriminatively train the underlying MLP (Bengio, 2007) via conventional back-propagation-based techniques to converge to a more accurate local optimum. Pre-training differentiates itself from the SSL techniques by finding a proper initial point within the complex search space in an informed way, without modifying the objective function (Erhan, 2010).…”
Section: Deep Belief Network (mentioning)
confidence: 99%
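To illustrate the procedure described in the statement above, here is a minimal sketch (not the cited authors' code) of reusing generatively trained DBN weights as the initial weights of an MLP and then fine-tuning it discriminatively with back-propagation. The `pretrained` list, its (hidden, visible) weight convention, the hyperparameters, and the choice of a softmax/cross-entropy output head (in place of the single sigmoid output unit the quote mentions) are all assumptions for illustration; PyTorch is used only as a convenient vehicle.

import torch
import torch.nn as nn

def build_finetune_mlp(pretrained, n_classes):
    """pretrained: list of (W, b) pairs from greedy DBN training, with W of
    shape (hidden, visible) and b of shape (hidden,) -- an assumed convention."""
    layers = []
    for W, b in pretrained:
        lin = nn.Linear(W.shape[1], W.shape[0])
        with torch.no_grad():
            # reuse the generatively trained weights as the initial point
            lin.weight.copy_(torch.as_tensor(W, dtype=torch.float32))
            lin.bias.copy_(torch.as_tensor(b, dtype=torch.float32))
        layers += [lin, nn.Sigmoid()]
    # new, randomly initialized output layer added on top of the DBN stack
    layers.append(nn.Linear(pretrained[-1][0].shape[0], n_classes))
    return nn.Sequential(*layers)

def finetune(model, loader, epochs=5, lr=0.1):
    """Conventional back-propagation-based discriminative training."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()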
“…Pre-training methods are used to find initial values for the network weights and to free the learning process from poor local minima, a fundamental obstacle during training. These methods seek an appropriate starting point for the network weights and, in addition to facilitating the training process, also improve the generalizability of the network [69]. In 2006, Hinton proposed the Restricted Boltzmann Machine (RBM) method for pre-training multilayer neural networks for nonlinear dimensionality reduction [51].…”
Section: Pre-training (mentioning)
confidence: 99%
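The RBM pre-training mentioned in the statement above can be sketched with one-step contrastive divergence (CD-1). This is a generic illustration in the spirit of Hinton's 2006 procedure; the function name, learning rate, epoch count, and initialization scale are assumptions chosen here, not values from the cited works.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=10, lr=0.05, seed=0):
    """data: (n_samples, n_visible) array with values in [0, 1]."""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)   # visible biases
    b_h = np.zeros(n_hidden)    # hidden biases
    for _ in range(epochs):
        for v0 in data:
            # positive phase: hidden activations driven by the data vector
            p_h0 = sigmoid(v0 @ W + b_h)
            h0 = (rng.random(n_hidden) < p_h0).astype(float)
            # negative phase: one reconstruction step (CD-1)
            p_v1 = sigmoid(h0 @ W.T + b_v)
            p_h1 = sigmoid(p_v1 @ W + b_h)
            # approximate log-likelihood gradient and update the parameters
            W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
            b_v += lr * (v0 - p_v1)
            b_h += lr * (p_h0 - p_h1)
    return W, b_v, b_h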
“…In 2015, Seyyed Salehi et al introduced the layer-by-layer pre-training method for pretraining Autoencoder Deep Bottleneck Networks to extract the principal components [50]. However, we used a bidirectional version of this method to pre-train DNNs [69]. This method is used to converge fully connected networks with neurons with sigmoid and sigmoid tangent nonlinearity.…”
Section: Pre-training (mentioning)
confidence: 99%
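Because the bidirectional procedure cited above is only described at a high level here, the following is a generic sketch of greedy layer-by-layer autoencoder pre-training for a bottleneck stack, not the specific bidirectional method of [50]/[69]; the layer sizes, tanh nonlinearity, and training settings are illustrative assumptions.

import torch
import torch.nn as nn

def pretrain_stack(data, layer_sizes, epochs=20, lr=1e-2):
    """Greedily trains one single-hidden-layer autoencoder per layer and
    returns the trained encoders; data is a (N, layer_sizes[0]) tensor."""
    encoders, x = [], data
    for d_in, d_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        enc = nn.Sequential(nn.Linear(d_in, d_out), nn.Tanh())
        dec = nn.Linear(d_out, d_in)   # temporary decoder, discarded after this layer
        opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=lr)
        for _ in range(epochs):
            opt.zero_grad()
            loss = nn.functional.mse_loss(dec(enc(x)), x)
            loss.backward()
            opt.step()
        encoders.append(enc)
        x = enc(x).detach()            # the codes become the next layer's training data
    return encoders

# example with stand-in random data and an illustrative 784-256-64-16 bottleneck
encoders = pretrain_stack(torch.rand(128, 784), [784, 256, 64, 16])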