In the article, emphasis is put on the modern artificial neural network structure, which in the literature is known as a deep neural network. Network includes more than one hidden layer and comprises many standard modules with ReLu nonlinear activation function. A learning algorithm includes two standard steps, forward and backward, and its effectiveness depends on the way the learning error is transported back through all the layers to the first layer. Taking into account all the dimensionalities of matrixes and the nonlinear characteristics of ReLu activation function, the problem is very difficult from a theoretic point of view. To implement simple assumptions in the analysis, formal formulas are used to describe relations between the structure of every layer and the internal input vector. In practice tasks, neural networks' internal layer matrixes with ReLu activations function, include a lot of null value of weight coefficients. This phenomenon has a negatives impact on the effectiveness of the learning algorithm convergences. A theoretical analysis could help to build more effective algorithms.
The use of machine learning has increased over the years, especially in the world of molecular data. Generally, the inference of relationships between features is determined by statistical models. The phenotype (observable clinical characteristics) can result from the expression of the genotype (genetic code) or environmental factors. Molecular datasets have limited information, while supporting clinical data is ambiguous. There are no well-established approaches for combining clinical information with genomic repositories. The genomic tests that are available only use molecular data and give physicians a result which can be integrated clinically. In this paper, we present the strategy where clinical data, regardless of its limitations, is combined in one predictive model with molecular features. We predict the risk of malignancy in the thyroid nodules based on the results of fine-needle aspiration biopsy and expression of selected genes. We utilize a Bayesian network (BN) framework to discover relationships between molecular features and assess the impact of added clinical data quality on the performance of the chosen gene set. Bayesian network offering both prognostic and diagnostic perspectives is a perfect non-parametric technique for feature selection, feature extraction, and prediction purposes. We show that certain clinical factors could work as a synthetic feature and provide predictive abilities beyond what genes alone can offer. The experimental results demonstrate a higher performance of predictive models based on molecular and clinical data than when using only molecular data. We also explain why, one should consider the source of clinical data, but be aware of the quality of variables.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.