This is a repository copy of Repeatable determinism using non-random weight initialisations in smart city applications of deep learning.
This is a repository copy of Non-random weight initialisation in deep learning networks for repeatable determinism.
This research demonstrates a method of discriminating the numerical relationships between a neural network layer's inputs and its outputs, as established by the learnt weights and biases of the network's generalisation model. It is demonstrated with a mathematical form of a neural network rather than an image, speech or text translation application, as this provides clarity in the understanding gained from the generalisation model. The method is reliant on the input format, but that format is not unlike an image pixel input format, and as such the research is applicable to other applications too. The results show that weights and biases can be used to discriminate the mathematical relationships between inputs, and to discriminate which mathematical operators are used between them in the learnt generalisation model. This may be a step towards gaining definitions and understanding for intractable problems that a neural network has generalised in a solution, either for validating those solutions or as a mechanism for creating a model that is an alternative to traditional approaches but is derived from a neural network used as a development tool for solving those problems. The demonstrated method was optimised using the learning rate and the number of nodes, and in this example achieves a low loss of 7.6e-6 and a low Mean Absolute Error of 1e-3, with a high accuracy score of 1.0. During the experiments, however, a sensitivity to the number of epochs and to the use of the random shuffle was discovered. A comparison with an alternative shuffle using a non-random reordering demonstrated a lower but comparable performance; this is a subject for further research but is demonstrated in this "decomposition" class architecture.
This paper presents a non-random weight initialisation scheme for convolutional neural network layers. It builds upon previous work that was limited to perceptron layers, in which repeatable determinism was achieved with equal categorisation accuracy between the established random scheme and a linear ramp non-random scheme. This work, however, addresses convolutional layers, which are the layers that have been responsible for better than human performance in image recognition. The previous perceptron work found that the number range was more important than the gradient; however, that was due to the fully connected nature of dense layers. In convolutional layers, by contrast, there is an order direction implied, and the weights relate to filters rather than image pixel positions, so the weight initialisation is more complex. Nevertheless, the paper demonstrates better performance than the currently established random schemes with convolutional layers. The proposed method also induces earlier learning through the use of striped forms, and as such requires less unlearning of the traditionally speckled random forms. The proposed scheme also provides higher accuracy in a single learning session, with improvements of 3.35% unshuffled and 2.813% shuffled in the first epoch, and 0.521% over the 5 epochs of the model. Of these, the first epoch is the more relevant, as it is the epoch after initialisation. The proposed method is also repeatable and deterministic, which is a desirable quality for safety critical applications within image classification. It is also robust to He initialisation values, scoring 97.55% accuracy compared to 96.929% accuracy with Glorot/Xavier in the traditional random forms, with which the benchmark model was originally optimised.
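As a rough illustration of what a deterministic, striped initialisation might look like, the sketch below fills each convolutional filter with a linear ramp bounded by the standard Glorot/Xavier uniform limit. It is not the paper's scheme: the function names, the (height, width, in_channels, out_channels) kernel layout, and the choice to alternate stripe orientation between filters are all assumptions made for the example.

```python
import numpy as np


def glorot_limit(kernel_shape):
    """Glorot/Xavier uniform bound: sqrt(6 / (fan_in + fan_out))."""
    kh, kw, c_in, c_out = kernel_shape
    fan_in = kh * kw * c_in
    fan_out = kh * kw * c_out
    return np.sqrt(6.0 / (fan_in + fan_out))


def striped_ramp_kernel(kernel_shape):
    """Build a deterministic kernel whose filters are linear-ramp stripes.

    Hypothetical layout: (height, width, in_channels, out_channels).
    Even-numbered filters ramp across columns (vertical stripes) and
    odd-numbered filters ramp across rows (horizontal stripes). This
    alternation is an illustrative assumption, not the published method.
    """
    kh, kw, c_in, c_out = kernel_shape
    limit = glorot_limit(kernel_shape)
    col_ramp = np.linspace(-limit, limit, kw)   # varies along the width
    row_ramp = np.linspace(-limit, limit, kh)   # varies along the height

    kernel = np.empty(kernel_shape, dtype=np.float64)
    for f in range(c_out):
        if f % 2 == 0:
            plane = np.tile(col_ramp, (kh, 1))            # shape (kh, kw)
        else:
            plane = np.tile(row_ramp[:, None], (1, kw))   # shape (kh, kw)
        # Broadcast the same 2-D stripe pattern over every input channel.
        kernel[:, :, :, f] = plane[:, :, None]
    return kernel


# No random number generator is involved, so repeated calls agree exactly.
weights = striped_ramp_kernel((3, 3, 1, 8))
assert np.array_equal(weights, striped_ramp_kernel((3, 3, 1, 8)))
```

Because the values depend only on the kernel shape, two training runs that start from this kernel begin from identical weights, which is the repeatability property the abstract describes. Swapping the Glorot bound for a He-style bound would only change the `limit` value.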
This paper describes a machine assistance approach to grading decisions for values that might be missing or need validation. It uses a mathematical algebraic form of an Expert System, instead of the traditional textual or logic forms, and builds a neural network computational graph structure. This Expert System approach is also structured into a neural network like format of input, hidden and output layers, which provides a structured approach to the knowledge-base organization and a useful abstraction for reuse in data migration applications in big data, cyber and relational databases. The approach is further enhanced with a Bayesian probability tree approach that grades the confidences of value probabilities, instead of the traditional grading of the rule probabilities, and estimates the most probable value in light of all evidence presented. This is groundwork for a Machine Learning (ML) Expert System approach in a form that is closer to a neural network node structure.
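The idea of grading candidate values, rather than rules, and choosing the most probable value given all the evidence can be sketched with a plain application of Bayes' rule. The example below is a simplification, not the paper's probability tree: it assumes the pieces of evidence are conditionally independent given the value, and the function name, candidate values and probabilities are all hypothetical.

```python
def posterior_over_values(priors, likelihoods):
    """Return the posterior probability of each candidate value.

    priors:      dict mapping candidate value -> prior probability.
    likelihoods: list of dicts, one per piece of evidence, each mapping
                 candidate value -> P(evidence | value).
    Assumes the evidence items are conditionally independent given the
    value (a naive Bayes simplification made for this sketch).
    """
    unnormalised = {}
    for value, prior in priors.items():
        score = prior
        for evidence in likelihoods:
            score *= evidence[value]
        unnormalised[value] = score

    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("evidence rules out every candidate value")
    return {value: score / total for value, score in unnormalised.items()}


# Hypothetical candidates for a missing field, with two pieces of evidence.
priors = {"A": 0.5, "B": 0.3, "C": 0.2}
evidence = [
    {"A": 0.2, "B": 0.7, "C": 0.1},
    {"A": 0.5, "B": 0.6, "C": 0.4},
]
posterior = posterior_over_values(priors, evidence)
best_value = max(posterior, key=posterior.get)
```

Here the confidence attaches to each candidate value, so the output can be read directly as a graded suggestion for the missing or unvalidated field.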