Neurocomputing 1990
DOI: 10.1007/978-3-642-76153-9_28
Probabilistic Interpretation of Feedforward Classification Network Outputs, with Relationships to Statistical Pattern Recognition

Cited by 873 publications (499 citation statements). References 1 publication.
“…Obviously, binary data for species occurrence did not require normalization. Sigmoid activation functions were used in the hidden-layer nodes [f(x) = 1/(1 + e^(−x))], whereas softmax activation functions (Bridle, 1990) were used in the output-layer nodes. The softmax function scales the neural network outputs so that they sum to 1 and each output can be regarded as a probability, i.e., in this particular case, as a membership value for each class of ecological status.…”
Section: Methods
Confidence: 99%
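The citation above describes the core property of Bridle's softmax: it maps arbitrary real-valued network outputs to values in (0, 1) that sum to 1, so each output can be read as a class-membership probability. A minimal NumPy sketch (the function name and example values are illustrative, not from the cited work):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result is
    # unchanged because softmax is invariant to additive shifts.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Example: three raw output-layer activations.
outputs = np.array([2.0, 1.0, 0.1])
probs = softmax(outputs)
# probs lies in (0, 1) componentwise and sums to 1,
# so it can be interpreted as a class distribution.
```

The max-subtraction trick is standard practice: without it, large positive activations overflow `np.exp` even though the normalized result is well defined.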
“…The genesis and renaissance of ANNs took place within various communities, and papers published during these periods reflected the disciplines involved: biology and cognition; statistical physics; and computer science. But it was not until the early 1990s that a probability-theoretic perspective emerged, which regarded ANNs as being within the framework of statistics [17,18,19,20].…”
Section: Multilayer Perceptrons
Confidence: 99%
“…The network contained 43,257 trainable weights in total. As is standard for classification tasks, the softmax activation function was used at the output layer, with the cross-entropy objective function [1]. The network was trained using online gradient descent (weight updates after every training sequence) with a learning rate of 10^−6 and a momentum of 0.9.…”
Section: Air Freight Data
Confidence: 99%
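The pairing of softmax with a cross-entropy objective, as in the citation above, is popular partly because the gradient of the loss with respect to the pre-softmax activations collapses to the simple residual (p − y), where y is the one-hot target. A sketch of that identity under these standard definitions (names and values are illustrative):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(p, target_index):
    # Negative log-likelihood of the correct class.
    return -np.log(p[target_index])

z = np.array([1.5, -0.3, 0.2])   # pre-softmax activations (logits)
target = 0                        # index of the correct class
p = softmax(z)
loss = cross_entropy(p, target)

# Analytic gradient of the loss w.r.t. z: (p - y), y one-hot.
grad = p.copy()
grad[target] -= 1.0
```

Because the components of p sum to 1 and y is one-hot, the gradient components always sum to zero; this cancellation of the softmax Jacobian against the log-loss is what makes the pairing numerically and algebraically convenient for gradient descent.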