“…The ordered set of equations describing each layer in Section II is summarized in (5)-(10). Our goal is to minimize the error function (4), which is defined in terms of the desired output and the current output. For each training data set, starting at the input nodes, a forward pass computes the activity levels of all the nodes in the network to obtain the current output.…”
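
The forward pass and error evaluation described in the excerpt can be sketched roughly as follows. This is only an illustration under stated assumptions, not the cited paper's method: equations (4)-(10) are not reproduced in the excerpt, so the half-sum-of-squares error, the tanh node function, the function names `forward_pass` and `squared_error`, and the small 2-2-1 network with its weights are all hypothetical stand-ins.

```python
import math


def forward_pass(x, layers):
    """Propagate input x layer by layer and return the activity of every layer.

    Each layer is a (weights, biases) pair. The tanh node function is an
    assumption; the excerpt does not state the node functions of (5)-(10).
    """
    activities = [list(x)]  # activity of the input nodes
    for weights, biases in layers:
        prev = activities[-1]
        current = [
            math.tanh(sum(w * a for w, a in zip(row, prev)) + b)
            for row, b in zip(weights, biases)
        ]
        activities.append(current)
    return activities


def squared_error(desired, current):
    """Half the sum of squared differences (an assumed form of the error)."""
    return 0.5 * sum((d - y) ** 2 for d, y in zip(desired, current))


# Hypothetical 2-2-1 network; the weights are arbitrary illustration values.
layers = [
    ([[0.5, -0.3], [0.8, 0.2]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]

activities = forward_pass([1.0, 0.5], layers)
current_output = activities[-1]          # activity of the output node(s)
error = squared_error([0.25], current_output)
```

Keeping the activity of every layer, rather than only the final output, is a deliberate choice here: a training procedure that minimizes the error typically needs those intermediate values when it later adjusts the parameters.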