“…The input data of the ResNet are 2D matrices [N_UU × N_B], so the proposed architecture is as follows: (i) initially, a residual layer composed of a 2D convolutional layer (conv2D) for feature extraction, followed by a batch normalization layer, which makes the network faster and more stable by normalizing its inputs, and then the rectified linear unit (ReLU) activation function. Next comes another conv2D layer followed by a batch normalization layer; at this point we add a skip connection (Add), H(x) = F(x) + x, whose purpose is to compute the residual of the network, i.e., what must actually be learned relative to what is already known from the input data, F(x) = H(x) − x, where F(x) is the mapping of the learnable layers and x is the input data [34]. The block finishes with the ReLU activation function; (ii) the next layer is a 2D max pooling layer, which reduces the dimensionality of the layer's input data and allows assumptions to be made about the features contained in the pooled sub-regions [34]; (iii) a second residual layer is applied, consisting of a conv2D layer followed by batch normalization and ReLU, then another conv2D and batch normalization.…”
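The skip connection and pooling steps described above can be sketched in a few lines of NumPy. This is only an illustration of the arithmetic, not the paper's implementation: the learnable mapping F(x) is replaced by a hypothetical stand-in callable, and the pooling window size of 2 is an assumption.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def max_pool_2d(x, size=2):
    """Max pooling over non-overlapping size x size windows:
    reduces each spatial dimension by the factor `size`."""
    h, w = x.shape
    return x[:h - h % size, :w - w % size].reshape(
        h // size, size, w // size, size).max(axis=(1, 3))

def residual_block(x, f):
    """Skip connection: H(x) = F(x) + x, where F is the mapping of the
    learnable layers (here a stand-in callable), followed by ReLU."""
    return relu(f(x) + x)

# Toy input matrix [N_UU x N_B] (values are illustrative only).
x = np.arange(16, dtype=float).reshape(4, 4)
f = lambda z: -0.5 * z           # hypothetical learned mapping F(x)
h = residual_block(x, f)         # H(x) = F(x) + x, then ReLU
y = max_pool_2d(h)               # dimensionality reduction, shape (2, 2)
```

With this stand-in F, the residual output is H(x) = −0.5x + x = 0.5x, so the block simply halves the input; the Add node is what lets the network learn only the correction F(x) = H(x) − x on top of the identity.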