Artificial Neural Networks (ANNs) have gained tremendous attention from researchers, largely because their architecture makes them a powerful technique for problems such as classification, pattern recognition, and data analysis. ANNs are known for their data-driven, self-adaptive, and non-linear modelling capabilities, their high processing speed, and their ability to learn the solution to a problem from a set of examples. Neural network training has recently become a dynamic research area, with the Multi-Layer Perceptron (MLP) trained with Back-Propagation (BP) being the most popular configuration and the focus of many studies. In this study, two BP training algorithms, gradient descent and gradient descent with momentum, each using the sigmoid and hyperbolic tangent activation functions, are executed and compared in combination with data pre-processing techniques. The Min-Max, Z-Score, and Decimal Scaling preprocessing techniques are analyzed. Simulation results on selected benchmark datasets reveal that preprocessing the data greatly improves ANN convergence, with Z-Score producing the best overall performance across all datasets.
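As a rough illustration of the three preprocessing techniques compared here, the following Python/NumPy sketch applies each normalization per feature (column) of a data matrix. The function names and the NumPy-based implementation are illustrative assumptions for exposition, not the authors' actual code.

```python
import numpy as np

def min_max(x, new_min=0.0, new_max=1.0):
    # Min-Max: linearly rescale each feature to [new_min, new_max].
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / (hi - lo) * (new_max - new_min) + new_min

def z_score(x):
    # Z-Score: center each feature at zero mean, unit standard deviation.
    return (x - x.mean(axis=0)) / x.std(axis=0)

def decimal_scaling(x):
    # Decimal Scaling: divide each feature by 10^j, where j is the smallest
    # integer such that the maximum absolute value falls below 1.
    j = np.floor(np.log10(np.abs(x).max(axis=0))) + 1
    return x / (10.0 ** j)

# Hypothetical usage on a small 2-feature dataset:
data = np.array([[200.0, 0.5], [400.0, 1.5], [955.0, 3.0]])
print(min_max(data))
print(z_score(data))
print(decimal_scaling(data))
```

Note that this sketch assumes non-constant, non-zero features; a production version would guard against zero ranges and zero standard deviations before dividing.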