The structural tuning of the convolutional neural network for speaker identification in mel frequency cepstrum coefficients space

Matychenko, Anastasiia D.; Polyakova, Marina V.

doi:10.15276/hait.06.2023.7

Cited by 1 publication

(1 citation statement)

References 25 publications


“…Suppose that for image classification, a CNN with architecture S and parameters P is presynthesized, CNN={S, P} [14,15]. This network has already learned earlier to extract features for solving the problem of image classification.…”

Section: Formulation Of the Problem (mentioning)

confidence: 99%

Xception transfer learning with early stopping for facial age estimation

Polyakova, Rogachko, Nesteriuk et al. 2024

AAIT


The rapid development of deep learning is drawing more attention to the analysis of face images. Deep learning methods of facial age estimation are more effective than methods based on anthropometric models, active appearance models, texture models, or aging pattern subspaces. However, deep learning networks require more computing power to process images. Pre-trained models do not need a large training set, and their training time is shorter. However, the parameters obtained through transfer learning of the pre-trained network significantly affect its efficiency. It is also necessary to take into account the properties of the processed images, in particular the conditions under which they were obtained. Recently, facial age estimation has been implemented in applications on devices with limited computing resources, for example smartphones. The memory size and power consumption of such applications are limited by the computing power of mobile devices. In addition, when photographing a person's face with a smartphone camera, it is very difficult to ensure uniform lighting. The aim of the research is to reduce the error of facial age estimation from unevenly illuminated images by applying early stopping to transfer learning of the Xception network. The proposed transfer learning technique stops training early if no improvement in the results is observed within a certain number of epochs. The network weights from the epoch with the lowest validation loss are then saved. With the proposed technique, the average absolute error of age estimation on unevenly illuminated test images was about five years. The Xception network used here has fewer parameters than other deep learning neural networks that have been applied to age estimation. Applying the Xception network therefore reduces the resource consumption of devices with limited computing power.
Prospects for further research include reducing the unevenness of facial image lighting to decrease the age estimation error. To reduce computing resources, it is also promising to use fast transforms in the Xception convolutional layers.
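
The early-stopping rule described in the abstract (stop when validation loss has not improved for a set number of epochs, then keep the weights from the best epoch) can be sketched in plain Python. This is a minimal, framework-independent illustration and not the authors' implementation: the class `EarlyStopper`, the function `simulate_training`, the `patience` value of 3, and the list of validation losses are all hypothetical, and the "weights" are a stand-in dictionary rather than real Xception parameters.

```python
import copy
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class EarlyStopper:
    """Hypothetical helper: tracks validation loss and keeps the best weights."""
    patience: int = 3              # assumed value; the abstract gives no number
    best_loss: float = float("inf")
    best_epoch: int = -1
    best_weights: Any = None
    stale_epochs: int = 0          # epochs since the last improvement

    def update(self, epoch: int, val_loss: float, weights: Any) -> bool:
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss:
            # Improvement: remember this epoch and snapshot its weights.
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.best_weights = copy.deepcopy(weights)
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


def simulate_training(val_losses: List[float], patience: int) -> Tuple[int, int, Any]:
    """Replay a list of per-epoch validation losses through the stopper.

    Returns (last epoch run, best epoch, weights saved at the best epoch).
    """
    stopper = EarlyStopper(patience=patience)
    last_epoch = -1
    for epoch, loss in enumerate(val_losses):
        last_epoch = epoch
        weights = {"epoch": epoch}  # stand-in for real network weights
        if stopper.update(epoch, loss, weights):
            break
    return last_epoch, stopper.best_epoch, stopper.best_weights


# Made-up losses: improvement stalls after epoch 2.
last, best, saved = simulate_training([0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5], patience=3)
print(last, best, saved)
```

With these made-up losses the loop ends at epoch 5 and the weights from epoch 2 are kept, so the final 0.5 is never reached. In a real transfer-learning run, an equivalent built-in mechanism from the training framework would normally be used in place of a hand-written helper.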

