This paper addresses a multiclass semantic image segmentation problem, using images of the intracytoplasmic sperm injection (ICSI) process. To train the neural network, 656 frames were manually labelled so that each pixel was assigned to one of four classes: microinjector, suction micropipette, oolemma, or background. Modern approaches were analysed, and the best architecture, encoder, and hyperparameters were selected experimentally: a feature pyramid network (FPN) convolutional neural network with a ResNeXt-101 encoder, which is 101 layers deep and has a cardinality of 32 (32 parallel grouped-convolution paths). The resulting model achieves a segmentation quality of IoU = 0.96 (intersection over union) at a processing speed of 15 frames per second.
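For readers unfamiliar with the reported metric, the following minimal sketch shows one common way to compute intersection over union for a four-class label mask. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `per_class_iou` and `mean_iou`, the class-to-index mapping, and the choice to average IoU over classes are all hypothetical, since the abstract does not say how the 0.96 figure was aggregated.

```python
# Assumed class indices (the abstract names the classes but not their order).
CLASS_NAMES = ["background", "microinjector", "suction micropipette", "oolemma"]


def per_class_iou(pred, target, num_classes=4):
    """Return IoU for each class; None if the class appears in neither mask.

    pred and target are equally sized 2D lists of integer class labels.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: it belongs to both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: it enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    return [i / u if u > 0 else None for i, u in zip(intersection, union)]


def mean_iou(pred, target, num_classes=4):
    """Average IoU over the classes present in at least one mask."""
    scores = [s for s in per_class_iou(pred, target, num_classes) if s is not None]
    return sum(scores) / len(scores)


# Toy 3x4 masks: the prediction differs from the ground truth in one pixel.
target = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [3, 3, 2, 0],
]
pred = [row[:] for row in target]
pred[0][0] = 1  # one background pixel mislabelled as microinjector

for name, score in zip(CLASS_NAMES, per_class_iou(pred, target)):
    print(name, score)
print("mean IoU:", mean_iou(pred, target))
```

In this toy case the single mislabelled pixel lowers the IoU of both classes involved, which shows why the metric penalises errors more strictly than pixel accuracy does.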