Introduction. Documented cases of the illegal use of computer steganography demonstrate the need to develop steganalytic methods and systems, one of the most important areas of cybersecurity. The advantage of steganalytic methods based on machine learning is their versatility: they do not rely on knowledge of the embedding algorithm and can be used to detect a wide range of steganographic methods. However, before they can be used to detect stego containers, these methods must be trained on containers that are known with certainty to contain, or not to contain, hidden messages. At this stage it is very important to understand how the parameters of the containers under investigation, in particular of such a common type as JPEG images, affect the accuracy of steganalysis. The mismatch between container sources is an open problem in steganalysis: it leads to a significant drop in the accuracy of detecting hidden messages once the classifier is moved from the laboratory to the real world.
The purpose of the work is to investigate the influence of the content, size, and quality factor of JPEG images on the accuracy of their steganalysis by statistical methods based on machine learning.
Results. The research revealed the following patterns: 1) accuracy is higher when the images used for training and for testing have a similar percentage of DCT coefficients suitable for concealment; 2) images with a relatively small number of suitable DCT coefficients are classified more accurately; 3) using training samples that are mixed by content or by parameters degrades the accuracy of steganalysis; 4) decreasing the quality factor of JPEG images increases the accuracy of their steganalysis; 5) increasing the size of images increases the accuracy of their steganalysis; 6) images in which blocks were desynchronized during preprocessing are classified more accurately; 7) the order of the image preprocessing operations affects the accuracy of steganalysis.
Conclusions. In steganography tasks, choosing JPEG containers with the revealed patterns in mind makes hidden messages more resistant to passive attacks. In steganalysis tasks, taking these patterns into account allows the obtained results to be interpreted more accurately.
Keywords: information security, steganography, steganalysis, intelligent computer systems, machine learning, detection accuracy.