A machine vision system (MVS) is a technology that uses a computer to analyze and recognize still or moving images. It is a branch of computer vision that resembles a security camera but can automatically capture, evaluate, and analyze images. The approach has a clear drawback: in the event of a computer vision system failure, firms must have a team of highly trained people with a thorough understanding of the distinctions involved. Deep neural networks (DNN) are artificial neural networks with numerous layers between the input and output layers. Every neural network, regardless of kind, consists of neurons, synapses, weights, biases, and functions. Many of the challenges in computer vision revolve around using convolutional neural networks (CNN) to classify images into predefined categories. Convolutional and pooling layers are used to decrease the image’s size before the reduced data is fed to fully connected layers. According to the paper, the MVS-CNN algorithm can analyze a picture and determine the value of various characteristics and objects inside it. Convolution is the operation of combining two functions to produce a third; it can be viewed as a fusion of two datasets. A CNN performs convolution on the input data with a filter, or kernel, to build a feature map. An inverted residual block is introduced as the basic block of the convolutional neural network to balance identification accuracy and processing efficiency. The suggested method achieves its higher inspection performance with a large dataset of photos of faulty and defect-free bottles. The proposed method yields a standard deviation ratio of 83.56%, an absolute error ratio of 77.26%, a trajectory length difference ratio of 82.35%, a source pattern radiation amplitude ratio of 86.25%, a classification accuracy ratio of 83.25%, and an overall performance ratio of 90.26%.
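As a rough illustration of the convolution and pooling steps described above, the following minimal sketch builds a feature map from a small synthetic image and then shrinks it with max pooling. It is not the MVS-CNN implementation: the function names `conv2d_valid` and `max_pool2d`, the 6x6 test image, and the hand-picked vertical-edge kernel are all assumptions made for illustration. In a trained CNN the kernel values would be learned rather than fixed.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide a kernel over a 2D image (stride 1, no padding).

    Like most CNN libraries, this computes cross-correlation: the kernel
    is not flipped, as it would be in a strict mathematical convolution.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel and the patch under it, summed.
            feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return feature_map


def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    trimmed = feature_map[:h, :w]  # drop rows/columns that do not fill a window
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))


# Hypothetical 6x6 image: dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hypothetical hand-picked kernel that responds to vertical edges.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

feature_map = conv2d_valid(image, kernel)  # shape (4, 4)
pooled = max_pool2d(feature_map, size=2)   # shape (2, 2)

print(feature_map)
print(pooled)
```

The feature map responds strongly only where the kernel straddles the dark-to-bright boundary, and pooling halves each spatial dimension while keeping that response. The pooled array is the kind of reduced representation that would then be flattened and passed to fully connected layers.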