Abstract: Laparoscopic videos can be affected by different distortions which may impact the performance of surgery and introduce surgical errors. In this work, we propose a framework for automatically detecting and identifying such distortions and their severity using video quality assessment. There are three major contributions presented in this work: (i) a proposal for a novel video enhancement framework for laparoscopic surgery; (ii) a publicly available database for quality assessment of laparoscopic videos evaluated…
“…A recent work in 2020 targeted VQA by detecting and identifying distortions and their severity automatically. Here, the authors constructed a laparoscopic video quality database with a set of 200 videos, with five types of distortions and four levels of intensity for this purpose [22]. A distortion-specific classification method was used for each type of distortion such as a fast noise variance estimator with a threshold for noise distortion, statistics of the luminance component of an image for uneven illumination distortion, a saturation analysis (SAN) classifier for smoke distortion, and a perceptual blur index (PBI) with a threshold classifier for blur distortion [22].…”
Section: Related Work
confidence: 99%
“…Here, the authors constructed a laparoscopic video quality database with a set of 200 videos, with five types of distortions and four levels of intensity for this purpose [22]. A distortion-specific classification method was used for each type of distortion such as a fast noise variance estimator with a threshold for noise distortion, statistics of the luminance component of an image for uneven illumination distortion, a saturation analysis (SAN) classifier for smoke distortion, and a perceptual blur index (PBI) with a threshold classifier for blur distortion [22]. As a replacement for traditional methods that depend on the distortion categories for coefficients modelling to extract specific features from the images, a single deep neural network was proposed to solve the two important problems of distortion classification and quality ranking [41].…”
Section: Related Work
confidence: 99%
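The excerpts above mention a fast noise variance estimator combined with a threshold. The following is a minimal sketch of that idea, assuming the estimator is of the Immerkaer type (a 3x3 mask built from the difference of two Laplacians); the cited work may use a different estimator. The function names and the threshold value of 5.0 are hypothetical and chosen only for illustration.

```python
import math

# 3x3 mask used by Immerkaer-style fast noise estimation (an assumption here).
# It is insensitive to flat regions and linear intensity ramps, so the
# response is dominated by noise.
MASK = [[1, -2, 1],
        [-2, 4, -2],
        [1, -2, 1]]


def estimate_noise_sigma(img):
    """Estimate the noise standard deviation of a grayscale image.

    img is a list of rows of numbers (at least 3x3).
    """
    h, w = len(img), len(img[0])
    if h < 3 or w < 3:
        raise ValueError("image must be at least 3x3")
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            resp = 0.0
            for dy in range(3):
                for dx in range(3):
                    resp += MASK[dy][dx] * img[y + dy - 1][x + dx - 1]
            total += abs(resp)
    # Scale the mean absolute response to a standard deviation
    # (valid for zero-mean Gaussian noise).
    return math.sqrt(math.pi / 2.0) * total / (6.0 * (w - 2) * (h - 2))


def is_noisy(img, threshold=5.0):
    """Threshold classifier: flag the frame as noisy above a hypothetical threshold."""
    return estimate_noise_sigma(img) > threshold
```

In practice the threshold would have to be tuned on labeled frames, and strongly textured content can inflate the estimate because the mask also responds to fine image detail.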
“…The design of a video quality assessment (VQA) system for laparoscopic surgery of superior calibre is highly essential. Very often, laparoscopic videos are obscured by distortions that degrade the visibility of the patient's anatomy and thus, the overall quality of the laparoscopic or robot-assisted procedure [22], [23]. These distortions commonly arise in the form of noise, smoke, uneven illumination, and blur, which are all concomitant artifacts that come with the operation of the surgical equipment involved in minimally invasive surgery [24].…”
Section: Introduction: A. Laparoscopic Surgery
Laparoscopic surgery is a surgical procedure performed by inserting narrow tubes into the abdomen without making large incisions in the skin. It is done with the aid of a video camera. Laparoscopic videos are affected by various distortions during surgery, which lead to a loss of visual quality. Identifying these distortions is the primary requisite for automated video enhancement systems, which must classify the distortions correctly and select the proper enhancement algorithm accordingly. In addition to being accurate, distortion classification must be fast enough to operate under real-time conditions. This paper aims to address the issues faced by similar methods by developing a fast and accurate deep learning model for distortion classification. The dataset proposed by the ICIP2020 conference challenge was used for training and evaluation of the proposed method. This challenging dataset contains videos with five types of distortion (noise, smoke, uneven illumination, defocus blur, and motion blur), each at four levels of intensity. This paper discusses the proposed solution, which received the first prize in the ICIP2020 challenge. The solution utilized a transfer learning approach to transfer representation from the domain of natural images to the domain of laparoscopic videos. We used a pretrained ResNet50 convolutional neural network (CNN) to extract informative features that were mapped by support vector machine (SVM) classifiers to various distortion categories. In this work, the problem of multiple distortions in the same video was formulated as a multi-label distortion classification problem. The approach of transfer learning with decision fusion was applied and was found to outperform other solutions in terms of accuracy (83%), F1 score of a single distortion (94.7%), and F1 score of single and multiple distortions (94.9%). In addition, the proposed solution can run in real time with an inference speed of 20 frames per second (FPS).
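The abstract above describes multi-label distortion classification with decision fusion but does not spell out the fusion rule. As a rough illustration only, the sketch below assumes that per-frame multi-label predictions are fused into a video-level label set by majority voting; the function name, the label strings, and the `min_fraction` parameter are hypothetical and not taken from the paper.

```python
from collections import Counter

# The five distortion types named in the abstract (label strings are illustrative).
DISTORTIONS = ("noise", "smoke", "uneven_illumination", "defocus_blur", "motion_blur")


def fuse_frame_predictions(frame_predictions, min_fraction=0.5):
    """Fuse per-frame multi-label predictions into one video-level label set.

    frame_predictions: list of sets of labels, one set per frame.
    A label is kept if it is predicted in more than min_fraction of the frames
    (assumed majority-vote rule).
    """
    if not frame_predictions:
        return set()
    counts = Counter(label for labels in frame_predictions for label in set(labels))
    n_frames = len(frame_predictions)
    return {label for label, c in counts.items() if c / n_frames > min_fraction}


# Example: four frames, two of the labels win the vote.
frames = [{"noise", "smoke"}, {"noise"}, {"noise", "smoke"}, {"smoke", "motion_blur"}]
video_labels = fuse_frame_predictions(frames)  # {"noise", "smoke"}
```

Because each label is voted on independently, a video can end up with zero, one, or several distortions, which matches the multi-label formulation.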
“…By changing the two parameters of the center location of the bright region and its area, we generated four different levels for uneven illumination. Finally, in order to generate smoke, we have used screen blending model [3]. In this technique, real smoke image having a black background is combined with the reference image in such a way that black areas produce no change to the original image while the brighter areas overlay the original ones.…”
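The screen blending model mentioned in the excerpt above can be sketched as follows. This uses the standard screen blend formula, result = 1 - (1 - base)(1 - overlay), on intensities normalized to [0, 1]; the `strength` parameter is a hypothetical knob for varying smoke density and is not described in the excerpt.

```python
def screen_blend(base, overlay, strength=1.0):
    """Screen-blend an overlay (e.g. smoke on black) onto a base image.

    Both images are nested lists of floats in [0, 1] with the same shape.
    strength in [0, 1] scales the overlay before blending (hypothetical parameter).
    """
    return [
        [1.0 - (1.0 - b) * (1.0 - strength * o) for b, o in zip(base_row, overlay_row)]
        for base_row, overlay_row in zip(base, overlay)
    ]


# Black overlay pixels (0.0) leave the base unchanged; brighter ones lighten it.
reference = [[0.5, 0.5]]
smoke = [[0.0, 0.5]]
blended = screen_blend(reference, smoke)  # [[0.5, 0.75]]
```

For color images the same formula would be applied to each channel independently.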
Laparoscopic images and videos are often affected by different types of distortion such as noise, smoke, blur and nonuniform illumination. Automatic detection of these distortions, generally followed by the application of appropriate image quality enhancement methods, is critical to avoid errors during surgery. In this context, a crucial step involves an objective assessment of the image quality, which is a two-fold problem requiring both the classification of the distortion type affecting the image and the estimation of the severity level of that distortion. Unlike existing image quality measures, which focus mainly on estimating a quality score, we propose in this paper to formulate the image quality assessment task as a multi-label classification problem that takes into account both the type and the severity level (or rank) of distortions. This problem is then solved with a deep neural network-based approach. The obtained results on a laparoscopic image dataset show the efficiency of the proposed approach.
“…The automatic estimation of the quality of a UGC as perceived by human observers is fundamental for a wide range of applications. For example, to discriminate professional and amateur video content on user-generated video distribution platforms [1], to choose the best sequence among many sequences for sharing in social media [2], to guide a video enhancement process [3], and to rank/choose user-generated videos [4, 5].…”
Methods for No-Reference Video Quality Assessment (NR-VQA) of consumer-produced video content are largely investigated due to the spread of databases containing videos affected by natural distortions. In this work, we design an effective and efficient method for NR-VQA. The proposed method exploits a novel sampling module capable of selecting a predetermined number of frames from the whole video sequence on which to base the quality assessment. It encodes both the quality attributes and semantic content of video frames using two lightweight Convolutional Neural Networks (CNNs). Then, it estimates the quality score of the entire video using a Support Vector Regressor (SVR). We compare the proposed method against several relevant state-of-the-art methods using four benchmark databases containing user-generated videos (CVD2014, KoNViD-1k, LIVE-Qualcomm, and LIVE-VQC). The results show that the proposed method, at a substantially lower computational cost, predicts subjective video quality in line with the state-of-the-art methods on individual databases and generalizes better than existing methods in a cross-database setup.