“…In some cases, artificial or anatomical cues can be used as fiducial markers to perform a point‐based match between the endoscopic organ and the virtual model, since they are visible both preoperatively and intraoperatively [8,33]. By contrast, surface‐based methods focus on the intraoperative perspective rather than on preoperative data, because the surface is reconstructed intraoperatively directly from laparoscopic images and registered only at a later stage [37,38]. Finally, volume‐based methodologies are the most complex, as they require an intraoperative imaging system in addition to the endoscope to better locate hidden structures [39].…”
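As an illustration of the point‐based idea, the rigid transform between matched fiducials can be recovered with the classical Kabsch/Procrustes solution; the sketch below is generic and not taken from any of the cited implementations.

```python
# Minimal sketch of point-based rigid registration (Kabsch/Procrustes),
# the kind of fiducial matching described above. Variable names are
# illustrative, not drawn from any cited implementation.
import numpy as np

def rigid_register(model_pts, endo_pts):
    """Find rotation R and translation t mapping model_pts onto endo_pts.

    Both inputs are (N, 3) arrays of corresponding fiducial points,
    e.g. anatomical landmarks visible pre- and intraoperatively.
    """
    mu_m, mu_e = model_pts.mean(axis=0), endo_pts.mean(axis=0)
    H = (model_pts - mu_m).T @ (endo_pts - mu_e)      # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))            # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_e - R @ mu_m
    return R, t
```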
Introduction
The current study presents a deep learning framework to determine, in real time, the position and rotation of a target organ from an endoscopic video. These inferred data are used to overlay the 3D model of the patient's organ onto its real counterpart. The resulting augmented video stream is fed back to the surgeon as support during laparoscopic robot‐assisted procedures.
Methods
The framework first applies semantic segmentation; thereafter, two techniques, one based on Convolutional Neural Networks and one on motion analysis, are used to infer the rotation.
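A minimal sketch of how such a two‐stage pipeline might look, assuming the organ's image‐plane position comes from the segmentation mask centroid and a coarse in‐plane rotation cue from frame‐to‐frame optical flow; all function names and parameters here are illustrative and not the authors' code.

```python
# Hypothetical two-stage sketch: the segmentation mask gives the organ's
# image-plane position, and dense optical flow between consecutive frames
# gives a coarse in-plane rotation cue. Thresholds and names are illustrative.
import cv2
import numpy as np

def organ_position(mask):
    """Centroid (x, y) of a non-empty binary segmentation mask."""
    ys, xs = np.nonzero(mask)
    return float(xs.mean()), float(ys.mean())

def inplane_rotation_rate(prev_gray, curr_gray, mask):
    """Mean angular motion (radians/frame) of masked pixels about the centroid."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()
    r = np.stack([xs - cx, ys - cy], axis=1)         # radial vectors from centroid
    v = flow[ys, xs]                                  # per-pixel motion vectors
    cross_z = r[:, 0] * v[:, 1] - r[:, 1] * v[:, 0]  # z-component of r x v
    return float(np.mean(cross_z / (np.linalg.norm(r, axis=1) ** 2 + 1e-6)))
```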
Results
The segmentation achieves high accuracy, with a mean IoU score greater than 80% in all tests. Rotation estimation performance varies with the surgical procedure.
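The reported metric can be computed as below; this is simply the standard mean Intersection‐over‐Union definition for binary masks, not the authors' evaluation script.

```python
# Mean IoU over a set of binary organ masks (values in {0, 1}).
import numpy as np

def mean_iou(preds, targets, eps=1e-7):
    """preds, targets: iterables of binary masks of shape (H, W)."""
    scores = []
    for p, t in zip(preds, targets):
        inter = np.logical_and(p, t).sum()
        union = np.logical_or(p, t).sum()
        scores.append((inter + eps) / (union + eps))
    return float(np.mean(scores))
```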
Discussion
Although the precision of the presented methodology varies with the testing scenario, this work is a first step towards the adoption of deep learning and augmented reality to generalise the automatic registration process.
“…Unfortunately, their research was limited to static image recognition and could not adapt to endoscopic video captured under poor lighting or in scenes of unknown depth. Ozyoruk et al. proposed an unsupervised monocular visual odometry and depth estimation method to address frequently changing lighting conditions and scale inconsistency between consecutive frames [17]. The algorithm was optimized with mixed loss functions and used spatial attention modules to direct the network's focus to tissue areas.…”
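For context, a generic CBAM‐style spatial attention block of the kind alluded to is sketched below in PyTorch; it illustrates the idea of re‐weighting spatial locations toward tissue regions and is not the exact module used by Ozyoruk et al.

```python
# Generic CBAM-style spatial attention: pool across channels, predict a
# per-pixel attention map, and re-weight the feature map with it.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                        # x: (B, C, H, W)
        avg_map = x.mean(dim=1, keepdim=True)    # channel-wise average
        max_map = x.max(dim=1, keepdim=True).values
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn                          # emphasise attended locations
```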
Section: Application Of DL In Gastrointestinal Endoscopy
“…Specular highlights in digital images commonly occur with discrete light sources. They present a serious problem in applications that rely on image processing and analysis, such as depth perception, localization, and 3D reconstruction (Tao et al., 2015; Ozyoruk et al., 2021). These highlights not only occlude important colors, textures, and features, but also act as additional features that may be falsely interpreted as characteristic of the scene.…”
Section: Introduction
“…These highlights also negatively affect the success of numerous MISD computer vision tasks, including depth perception, object recognition, motion tracking, 3D reconstruction, and localisation (Ozyoruk et al., 2021; Kaçmaz et al., 2020).…”
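A common heuristic for flagging such highlights, sketched below, thresholds pixels that are very bright but weakly saturated; the thresholds are illustrative and would need tuning per endoscope and scene.

```python
# Simple specular-highlight mask: high brightness plus low saturation in HSV.
import cv2
import numpy as np

def specular_mask(bgr, sat_max=40, val_min=230):
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    sat, val = hsv[..., 1], hsv[..., 2]
    mask = ((sat < sat_max) & (val > val_min)).astype(np.uint8) * 255
    # Dilate slightly so the mask also covers the highlight's soft edges.
    return cv2.dilate(mask, np.ones((3, 3), np.uint8), iterations=1)
```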
Section: Introduction
“…This makes such artefacts hard to learn and model, which leads to a lack of ground truth data. Since data is a key component of any learning‐based approach, synthetic data has been developed, but it is still not very realistic (Ozyoruk et al., 2021). Combined with the general scarcity of real data in the medical field, this results in a clear data shortage.…”
Video streams are used to guide a wide range of minimally-invasive surgical and diagnostic procedures, and many computer-assisted techniques have been developed to analyse them automatically. These approaches can provide additional information to the surgeon, such as lesion detection, instrument navigation, or 3D modelling of anatomical shape. However, the image features needed to recognise these patterns are not always reliably detected, owing to irregular light patterns such as specular highlight reflections. In this paper, we aim to remove specular highlights from endoscopic videos using machine learning. We propose a temporal generative adversarial network (GAN) to inpaint the hidden anatomy under specularities, inferring its appearance both spatially and from neighbouring frames where the highlights do not occur at the same location. This is achieved using in-vivo gastric endoscopy data (Hyper-Kvasir) in a fully unsupervised manner that relies on automatic detection of specular highlights. System evaluations show significant improvements over traditional methods through direct comparison, as well as over other machine learning techniques through an ablation study that demonstrates the importance of the network's temporal and transfer learning components. The generalisability of our system to different surgical setups and procedures was also evaluated qualitatively on in-vivo gastric endoscopy data and ex-vivo porcine data (SERV-CT, SCARED). We also assess the effect of our method on computer vision tasks that underpin 3D reconstruction and camera motion estimation, namely stereo disparity, optical flow, and sparse point feature matching. These are evaluated quantitatively and qualitatively, and the results of this comprehensive analysis show a positive effect of specular highlight inpainting on these tasks.
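To make the problem setup concrete, a classical single‐frame baseline is sketched below: detected highlight pixels are filled with OpenCV's Telea inpainting. This only illustrates the task interface; it does not reproduce the paper's temporal GAN, which additionally borrows appearance from neighbouring frames.

```python
# Classical single-frame baseline for specular highlight removal:
# fill masked pixels with OpenCV's Telea inpainting. The mask is a uint8
# image with 255 where highlights were detected (e.g. by specular_mask above).
import cv2

def remove_highlights_baseline(bgr_frame, mask):
    return cv2.inpaint(bgr_frame, mask, 3, cv2.INPAINT_TELEA)
```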