Abstract: Since the development of capsule endoscopy technology, medical device companies and research groups have made significant progress toward turning passive capsule endoscopes into robotic, actively controlled capsule endoscopes. However, the use of robotic capsules in endoscopy still faces several challenges. One such challenge is precise real-time localization of the actively controlled robot. In this paper, we propose a non-rigid map fusion based direct simultaneous localization and mapping (SLAM) method for endoscopic capsule robots. …
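As a rough illustration of the mapping side of such a pipeline, the sketch below shows a generic weighted-average depth fusion step into a truncated signed distance (TSDF) voxel grid. It is a minimal stand-in rather than the authors' method: it fuses rigidly, whereas the paper's non-rigid map fusion additionally warps the map before integration, and every function and variable name here is illustrative.

```python
import numpy as np

def fuse_depth(tsdf, weight, depth, K, T_wc, origin, voxel_size, trunc=0.01):
    """One weighted-average fusion step of a depth frame into a TSDF volume.
    tsdf, weight: (X, Y, Z) arrays; depth: (H, W) metres; K: 3x3 intrinsics;
    T_wc: 4x4 world-from-camera pose; origin: world position of voxel (0,0,0)."""
    X, Y, Z = tsdf.shape
    H, W = depth.shape
    # World coordinates of every voxel centre.
    ii, jj, kk = np.meshgrid(np.arange(X), np.arange(Y), np.arange(Z), indexing="ij")
    pts_w = origin + voxel_size * np.stack([ii, jj, kk], axis=-1).reshape(-1, 3)
    # Transform into the camera frame and project with the pinhole model.
    T_cw = np.linalg.inv(T_wc)
    pts_c = (T_cw[:3, :3] @ pts_w.T + T_cw[:3, 3:4]).T
    z = pts_c[:, 2]
    safe_z = np.where(z > 1e-6, z, 1e-6)
    u = np.round(K[0, 0] * pts_c[:, 0] / safe_z + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * pts_c[:, 1] / safe_z + K[1, 2]).astype(int)
    valid = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    d = np.zeros_like(z)
    d[valid] = depth[v[valid], u[valid]]
    valid &= d > 0
    # Truncated signed distance along the ray, then a running weighted average.
    sdf = np.clip((d - z) / trunc, -1.0, 1.0)
    upd = valid & (d - z > -trunc)
    flat_t, flat_w = tsdf.reshape(-1).copy(), weight.reshape(-1).copy()
    flat_t[upd] = (flat_w[upd] * flat_t[upd] + sdf[upd]) / (flat_w[upd] + 1.0)
    flat_w[upd] += 1.0
    return flat_t.reshape(tsdf.shape), flat_w.reshape(weight.shape)

# Illustrative usage with made-up intrinsics and a flat 8 cm depth image.
tsdf, w = np.ones((64, 64, 64)), np.zeros((64, 64, 64))
depth = np.full((120, 160), 0.08)
K = np.array([[150., 0., 80.], [0., 150., 60.], [0., 0., 1.]])
tsdf, w = fuse_depth(tsdf, w, depth, K, np.eye(4), np.array([-0.05, -0.05, 0.02]), 0.1 / 64)
```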
“…They improved EKF and PTAM by using threshold strategies to separate rigid and non-rigid feature points. Mahmoud et al. [4][5][13] exploit and tune a complete and widely used large-scale SLAM system named ORB-SLAM [6]. They analyze and prove that ORB-SLAM is also suitable for scope localization in MIS.…”
Real-time simultaneous localization and dense mapping is very helpful for providing virtual reality and augmented reality to surgeons or even surgical robots. In this paper, we propose MIS-SLAM: a complete real-time, large-scale, dense deformable SLAM system for stereoscopes in minimally invasive surgery (MIS), built on heterogeneous computing that makes full use of both CPU and GPU. The otherwise idle CPU is used to run ORB-SLAM, which provides a robust global pose, and strategies are adopted to integrate the CPU and GPU modules. We solve the key problem raised in previous work, namely that fast scope movement and blurry images cause scope tracking to fail. Benefiting from the improved localization, MIS-SLAM achieves large-scale scope localization and dense mapping in real time. It transforms and deforms the current model and incrementally fuses new observations while preserving vivid texture. In-vivo experiments conducted on publicly available datasets, presented in the form of videos, demonstrate the feasibility and practicality of MIS-SLAM for potential clinical use.
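To make the CPU/GPU split concrete, the sketch below mimics the described architecture with two threads: one standing in for the feature-based global-pose tracker (ORB-SLAM in the paper) and one for the dense deform-and-fuse stage. Every function here is a stub with an illustrative name; this is a structural sketch of the producer/consumer coupling, not the authors' implementation.

```python
import queue
import threading

def track_global_pose(frame):
    """CPU side: feature-based tracking (ORB-SLAM in the paper) returning a
    4x4 global pose. Stubbed here as the identity transform."""
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def deform_and_fuse(model, frame, pose):
    """GPU side: deform the current dense model toward the new pose estimate
    and fuse the stereo observation into it. Stubbed as a counter update."""
    model["fused_frames"] += 1
    return model

pose_queue = queue.Queue(maxsize=4)

def cpu_worker(frames):
    # Producer: push (frame, global pose) pairs to the dense-mapping side.
    for frame in frames:
        pose_queue.put((frame, track_global_pose(frame)))
    pose_queue.put(None)  # sentinel: end of stream

def gpu_worker(model):
    # Consumer: deform and fuse each observation as poses arrive.
    while True:
        item = pose_queue.get()
        if item is None:
            break
        frame, pose = item
        deform_and_fuse(model, frame, pose)

frames = range(100)                 # placeholder for a stereo video stream
model = {"fused_frames": 0}
t_cpu = threading.Thread(target=cpu_worker, args=(frames,))
t_gpu = threading.Thread(target=gpu_worker, args=(model,))
t_cpu.start(); t_gpu.start()
t_cpu.join(); t_gpu.join()
print(model["fused_frames"])        # all 100 frames fused
```

A bounded queue like this lets the slower stage throttle the faster one, which is one simple way to keep the pose stream and the dense fusion loosely synchronized.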
“…Estimating scene depth from monocular images is a fundamental task in computer vision, with potential applications such as autonomous driving [2] and visual SLAM [24]. The main drawback of supervised systems is their dependence on costly depth-map annotations.…”
Inspired by the success of adversarial learning, we propose a new end-to-end unsupervised deep learning framework for monocular depth estimation consisting of two Generative Adversarial Networks (GANs), deeply coupled with a structured Conditional Random Field (CRF) model. The two GANs aim at generating distinct and complementary disparity maps and at improving the generation quality by exploiting the adversarial learning strategy. The deep CRF coupling model is proposed to fuse the generative and discriminative outputs from the dual GAN nets. As such, the model implicitly constructs mutual constraints on the two network branches and between the generator and discriminator, which facilitates the optimization of the whole network for better disparity generation. Extensive experiments on the KITTI, Cityscapes, and Make3D datasets clearly demonstrate the effectiveness of the proposed approach and show superior performance compared to state-of-the-art methods. The code and models are available at https://github.com/mihaipuscas/3dv---coupled-crf-disparity.
[Figure: structured CRF coupling of the generator and discriminator, and structured coupling of the two GAN branches]
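A minimal PyTorch-style sketch of the dual-branch idea is given below: two small generators produce disparity maps, two discriminators score them, and the outputs are combined with a learned scalar weight as a crude stand-in for the paper's structured CRF coupling. Network sizes, names, and the fusion rule are all illustrative assumptions, not the released model.

```python
import torch
import torch.nn as nn

class DispGenerator(nn.Module):
    """Toy encoder-decoder mapping an RGB image to a disparity map in (0, 1)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
        )
    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Patch discriminator scoring (image, disparity) pairs."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 1, 4, stride=2, padding=1),
        )
    def forward(self, img, disp):
        return self.net(torch.cat([img, disp], dim=1))

# Two complementary generator branches plus a simple fusion of their outputs.
# (The paper couples them with a structured CRF; a learned weighted average is
# used here purely as a stand-in for that coupling.)
g1, g2 = DispGenerator(), DispGenerator()
d1, d2 = Discriminator(), Discriminator()
img = torch.rand(1, 3, 128, 256)
disp1, disp2 = g1(img), g2(img)
w = torch.sigmoid(d1(img, disp1).mean() - d2(img, disp2).mean())
fused = w * disp1 + (1 - w) * disp2   # fused disparity map, (1, 1, 128, 256)
```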
“…Models produce a pose estimate between two views from different perspectives, parameterized as a 6-DoF motion, and a depth prediction as a disparity map for a given view. … and other medical functions [2]-[19], which are, on the other hand, heavily dependent on a real-time and precise pose estimation capability.…”
In the last decade, many medical companies and research groups have tried to convert passive capsule endoscopes, an emerging and minimally invasive diagnostic technology, into actively steerable endoscopic capsule robots that will provide more intuitive disease detection, targeted drug delivery, and biopsy-like operations in the gastrointestinal (GI) tract. In this study, we introduce a fully unsupervised, real-time odometry and depth learner for monocular endoscopic capsule robots. We establish supervision by warping view sequences and using the re-projection error as the loss that trains the multi-view pose estimation and single-view depth estimation networks. Detailed quantitative and qualitative analyses of the proposed framework, performed on non-rigidly deformable ex-vivo porcine stomach datasets, prove the effectiveness of the method in terms of motion estimation and depth recovery.
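The self-supervision described here is the standard view-synthesis objective: warp a source frame into the target view using the predicted depth and 6-DoF pose, and penalize the photometric re-projection error. A minimal sketch of that loss follows, assuming pinhole intrinsics K and a 4x4 relative pose; shapes and names are illustrative, not the paper's code.

```python
import torch
import torch.nn.functional as F

def inverse_warp(src_img, depth, pose, K):
    """Warp a source frame into the target view using predicted depth and a
    4x4 relative pose. Shapes: src_img (B,3,H,W), depth (B,1,H,W),
    pose (B,4,4), K (B,3,3)."""
    b, _, h, w = src_img.shape
    # Pixel grid in homogeneous coordinates.
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    ones = torch.ones_like(xs)
    pix = torch.stack([xs, ys, ones], dim=0).float().view(1, 3, -1).expand(b, 3, -1)
    # Back-project to 3D camera points, transform, and re-project.
    cam = torch.inverse(K) @ pix * depth.view(b, 1, -1)
    cam_h = torch.cat([cam, torch.ones(b, 1, h * w)], dim=1)
    proj = K @ (pose @ cam_h)[:, :3, :]
    px = proj[:, 0] / (proj[:, 2] + 1e-7)
    py = proj[:, 1] / (proj[:, 2] + 1e-7)
    # Normalize coordinates to [-1, 1] for grid_sample.
    grid = torch.stack([2 * px / (w - 1) - 1, 2 * py / (h - 1) - 1], dim=-1)
    return F.grid_sample(src_img, grid.view(b, h, w, 2), align_corners=True)

def photometric_loss(tgt_img, src_img, depth, pose, K):
    """Mean absolute re-projection error used as the self-supervision signal."""
    warped = inverse_warp(src_img, depth, pose, K)
    return (warped - tgt_img).abs().mean()

# Illustrative usage with random tensors and made-up intrinsics.
tgt, src = torch.rand(1, 3, 64, 80), torch.rand(1, 3, 64, 80)
depth = torch.rand(1, 1, 64, 80) + 0.5
pose = torch.eye(4).unsqueeze(0)
K = torch.tensor([[[60., 0., 40.], [0., 60., 32.], [0., 0., 1.]]])
loss = photometric_loss(tgt, src, depth, pose, K)
```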