Precise augmented reality combining video and 3D CT-scan data can be applied to otoendoscopy without conventional neuronavigation tracking, thanks to computer vision algorithms.
Transtympanic procedures aim at accessing the middle ear structures through a puncture in the tympanic membrane. They require visualization of middle ear structures behind the eardrum, which until now has been provided by an otoendoscope. This work focused on implementing a real-time augmented reality-based system for robotic-assisted transtympanic surgery. A preoperative computed tomography scan is combined with the surgical video of the tympanic membrane in order to visualize the ossicles and labyrinthine windows, which are concealed behind the opaque tympanic membrane. The study was conducted on 5 artificial and 4 cadaveric temporal bones. Initially, a homography framework based on fiducials (6 stainless steel markers on the periphery of the tympanic membrane) was used to register a 3D reconstructed computed tomography image to the video images. Micro/endoscope movements were then tracked using Speeded-Up Robust Features. Simultaneously, a micro-surgical instrument (needle) in the frame was identified and tracked using a Kalman filter, and its 3D pose was computed using a 3-collinear-point framework. An average initial registration accuracy of 0.21 mm was achieved, with slow error propagation during the 2-minute tracking. Similarly, a mean surgical instrument tip 3D pose estimation error of 0.33 mm was observed. This system is a crucial first step towards a keyhole surgical approach to the middle and inner ear.
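As an illustration of the registration step described above, the sketch below shows how a homography between the CT-derived virtual view and a video frame could be estimated from six corresponding fiducial positions and used to overlay the CT image, with SURF keypoints extracted for subsequent frame-to-frame tracking. The marker coordinates, file names and OpenCV-based implementation are assumptions made for illustration, not the authors' code.

```python
# Minimal sketch of fiducial-based registration, assuming the six marker
# positions have already been located in both the CT-derived virtual view and
# the first video frame (the coordinates below are illustrative placeholders).
import numpy as np
import cv2

# Corresponding 2D marker positions (pixels): CT virtual endoscopy vs. video frame.
ct_points = np.array([[120, 80], [340, 75], [430, 210],
                      [350, 360], [130, 365], [60, 215]], dtype=np.float32)
video_points = np.array([[118, 92], [335, 83], [422, 218],
                         [346, 371], [127, 374], [58, 223]], dtype=np.float32)

# Homography mapping the CT overlay onto the video image (RANSAC rejects outliers).
H, inliers = cv2.findHomography(ct_points, video_points, cv2.RANSAC, 3.0)

# Warp the CT-derived overlay (ossicles, windows) into the video frame and blend it.
ct_overlay = cv2.imread("virtual_endoscopy.png")   # hypothetical file name
frame = cv2.imread("video_frame.png")              # hypothetical file name
warped = cv2.warpPerspective(ct_overlay, H, (frame.shape[1], frame.shape[0]))
augmented = cv2.addWeighted(frame, 0.6, warped, 0.4, 0)

# Frame-to-frame tracking: SURF keypoints matched between consecutive frames can
# update the homography as the endoscope moves (SURF requires opencv-contrib).
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
keypoints, descriptors = surf.detectAndCompute(
    cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), None)
```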
The aim of the study was to develop and assess the performance of a video-based augmented reality system, combining preoperative computed tomography (CT) and real-time microscopic video, as the first crucial step towards keyhole middle ear procedures through a tympanic membrane puncture. Six different artificial human temporal bones were included in this prospective study. Six stainless steel fiducial markers were glued on the periphery of the eardrum, and a high-resolution CT scan of the temporal bone was obtained. Virtual endoscopy of the middle ear based on this CT scan was conducted with Osirix software. The virtual endoscopy image was registered to the microscope-based video of the intact tympanic membrane using the fiducial markers, and a homography transformation was applied during microscope movements. These movements were tracked using the Speeded-Up Robust Features (SURF) method. Simultaneously, a micro-surgical instrument was identified and tracked using a Kalman filter. The 3D position of the instrument was extracted by solving a three-point perspective framework. For evaluation, the instrument was introduced through the tympanic membrane and ink droplets were injected onto three middle ear structures. An average initial registration accuracy of 0.21 ± 0.10 mm (n = 3) was achieved, with slow error propagation during tracking (0.04 ± 0.07 mm). The estimated surgical instrument tip position error was 0.33 ± 0.22 mm. The localization accuracy of the target structures was 0.52 ± 0.15 mm. The submillimetric accuracy of our system, achieved without a tracker, is compatible with ear surgery.
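The instrument-tracking step can be sketched as follows, assuming a per-frame detector already returns the instrument tip position in pixels: a constant-velocity Kalman filter (here OpenCV's cv2.KalmanFilter) smooths the detections and predicts through missed frames. The noise covariances and the detection sequence are illustrative assumptions, and the three-point perspective pose recovery itself is not shown.

```python
# Minimal sketch of 2D instrument-tip tracking with a constant-velocity
# Kalman filter; detections are assumed to come from a separate detector.
import numpy as np
import cv2

kf = cv2.KalmanFilter(4, 2)  # state: [x, y, vx, vy], measurement: [x, y]
kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], dtype=np.float32)
kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                 [0, 1, 0, 0]], dtype=np.float32)
kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-3
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1

def track_tip(measured_xy):
    """Predict the tip position, then correct with the detection if available."""
    prediction = kf.predict()
    if measured_xy is not None:
        kf.correct(np.array(measured_xy, dtype=np.float32).reshape(2, 1))
    return float(prediction[0, 0]), float(prediction[1, 0])

# Example: smooth a noisy detection sequence with one missed frame (None).
for detection in [(210.0, 158.0), (212.5, 160.1), None, (217.0, 164.3)]:
    print(track_tip(detection))
```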
The contextual information in the built environment is highly heterogeneous, ranging from static information (e.g., information about the building structure) to dynamic information (e.g., users' space-time information, sensor detections and events that occurred). This paper proposes to semantically fuse the building contextual information with data coming from a smart camera network by using ontologies and semantic web technologies. The developed ontology allows interoperability between the different contextual data and enables real-time event detection and system reconfiguration to be performed without human interaction. The use of semantic knowledge in multi-camera monitoring systems guarantees the protection of the users' privacy by neither sending nor saving any image; only the knowledge extracted from the images is kept. This paper presents a new approach to developing an "all-seeing" smart building, where the global system is a first step towards providing Artificial Intelligence (AI) to a building. More details of the system and future work can be found at the following website: http://wisenet.checksem.fr/ .
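To make the semantic fusion idea concrete, the sketch below asserts a single person-detection event from a camera in an RDF graph and queries it with SPARQL using rdflib. The namespace, class and property names are hypothetical placeholders, not the actual WiseNET ontology vocabulary.

```python
# Illustrative sketch: asserting a camera detection event against a building
# ontology and querying it, keeping only knowledge (no image data).
from rdflib import Graph, Namespace, Literal, URIRef
from rdflib.namespace import RDF, XSD

WNET = Namespace("http://wisenet.checksem.fr/ontology#")  # hypothetical namespace
g = Graph()

event = URIRef(WNET["event_001"])
g.add((event, RDF.type, WNET.PersonDetection))
g.add((event, WNET.detectedBy, WNET.camera_3))
g.add((event, WNET.occurredIn, WNET.space_corridor_1))
g.add((event, WNET.atTime, Literal("2018-03-12T10:15:00", datatype=XSD.dateTime)))

# A SPARQL query can then answer context-aware questions such as
# "in which spaces have people been detected?" without any stored video.
q = """
SELECT ?space WHERE {
  ?e a <http://wisenet.checksem.fr/ontology#PersonDetection> ;
     <http://wisenet.checksem.fr/ontology#occurredIn> ?space .
}
"""
for row in g.query(q):
    print(row.space)
```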
Nowadays, camera networks are part of our everyday environments; consequently, they represent a massive source of information for monitoring human activities and proposing new services to building users. To perform human activity monitoring, people must be detected and the analysis has to be done according to the information about the environment and the context. Available multi-camera datasets provide videos with little (or no) information about the environment where the network was deployed. The proposed dataset provides multi-camera, multi-space video sets along with the complete contextual information of the environment. The dataset regroups 11 video sets (composed of 62 single videos) recorded using 6 indoor cameras deployed over multiple spaces. The video sets represent more than 1 h of video footage, include 77 people tracks and capture different human actions such as walking around, standing/sitting, remaining motionless, entering/leaving a space and group merging/splitting. Moreover, each video has been manually and automatically annotated to include people detection and tracking meta-information. The automatic people detection annotations were obtained by using detectors of different complexity and robustness, from classical machine learning to state-of-the-art deep Convolutional Neural Network (CNN) models. Concerning the contextual information, the Industry Foundation Classes (IFC) file that represents the environment's Building Information Modeling (BIM) data is also provided. The BIM/IFC file describes the complete structure of the environment, its topology and the elements contained in it. To our knowledge, the WiseNET dataset is the first to provide a set of videos along with the complete information of the environment. The WiseNET dataset is publicly available at https://doi.org/10.4121/uuid:c1fb5962-e939-4c51-bfd5-eac6f2935d44, as well as at the project's website http://wisenet.checksem.fr/#/dataset.
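As an example of consuming the contextual part of such a dataset, the following sketch reads a BIM/IFC file with ifcopenshell and lists the spaces it describes. The file name is a placeholder and ifcopenshell is only one possible IFC parser; the dataset itself simply provides the IFC file.

```python
# Minimal sketch: list the spaces (rooms/corridors) described in the BIM/IFC
# file that accompanies the video sets.
import ifcopenshell

model = ifcopenshell.open("wisenet_environment.ifc")  # hypothetical file name

# IfcSpace entities describe the spaces that the cameras observe.
for space in model.by_type("IfcSpace"):
    print(space.GlobalId, space.Name, space.LongName)
```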