A medical robotic system for teleoperated laser microsurgery based on a concept we have called "virtual scalpel" is presented in this paper. This system allows surgeries to be safely and precisely performed using a graphics pen directly over a live video from the surgical site. This is shown to eliminate hand-eye coordination problems that affect other microsurgery systems and to make full use of the operator's manual dexterity without requiring extra training. The implementation of this system, which is based on a tablet PC and a new motorized laser micromanipulator offering 1 µm aiming accuracy within the traditional line-of-sight 2D operative space, is fully described. This includes details on the system's hardware and software structures and on its calibration process, which is essential for guaranteeing precise matching between a point touched on the live video and the laser aiming point at the surgical site. Together, the new hardware and software structures make both the calibration parameters and the laser aiming accuracy (on any plane orthogonal to the imaging axis) independent of the target distance and of its motions. Automatic laser control based on new intraoperative planning software and safety improvements based on virtual features are also described in this paper, which concludes by presenting results from sets of path following evaluation experiments conducted with 10 different subjects. These demonstrate an error reduction of almost 50% when using the virtual scalpel system versus the traditional laser microsurgery setup, and an 80% error reduction when using the automatic laser control routines, evidencing great improvements in terms of precision and controllability, and suggesting that the technological advances presented herein will lead to a significantly enhanced capacity for treating a variety of internal human pathologies.
Objectives: To assess a new application of artificial intelligence for real-time detection of laryngeal squamous cell carcinoma (LSCC) in both white light (WL) and narrow-band imaging (NBI) videolaryngoscopies based on the You-Only-Look-Once (YOLO) deep learning convolutional neural network (CNN). Study Design: Experimental study with retrospective data. Methods: Recorded videos of LSCC were retrospectively collected from in-office transnasal videoendoscopies and intraoperative rigid endoscopies. LSCC videoframes were extracted for training, validation, and testing of various YOLO models. Different techniques were used to enhance the image analysis: contrast limited adaptive histogram equalization, data augmentation techniques, and test time augmentation (TTA). The best-performing model was used to assess the automatic detection of LSCC in six videolaryngoscopies. Results: Two hundred and nineteen patients were retrospectively enrolled. A total of 624 LSCC videoframes were extracted. The YOLO models were trained after random distribution of images into a training set (82.6%), validation set (8.2%), and testing set (9.2%). Among the various models, the ensemble algorithm (YOLOv5s with YOLOv5m-TTA) achieved the best LSCC detection results, with performance metrics on par with the results reported by other state-of-the-art detection models: 0.66 Precision (positive predictive value), 0.62 Recall (sensitivity), and 0.63 mean Average Precision at 0.5 intersection over union. Tests on the six videolaryngoscopies demonstrated an average computation time per videoframe of 0.026 seconds. Three demonstration videos are provided. Conclusion: This study identified a suitable CNN model for LSCC detection in WL and NBI videolaryngoscopies. Detection performances are highly promising. The limited complexity and quick computational times for LSCC detection make this model ideal for real-time processing.
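The detection metrics quoted in the abstract above (precision, recall, and a 0.5 intersection-over-union threshold) can be illustrated with a minimal sketch. This is not the authors' evaluation code. The box format `(x_min, y_min, x_max, y_max)`, the greedy one-to-one matching, the assumption that predictions arrive sorted by descending confidence, and the function names `iou` and `precision_recall` are all assumptions made for illustration. Mean Average Precision, which averages precision over confidence thresholds, is not computed here.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds: List[Box], truths: List[Box],
                     thr: float = 0.5) -> Tuple[float, float]:
    """Greedy matching: each prediction (assumed sorted by descending
    confidence) claims the unmatched ground-truth box with the highest
    IoU, and counts as a true positive if that IoU reaches `thr`."""
    matched = set()
    tp = 0
    for p in preds:
        best_j, best_v = -1, 0.0
        for j, t in enumerate(truths):
            if j in matched:
                continue
            v = iou(p, t)
            if v > best_v:
                best_j, best_v = j, v
        if best_j >= 0 and best_v >= thr:
            matched.add(best_j)
            tp += 1
    precision = tp / len(preds) if preds else 0.0    # TP / (TP + FP)
    recall = tp / len(truths) if truths else 0.0     # TP / (TP + FN)
    return precision, recall
```

With one correct and one spurious prediction against two ground-truth lesions, both precision and recall come out at 0.5, which shows how a single threshold on IoU turns box overlap into the counts behind both metrics.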
Twin-to-Twin Transfusion Syndrome (TTTS) is commonly treated with minimally invasive laser surgery in fetoscopy. The inter-foetal membrane is used as a reference to find abnormal anastomoses. Membrane identification is a challenging task due to the small field of view of the camera, the presence of amniotic liquid, foetus movement, illumination changes, and noise. This paper aims at providing automatic and fast membrane segmentation in fetoscopic images. We implemented an adversarial network consisting of two Fully-Convolutional Neural Networks (FCNNs). The former (the segmentor) is a segmentation network inspired by U-Net and integrated with residual blocks, whereas the latter acts as critic and is made only of the encoding path of the segmentor. A dataset of 900 images acquired in 6 surgical cases was collected and labelled to validate the proposed approach. The adversarial networks achieved a median Dice similarity coefficient of 91.91% with an Inter-Quartile Range (IQR) of 4.63%, outperforming approaches based on U-Net (82.98%, IQR: 14.41%) and U-Net with residual blocks (86.13%, IQR: 13.63%). Results proved that the proposed architecture could be a valuable and robust solution to assist surgeons in providing membrane identification while performing fetoscopic surgery.
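The segmentation results above are reported as a median Dice similarity coefficient with an inter-quartile range. A minimal sketch of how such figures could be computed follows. It assumes binary masks flattened to lists of 0/1 values, and the helper names `dice` and `median_and_iqr` are hypothetical. The quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is also an assumption, since the abstract does not state which one was used.

```python
from statistics import median, quantiles
from typing import List, Sequence, Tuple


def dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity coefficient 2|A∩B| / (|A| + |B|) for two binary
    masks given as flat sequences of 0/1 values of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention chosen for this sketch: two empty masks agree perfectly.
    return 2.0 * inter / total if total > 0 else 1.0


def median_and_iqr(scores: List[float]) -> Tuple[float, float]:
    """Median and inter-quartile range (Q3 - Q1) of per-image scores.
    Requires at least two scores."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), q3 - q1
```

Calling `median_and_iqr` on the per-image Dice scores of a test set gives the two summary numbers in the form the abstract reports them. A narrow IQR indicates that performance is consistent across images, not just high on average.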
Introduction: Fully convolutional neural networks (FCNN) applied to video-analysis are of particular interest in the field of head and neck oncology, given that endoscopic examination is a crucial step in diagnosis, staging, and follow-up of patients affected by upper aero-digestive tract cancers. The aim of this study was to test FCNN-based methods for semantic segmentation of squamous cell carcinoma (SCC) of the oral cavity (OC) and oropharynx (OP). Materials and Methods: Two datasets were retrieved from the institutional registry of a tertiary academic hospital analyzing 34 and 45 NBI endoscopic videos of OC and OP lesions, respectively. The dataset referring to the OC was composed of 110 frames, while 116 frames composed the OP dataset. Three FCNNs (U-Net, U-Net 3, and ResNet) were investigated to segment the neoplastic images. FCNN performance was evaluated for each tested network and compared to the gold standard, represented by the manual annotation performed by expert clinicians. Results: For FCNN-based segmentation of the OC dataset, the best results in terms of Dice Similarity Coefficient (Dsc) were achieved by ResNet with 5(×2) blocks and 16 filters, with a median value of 0.6559. In FCNN-based segmentation for the OP dataset, the best results in terms of Dsc were achieved by ResNet with 4(×2) blocks and 16 filters, with a median value of 0.7603. All tested FCNNs presented very high values of variance, leading to very low values of minima for all metrics evaluated. Conclusions: FCNNs have promising potential in the analysis and segmentation of OC and OP video-endoscopic images. All tested FCNN architectures demonstrated satisfying outcomes in terms of diagnostic accuracy. The inference time of the processing networks was particularly short, ranging between 14 and 115 ms, thus showing the possibility for real-time application.
Narrow-band imaging (NBI) laryngoscopy is an optical-biopsy technique used for screening and diagnosing cancer of the laryngeal tract, reducing the biopsy risks but at the cost of some drawbacks, such as large amount of data to review to make the diagnosis. The purpose of this paper is to develop a deep-learning-based strategy for the automatic selection of informative laryngoscopic-video frames, reducing the amount of data to process for diagnosis.
Emotion, mood, and stress recognition (EMSR) has been studied in laboratory settings for decades. In particular, physiological signals are widely used to detect and classify affective states in lab conditions. However, physiological reactions to emotional stimuli have been found to differ in laboratory and natural settings. Thanks to recent technological progress (e.g., in wearables) the creation of EMSR systems for a large number of consumers during their everyday activities is increasingly possible. Therefore, datasets created in the wild are needed to ensure the validity and the exploitability of EMSR models for real-life applications. In this paper, we initially present common techniques used in laboratory settings to induce emotions for the purpose of physiological dataset creation. Next, advantages and challenges of data collection in the wild are discussed. To assess the applicability of existing datasets to real-life applications, we propose a set of categories to guide and compare at a glance different methodologies used by researchers to collect such data. For this purpose, we also introduce a visual tool called Graphical Assessment of Real-life Application-Focused Emotional Dataset (GARAFED). In the last part of the paper, we apply the proposed tool to compare existing physiological datasets for EMSR in the wild and to show possible improvements and future directions of research. We wish for this paper and GARAFED to be used as guidelines for researchers and developers who aim at collecting affect-related data for real-life EMSR-based applications.
The worldwide implementation of a liver graft pool using marginal livers (ie, grafts with a high risk of technical complications and impaired function or with a risk of transmitting infection or malignancy to the recipient) has led to a growing interest in developing methods for accurate evaluation of graft quality. Liver steatosis is associated with a higher risk of primary nonfunction, early graft dysfunction, and poor graft survival rate. The present study aimed to analyze the value of artificial intelligence (AI) in the assessment of liver steatosis during procurement compared with liver biopsy evaluation. A total of 117 consecutive liver grafts from brain‐dead donors were included and classified into 2 cohorts: ≥30 versus <30% hepatic steatosis. AI analysis required the presence of an intraoperative smartphone liver picture as well as a graft biopsy and donor data. First, a new algorithm arising from current visual recognition methods was developed, trained, and validated to obtain automatic liver graft segmentation from smartphone images. Second, a fully automated texture analysis and classification of the liver graft was performed by machine‐learning algorithms. Automatic liver graft segmentation from smartphone images achieved an accuracy (Acc) of 98%, whereas the analysis of the liver graft features (cropped picture and donor data) showed an Acc of 89% in graft classification (≥30 versus <30%). This study demonstrates that AI has the potential to assess steatosis in a handy and noninvasive way to reliably identify potential nontransplantable liver grafts and to avoid improper graft utilization.