Purpose: Automatic segmentation and classification of surgical activity is crucial for providing advanced support in computer-assisted interventions and autonomous functionalities in robot-assisted surgeries. Prior works have focused on recognizing either coarse activities, such as phases, or fine-grained activities, such as gestures. This work aims to jointly recognize two complementary levels of granularity directly from videos, namely phases and steps.
Methods: We introduce two correlated surgical activities, phases and steps, for the laparoscopic gastric bypass procedure. We propose a multi-task multi-stage temporal convolutional network (MTMS-TCN) along with a multi-task convolutional neural network (CNN) training setup to jointly predict phases and steps and to benefit from their complementarity for better evaluation of the execution of the procedure. We evaluate the proposed method on a large video dataset of 40 surgical procedures (Bypass40).
Results: We present experimental results from several baseline models for both phase and step recognition on Bypass40. The proposed MTMS-TCN method outperforms single-task methods in both phase and step recognition by 1-2% in accuracy, precision, and recall. Furthermore, for step recognition, MTMS-TCN outperforms LSTM-based models by 3-6% on all metrics.
Conclusion: In this work, we present a multi-task multi-stage temporal convolutional network for surgical activity recognition, which shows improved results compared to single-task models on a gastric bypass dataset with multi-level annotations. The proposed method shows that jointly modeling phases and steps improves the overall recognition of each type of activity.
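To make the multi-task setup concrete, the following is a minimal PyTorch sketch of a shared CNN backbone with two classification heads trained under a joint cross-entropy loss, which is the core idea behind joint phase/step recognition. The backbone choice, feature dimension, and class counts are illustrative assumptions, not the authors' implementation (which additionally stacks multi-stage temporal convolutions on top of the frame features).

```python
# Minimal multi-task phase/step recognition sketch (not the authors' code).
# Assumptions: a ResNet-50 backbone, 2048-d frame features, and illustrative
# class counts; MTMS-TCN further applies multi-stage TCNs over these features.
import torch
import torch.nn as nn
import torchvision.models as models

class MultiTaskCNN(nn.Module):
    def __init__(self, num_phases=11, num_steps=44):
        super().__init__()
        backbone = models.resnet50(weights=None)
        backbone.fc = nn.Identity()                    # keep 2048-d features
        self.backbone = backbone
        self.phase_head = nn.Linear(2048, num_phases)  # coarse activities
        self.step_head = nn.Linear(2048, num_steps)    # fine-grained activities

    def forward(self, frames):                         # frames: (B, 3, H, W)
        feats = self.backbone(frames)
        return self.phase_head(feats), self.step_head(feats)

model = MultiTaskCNN()
criterion = nn.CrossEntropyLoss()
frames = torch.randn(4, 3, 224, 224)
phase_labels = torch.randint(0, 11, (4,))
step_labels = torch.randint(0, 44, (4,))
phase_logits, step_logits = model(frames)
# Joint objective: the two heads share the backbone, so coarse phase labels
# regularize fine-grained step predictions and vice versa.
loss = criterion(phase_logits, phase_labels) + criterion(step_logits, step_labels)
loss.backward()
```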
Obstacle avoidance for dynamic movement primitives (DMPs) is still a challenging problem. In our previous work, we proposed a framework for obstacle avoidance based on superquadric potential functions to represent volumes. In this work, we extend that framework to include the velocity of the trajectory in the definition of the potential. Our formulation guarantees smoother behavior than state-of-the-art point-like methods and, being velocity dependent, yields smoother behavior in proximity of the obstacle than a static (i.e., velocity-independent) potential. We validate our framework in a simulated multi-robot scenario and with different real robots: a pick-and-place task for both an industrial manipulator and a surgical robot, to show scalability, and navigation with a mobile robot in a dynamic environment.
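For intuition, one plausible form of such a velocity-dependent potential follows the dynamic point-like potential of Park et al., with the Euclidean distance replaced by a superquadric isopotential \(C(\mathbf{x})\) (this is a hedged reconstruction, not necessarily the paper's exact formulation; \(\lambda\), \(\beta\), and \(\eta\) denote assumed gain and shape parameters):

\[
U(\mathbf{x},\mathbf{v}) =
\begin{cases}
\lambda\,(-\cos\theta)^{\beta}\,\dfrac{\lVert\mathbf{v}\rVert}{C(\mathbf{x})^{\eta}} & \text{if } \theta \in \left(\tfrac{\pi}{2},\pi\right],\\[4pt]
0 & \text{otherwise},
\end{cases}
\qquad
\cos\theta = \frac{\mathbf{v}^{\top}\nabla C(\mathbf{x})}{\lVert\mathbf{v}\rVert\,\lVert\nabla C(\mathbf{x})\rVert}.
\]

The repulsive term \(-\nabla_{\mathbf{x}} U(\mathbf{x},\mathbf{v})\) is added to the DMP acceleration; it vanishes smoothly when the trajectory moves parallel to or away from the obstacle, which is why a velocity-dependent potential behaves more smoothly near the obstacle than a static one.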
This is the first time that an endoscopic tool based on soft materials has been integrated into a surgical robot. The soft endoscopic camera can be easily operated through the da Vinci Research Kit master console, increasing workspace and dexterity without compromising intuitive, user-friendly operation.
In the context of ultrasound (US)-guided breast biopsy, image fusion techniques can be employed to track the position of US-invisible lesions previously identified on a pre-operative image. Such methods have to account for the large anatomical deformations resulting from probe pressure during US scanning while meeting real-time constraints. Although biomechanical models based on the finite element (FE) method represent the preferred approach to model breast behavior, they cannot achieve real-time performance. In this paper, we propose to use deep neural networks to learn the large deformations occurring in US-guided breast biopsy and to provide accurate, real-time prediction of lesion displacement. We train a U-Net architecture on a relatively small amount of synthetic data generated in an offline phase from FE simulations of probe-induced deformations on the breast anatomy of interest. Overall, both training data generation and network training are performed in less than 5 hours, which is clinically acceptable considering that the biopsy can be performed at most the day after the pre-operative scan. The method is tested both on synthetic data and on real data acquired on a realistic breast phantom. Results show that our method correctly learns the deformable behavior modeled via FE simulations and is able to generalize to real data, achieving a target registration error comparable to that of FE models while being about a hundred times faster.
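As an illustration of the kind of network involved, here is a compact 3D U-Net sketch in PyTorch that maps an anatomical grid plus an imposed probe displacement to a per-voxel displacement field. The input/output encodings, channel counts, and grid size are assumptions for illustration, not the authors' architecture.

```python
# Compact 3D U-Net sketch (not the authors' network). Assumed I/O: a
# 4-channel 3D grid (anatomy + probe displacement) in, a 3-channel
# per-voxel displacement field out.
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv3d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv3d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class UNet3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = block(4, 16), block(16, 32)
        self.pool = nn.MaxPool3d(2)
        self.bottleneck = block(32, 64)
        self.up2 = nn.ConvTranspose3d(64, 32, 2, stride=2)
        self.dec2 = block(64, 32)                      # 64 = 32 up + 32 skip
        self.up1 = nn.ConvTranspose3d(32, 16, 2, stride=2)
        self.dec1 = block(32, 16)                      # 32 = 16 up + 16 skip
        self.out = nn.Conv3d(16, 3, 1)                 # 3D displacement/voxel

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.out(d1)

net = UNet3D()
grid = torch.randn(1, 4, 32, 32, 32)                   # synthetic FE-style input
disp = net(grid)                                       # (1, 3, 32, 32, 32)
```

Training such a network on FE-generated pairs amortizes the cost of the simulations: the expensive solver runs offline, while inference at biopsy time is a single forward pass.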
Purpose: Although ultrasound (US) images represent the most popular modality for guiding breast biopsy, malignant regions are often missed by sonography, preventing the accurate lesion localization that is essential for a successful procedure. Biomechanical models can support the localization of suspicious areas identified on a pre-operative image during US scanning, since they are able to account for anatomical deformations resulting from US probe pressure. We propose a deformation model that relies on the position-based dynamics (PBD) approach to predict the displacement of internal targets induced by probe interaction during US acquisition.
Methods: The PBD implementation available in NVIDIA FleX is exploited to create an anatomical model capable of deforming online. Simulation parameters are initialized on a calibration phantom under different levels of probe-induced deformation and then fine-tuned by minimizing the localization error of a US-visible landmark on a realistic breast phantom. The updated model is used to estimate the displacement of other internal lesions due to probe-tissue interaction.
Results: The localization error obtained when applying the PBD model remains below 11 mm for all the tumors, even for input displacements on the order of 30 mm. The proposed method obtains results in line with finite element (FE) models with faster computational performance, making it suitable for real-time applications. In addition, it outperforms the rigid model used to track lesion position in US-guided breast biopsies, at least halving the localization error for all the displacement ranges considered.
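For readers unfamiliar with PBD, the sketch below shows the generic position-based dynamics loop (Mueller et al.): predict positions from velocities, iteratively project constraints, then recover velocities from the corrected positions. This is a didactic NumPy reconstruction of the general algorithm, not NVIDIA FleX's API or the paper's implementation.

```python
# Generic position-based dynamics step with distance constraints
# (Mueller et al.); shown for intuition only, not the FleX implementation.
import numpy as np

def pbd_step(x, v, edges, rest_len, dt=1e-2, iters=10, stiffness=0.9,
             inv_mass=None):
    """One PBD step: predict, project distance constraints, update velocity."""
    n = len(x)
    w = np.ones(n) if inv_mass is None else inv_mass   # inv. mass 0 pins a node
    p = x + dt * v                                     # explicit prediction
    for _ in range(iters):                             # Gauss-Seidel projection
        for (i, j), d0 in zip(edges, rest_len):
            d = p[j] - p[i]
            dist = np.linalg.norm(d)
            if dist < 1e-9 or w[i] + w[j] == 0.0:
                continue
            # Move both particles toward the rest length, weighted by inv. mass.
            corr = stiffness * (dist - d0) / (dist * (w[i] + w[j])) * d
            p[i] += w[i] * corr
            p[j] -= w[j] * corr
    v_new = (p - x) / dt                               # velocities from positions
    return p, v_new

# Toy usage: two particles joined by one distance constraint of rest length 1.
x = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
v = np.zeros_like(x)
x, v = pbd_step(x, v, edges=[(0, 1)], rest_len=[1.0])
```

Because constraints are projected geometrically rather than integrated through stiff forces, PBD stays stable at large time steps, which is what makes online, probe-interactive deformation feasible where FE solvers are too slow.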
Deep reinforcement learning (DRL) is emerging as a promising approach to generate adaptive behaviors for robotic platforms. However, a major drawback of DRL is its data-hungry training regime, which requires millions of trial-and-error attempts, impractical when running experiments on robotic systems. To address this issue, we propose a multi-subtask reinforcement learning method in which complex tasks are decomposed manually into low-level subtasks by leveraging human domain knowledge. These subtasks can be parametrized as expert networks and learned via existing DRL methods. Trained subtasks can then be composed by a high-level choreographer. As a testbed, we use a pick-and-place robotic simulator to demonstrate our methodology, and show that our method outperforms an imitation-learning-based method and reaches a high success rate compared to an end-to-end learning approach. Moreover, we transfer the learned behavior to a different robotic environment, which allows us to exploit sim-to-real transfer and demonstrate the trajectories on a real robotic system. Our training regime is carried out on a central processing unit (CPU)-based system, which demonstrates the data-efficient properties of our approach.
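The composition idea can be sketched as follows: pretrained expert policies, one per subtask, are delegated to by a high-level choreographer that picks which expert acts at each step. All names (the subtask list, network sizes, the argmax selection rule) are hypothetical illustrations, not the paper's API.

```python
# Hedged sketch of subtask composition: expert policies per low-level subtask,
# selected at each step by a high-level choreographer. Names and sizes are
# hypothetical, not the paper's implementation.
import torch
import torch.nn as nn

SUBTASKS = ["reach", "grasp", "place"]

class ExpertPolicy(nn.Module):
    """One low-level subtask policy, trainable with any standard DRL method."""
    def __init__(self, obs_dim=12, act_dim=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                                 nn.Linear(64, act_dim))
    def forward(self, obs):
        return self.net(obs)

class Choreographer(nn.Module):
    """High-level module that decides which expert is active at each step."""
    def __init__(self, obs_dim=12, n_subtasks=len(SUBTASKS)):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                                 nn.Linear(64, n_subtasks))
    def forward(self, obs):
        return torch.argmax(self.net(obs), dim=-1)     # subtask index

experts = {name: ExpertPolicy() for name in SUBTASKS}
choreographer = Choreographer()
obs = torch.randn(1, 12)
active = SUBTASKS[choreographer(obs).item()]           # choose a subtask
action = experts[active](obs)                          # delegate to its expert
```

Training each small expert on a narrow subtask is what keeps the sample budget low enough for CPU-only training, compared to learning the full task end to end.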