Abstract. Multimodal registration is a challenging problem in medical imaging due to the high variability of tissue appearance under different imaging modalities. The crucial component here is the choice of the right similarity measure. We take a step towards a general learning-based solution that can be adapted to specific situations and present a metric based on a convolutional neural network. Our network can be trained from scratch even from a few aligned image pairs. The metric is validated on intersubject deformable registration on a dataset different from the one used for training, demonstrating good generalization. In this task, we outperform mutual information by a significant margin.
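The mutual-information baseline that the learned metric is compared against can be sketched with a joint-histogram estimator. The following is a minimal illustration, assuming NumPy is available; the function name and bin count are arbitrary choices, not taken from the paper:

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Estimate mutual information between two images from their joint histogram.

    A standard similarity baseline for multimodal registration; the learned
    CNN metric in the abstract is reported to outperform it.
    """
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()                  # joint distribution
    px = pxy.sum(axis=1, keepdims=True)      # marginal of a
    py = pxy.sum(axis=0, keepdims=True)      # marginal of b
    nz = pxy > 0                             # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```

Identical images score high (their mutual information equals their entropy), while independent images score near zero, which is what makes this quantity usable as a registration criterion.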
Matching articulated shapes represented by voxel sets reduces to maximal sub-graph isomorphism when each set is described by a weighted graph. Spectral graph theory can be used to map these graphs onto lower-dimensional spaces and match shapes by aligning their embeddings, which are invariant to changes of pose. Classical graph isomorphism schemes that rely on the ordering of the eigenvalues to align the eigenspaces fail when handling large or noisy datasets. We derive a new formulation that finds the best alignment between two congruent K-dimensional sets of points by selecting the best subset of eigenfunctions of the Laplacian matrix. The selection is done by matching eigenfunction signatures built from histograms, and the retained set provides a smart initialization for the alignment problem, with a considerable impact on overall performance. Dense shape matching, cast as graph matching, then reduces to point registration of embeddings under orthogonal transformations; the registration is solved within the framework of unsupervised clustering and the EM algorithm. Maximal subset matching of non-identical shapes is handled by defining an appropriate outlier class. Experimental results on challenging examples show how the algorithm naturally handles changes of topology, shape variations, and different sampling densities.
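The final alignment step, point registration of embeddings under an orthogonal transformation, can be illustrated in its simplest form. The sketch below solves the known-correspondence least-squares case with the classical SVD (Procrustes) solution; the abstract's actual method additionally estimates correspondences with EM and an outlier class. NumPy is assumed and the function name is hypothetical:

```python
import numpy as np

def orthogonal_align(X, Y):
    """Least-squares orthogonal transform R and translation t with y_i ~ R x_i + t,
    assuming known correspondences between the rows of X and Y.

    Classical Procrustes/SVD solution: the simplest instance of registering
    two embeddings under an orthogonal transformation.
    """
    cx, cy = X.mean(axis=0), Y.mean(axis=0)
    H = (Y - cy).T @ (X - cx)        # cross-covariance of centered point sets
    U, _, Vt = np.linalg.svd(H)
    R = U @ Vt                       # optimal orthogonal matrix (may reflect)
    t = cy - R @ cx
    return R, t
```

In the EM setting of the abstract, a step like this would alternate with re-estimating soft point-to-point assignments and an outlier probability, instead of assuming the correspondences up front.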
Abstract. We present a vision-based multisensor designed for robot interaction with small, soft, and possibly fragile objects. The sensor consists of a rubber membrane, a rectangular frame on which the membrane is mounted, and a CCD camera; the entire system is airtight. From the observed deformations of the membrane, we determine the contact area, the integral force acting on the membrane, and the 3D force distribution over the membrane, and we derive properties of the target object by monitoring the evolution of its deformation. We can distinguish between different types of materials (e.g., solid, soft, amorphous) and determine the speed and nature of their deformation. The sensitivity of the sensor can be adjusted by changing the volume of air within the rectangular frame. We achieved a small noise-to-signal ratio, which allows us to observe small integral forces in the range of 0.5 N to 2.5 N with an average error of 0.04 N.
Abstract. Automatic localization of multiple anatomical structures in medical images provides important semantic information with potential benefits for diverse clinical applications. Aiming at organ-specific attenuation correction in PET/MR imaging, we propose an efficient approach for estimating the location and size of multiple anatomical structures in MR scans. Our contribution is three-fold: (1) we apply supervised regression techniques to the problem of anatomy detection and localization in whole-body MR, (2) we adapt random ferns to produce multidimensional regression output and compare them with random regression forests, and (3) we introduce the use of 3D LBP descriptors in multi-channel MR Dixon sequences. The localization accuracy achieved with both fern- and forest-based approaches is evaluated by direct comparison with state-of-the-art atlas-based registration on ground-truth data from 33 patients. Our results demonstrate improved anatomy localization accuracy with higher efficiency and robustness.
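To illustrate how a random fern can produce multidimensional regression output, here is a toy sketch: a fern is a fixed set of binary feature tests whose outcomes index a table of per-bin mean targets. This is a generic illustration with hypothetical names, not the paper's implementation; NumPy is assumed:

```python
import numpy as np

class RegressionFern:
    """Toy random fern with multidimensional regression output.

    depth random (feature, threshold) tests map a sample to one of
    2**depth bins; each bin stores the mean training target it received.
    """
    def __init__(self, depth=4, rng=None):
        self.depth = depth
        self.rng = rng if rng is not None else np.random.default_rng()

    def fit(self, X, Y):
        n, d = X.shape
        self.feats = self.rng.integers(0, d, self.depth)           # tested feature indices
        self.thr = self.rng.uniform(X.min(), X.max(), self.depth)  # random thresholds
        idx = self._bin(X)
        self.mean = Y.mean(axis=0)                                 # fallback for empty bins
        self.table = np.tile(self.mean, (2 ** self.depth, 1))
        for b in range(2 ** self.depth):
            m = idx == b
            if m.any():
                self.table[b] = Y[m].mean(axis=0)
        return self

    def _bin(self, X):
        bits = (X[:, self.feats] > self.thr).astype(int)
        return bits @ (1 << np.arange(self.depth))                 # binary code -> bin index

    def predict(self, X):
        return self.table[self._bin(X)]
```

In practice many such ferns are trained and their outputs averaged, which is what makes the ensemble competitive with regression forests while being cheaper to evaluate.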
Activity recognition has primarily addressed the identification of either actions or well-defined interactions among objects in a scene. In this work, we extend the scope to the study of workflow monitoring. In a workflow, ordered groups of activities (phases) with different durations take place in a constrained environment and create temporal patterns across workflow instances. We address the problem of recognizing phases based on exemplary recordings. We propose to use Workflow-HMMs, a form of HMMs augmented with phase probability variables that model the complete workflow process. This model takes into account the full temporal context, which improves on-line recognition of the phases, especially in the case of partial labeling. Targeted applications are workflow monitoring in hospitals and factories, where common action recognition approaches are difficult to apply. To avoid interfering with the normal workflow, we capture the activity of a room with a multiple-camera system. Additionally, we propose to rely on real-time low-level features (3D motion flow) to maintain a generic approach. We demonstrate our methods on sequences from medical procedures performed in a mock-up operating room. The sequences follow a complex workflow containing various alternatives.
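The on-line phase recognition idea can be illustrated with plain HMM filtering: at each frame, the forward recursion yields a posterior over phases given the observations so far. The sketch below is the standard forward algorithm, not the augmented Workflow-HMM itself; NumPy is assumed and all names are illustrative:

```python
import numpy as np

def forward_filter(pi, A, likelihoods):
    """On-line phase posteriors via the standard HMM forward recursion.

    pi: (K,) initial phase distribution; A: (K, K) transition matrix
    with A[i, j] = p(phase j | phase i); likelihoods: (T, K) observation
    likelihoods per frame and phase. Returns (T, K) filtered posteriors
    p(phase_t | obs_1..t), normalized at every step.
    """
    alphas = []
    a = pi * likelihoods[0]
    a = a / a.sum()
    alphas.append(a)
    for lik in likelihoods[1:]:
        a = (a @ A) * lik        # predict with transitions, then weight by evidence
        a = a / a.sum()
        alphas.append(a)
    return np.array(alphas)
```

With a left-to-right transition matrix this filter naturally tracks progress through ordered phases, which is the setting the abstract targets.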
A key component of the success of deep learning is the availability of massive amounts of training data. Building and annotating large datasets for solving medical image classification problems is today a bottleneck for many applications. Recently, capsule networks were proposed to address shortcomings of convolutional neural networks (ConvNets). In this work, we compare the behavior of capsule networks and ConvNets under dataset constraints typical of medical image analysis, namely small amounts of annotated data and class imbalance. We evaluate our experiments on MNIST, Fashion-MNIST, and publicly available medical datasets (histological and retinal images). Our results suggest that capsule networks can be trained with less data for the same or better performance and are more robust to an imbalanced class distribution, which makes our approach very promising for the medical imaging community.
Scene flow represents the 3-D motion of points in the scene, just as optical flow represents their 2-D motion in the images. As opposed to classical methods, which compute scene flow from optical flow, we propose to compute it by tracking 3-D points and surface elements (surfels) in a multi-camera setup (at least two cameras are needed). Two methods are proposed: in the first, the translation of each 3-D point is found by matching the neighborhoods of its 2-D projections in each camera between two time steps; in the second, the full pose of a surfel is recovered by matching the image of its projection with a texture template attached to the surfel, and visibility changes caused by occlusion or rotation of surfels are handled. Both methods detect lost or untrackable points and surfels. They were designed for real-time execution and can be used for fast extraction of scene flow from multi-camera sequences.
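The neighborhood matching in the first method can be illustrated for a single camera: the patch around a point's 2-D projection at one time step is searched for in the next frame by sum-of-squared-differences. This is a minimal sketch with hypothetical names, NumPy assumed; the abstract combines such per-camera matches to recover the 3-D translation:

```python
import numpy as np

def match_patch(img0, img1, pt, half=3, search=5):
    """Displacement (dy, dx) of the patch around pt between two frames,
    found by exhaustive SSD search over a (2*search+1)^2 window.

    pt is assumed far enough from the border for all windows to fit.
    """
    y, x = pt
    ref = img0[y - half:y + half + 1, x - half:x + half + 1]
    best, best_d = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = img1[y + dy - half:y + dy + half + 1,
                        x + dx - half:x + dx + half + 1]
            ssd = float(((ref - cand) ** 2).sum())
            if ssd < best:
                best, best_d = ssd, (dy, dx)
    return best_d
```

A real-time implementation would replace the exhaustive search with a coarse-to-fine or gradient-based scheme, but the matching criterion is the same.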
Abstract. In this paper we propose an inexact spectral matching algorithm that embeds large graphs in a low-dimensional isometric space spanned by a set of eigenvectors of the graph Laplacian. Given two sets of eigenvectors that correspond to the smallest non-null eigenvalues of the Laplacian matrices of two graphs, we project each graph onto its eigenvectors. We estimate the histograms of these one-dimensional graph projections (eigenvector histograms) and show that these histograms are well suited for selecting a subset of significant eigenvectors, ordering them, resolving the sign ambiguity of eigenvector computation, and aligning two embeddings. This results in an inexact graph matching solution that can be improved using a rigid point registration algorithm. We apply the proposed methodology to match surfaces represented by meshes.
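The eigenvector-histogram signature can be sketched directly: compute the Laplacian eigenvectors, treat each as a one-dimensional projection of the graph, and histogram its entries. Flipping an eigenvector's sign mirrors its histogram, which is why the signatures help resolve the sign ambiguity. A minimal illustration with hypothetical names, NumPy assumed:

```python
import numpy as np

def laplacian_eigs(W, k):
    """First k non-null eigenvectors of the unnormalized graph Laplacian,
    given a symmetric adjacency/weight matrix W."""
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)       # eigenvalues ascending
    return vecs[:, 1:k + 1]           # skip the constant null eigenvector

def eigenvector_histogram(u, bins=16):
    """Normalized histogram signature of a one-dimensional graph projection.
    Negating u reverses the histogram, exposing the sign ambiguity."""
    h, _ = np.histogram(u, bins=bins, range=(-1.0, 1.0))
    return h / h.sum()
```

Two graphs' eigenvectors can then be compared by a histogram distance, testing each candidate against both the other histogram and its mirror image to fix signs before alignment.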