Vessel segmentation of retinal images is a key diagnostic capability in ophthalmology. The problem faces several challenges, including low contrast, variable vessel size and thickness, and the presence of interfering pathology such as microaneurysms and hemorrhages. Early approaches employed hand-crafted filters to capture vessel structures, followed by morphological post-processing. More recently, deep learning techniques have delivered significantly higher segmentation accuracy. We propose a novel domain-enriched deep network that consists of two components: 1) a representation network that learns geometric features specific to retinal images, and 2) a custom-designed, computationally efficient residual task network that uses the features obtained from the representation layer to perform pixel-level segmentation. The representation and task networks are jointly learned for any given training set. To obtain physically meaningful and practically effective representation filters, we propose two new constraints inspired by expected prior structure on these filters: 1) an orientation constraint that promotes geometric diversity of curvilinear features, and 2) a data-adaptive noise regularizer that penalizes false positives. Multi-scale extensions are developed to enable accurate detection of thin vessels. Experiments on three challenging benchmark databases under a variety of training scenarios show that the proposed prior-guided deep network outperforms state-of-the-art alternatives on common evaluation metrics while being more economical in network size and inference time.
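As a concrete illustration, the two filter priors can be expressed as regularization terms added to the segmentation loss. The PyTorch sketch below is a minimal interpretation of that idea; the names (RepresentationNet, orientation_diversity_loss, noise_regularizer) and the exact penalty forms are assumptions for illustration, not the paper's released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepresentationNet(nn.Module):
    """Single conv bank that learns geometric (curvilinear) features."""
    def __init__(self, n_filters=16, k=11):
        super().__init__()
        self.conv = nn.Conv2d(1, n_filters, k, padding=k // 2)

    def forward(self, x):
        return F.relu(self.conv(x))

def orientation_diversity_loss(conv):
    """One plausible orientation constraint: penalize pairwise cosine
    similarity between filters so they spread over diverse orientations."""
    w = F.normalize(conv.weight.flatten(1), dim=1)   # (n_filters, k*k)
    gram = w @ w.t()
    off_diag = gram - torch.eye(w.size(0), device=w.device)
    return off_diag.pow(2).sum()

def noise_regularizer(features, vessel_mask):
    """Data-adaptive penalty on responses over background (non-vessel)
    pixels, discouraging false positives."""
    background = (1.0 - vessel_mask).unsqueeze(1)    # (B, 1, H, W)
    return (features * background).abs().mean()

# Joint-training sketch: the pixel-wise loss of the residual task
# network (omitted here) would be added to these two penalties.
rep = RepresentationNet()
img = torch.rand(2, 1, 64, 64)
mask = (torch.rand(2, 64, 64) > 0.9).float()         # toy vessel mask
feats = rep(img)
reg = 0.1 * orientation_diversity_loss(rep.conv) + noise_regularizer(feats, mask)
```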
There is a worldwide effort to apply 21st-century intelligence to evolving our transportation networks. The goals of smart transportation networks are noble and manifold, including safety, efficiency, law enforcement, energy conservation, and emission reduction. Computer vision is playing a key role in this transportation evolution, with video imaging scientists providing intelligent sensing and processing technologies for a wide variety of applications and services. Many interesting technical challenges remain, including imaging under a variety of environmental and illumination conditions, data overload, recognition and tracking of objects at high speed, distributed network sensing and processing, energy sources, and legal concerns. This paper presents a survey of computer vision techniques related to three key problems in the transportation domain: safety, efficiency, and security and law enforcement. A broad review of the literature is complemented by detailed treatment of a few selected algorithms and systems that the authors believe represent the state of the art. [DOI: 10.1117/1.JEI.22.4.041121]
Many automated driver monitoring technologies have been proposed to enhance vehicle and road safety. Most existing solutions rely on specialized embedded hardware, available primarily in high-end automobiles. This paper explores driver assistance methods that can be implemented on mobile devices such as a consumer smartphone, offering a level of safety enhancement that is more widely accessible. Specifically, the paper focuses on estimating driver gaze direction as an indicator of driver attention. Input video frames from a smartphone camera facing the driver are first processed to estimate a coarse head pose direction. Next, the locations and scales of face parts, namely the mouth, eyes, and nose, define a feature descriptor that is supplied to an SVM gaze classifier, which outputs one of eight common driver gaze directions. A key novel aspect is an in-situ approach for gathering training data that improves generalization performance across drivers, vehicles, smartphones, and capture geometries. Experimental results show that high gaze direction estimation accuracy is achieved across four scenarios with different drivers, vehicles, smartphones, and camera locations.
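For reference, the classification stage can be prototyped with a standard SVM over a face-part descriptor. In the sketch below, synthetic data stands in for the in-situ calibration frames; the 8-way label set and the (x, y, scale)-per-part descriptor layout are illustrative assumptions, not the paper's exact design.

```python
import numpy as np
from sklearn.svm import SVC

# Assumed 8-way gaze label set; the paper's exact classes may differ.
GAZE_CLASSES = ["road", "left mirror", "right mirror", "rear mirror",
                "dashboard", "center console", "lap", "passenger"]

def part_descriptor(parts):
    """Concatenate (x, y, scale) of mouth, left eye, right eye, and nose."""
    return np.concatenate([parts[name] for name in
                           ("mouth", "left_eye", "right_eye", "nose")])

# Synthetic stand-in for descriptors gathered during in-situ calibration:
# 200 frames, 4 parts x 3 values = 12-D descriptors.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 12))
y_train = rng.integers(0, len(GAZE_CLASSES), size=200)

clf = SVC(kernel="rbf", C=10.0, gamma="scale")
clf.fit(X_train, y_train)

# Classify one frame's detected face parts.
parts = {n: rng.normal(size=3) for n in ("mouth", "left_eye", "right_eye", "nose")}
label = GAZE_CLASSES[int(clf.predict(part_descriptor(parts).reshape(1, -1))[0])]
```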
We propose a novel approach to segmenting hand regions in egocentric video that requires no manual labeling of training samples. The user wearing a head-mounted camera is prompted to perform a simple gesture during an initial calibration step. A combination of color and motion analysis that exploits knowledge of the expected gesture is applied to the calibration video frames to automatically label hand pixels in an unsupervised fashion. The hand pixels identified in this manner are used to train a statistical model-based hand detector. Superpixel region growing is then used to refine the segmentation and improve robustness to noise. Experiments show that our hand detection technique, based on the proposed on-the-fly training approach, significantly outperforms state-of-the-art techniques in accuracy and robustness on a variety of challenging videos, primarily because training samples are personalized to a specific user and to the environmental conditions. We also demonstrate the utility of our hand detection technique in informing an adaptive video sampling strategy that improves both the computational speed and the accuracy of egocentric action recognition algorithms. Finally, we offer an egocentric video dataset of an insulin self-injection procedure, with action labels and hand masks, that can support future research on both hand detection and egocentric action recognition.
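To make the on-the-fly training idea concrete, the sketch below fits a color model to the auto-labeled calibration pixels and scores new frames per pixel. A Gaussian mixture over color values is assumed as the "statistical model"; the paper's actual detector and color space may differ.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_hand_model(hand_pixels, n_components=3):
    """Fit a color GMM to pixels auto-labeled as hand during the
    calibration gesture. hand_pixels: (N, 3) array of color values."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="full",
                          random_state=0)
    gmm.fit(hand_pixels)
    return gmm

def hand_likelihood(gmm, frame):
    """Per-pixel hand likelihood map for an (H, W, 3) color frame."""
    h, w, _ = frame.shape
    log_p = gmm.score_samples(frame.reshape(-1, 3).astype(float))
    return np.exp(log_p).reshape(h, w)

# Toy usage with synthetic data standing in for calibration output.
rng = np.random.default_rng(0)
calib_pixels = rng.normal(loc=[120, 150, 110], scale=10, size=(5000, 3))
model = train_hand_model(calib_pixels)
frame = rng.integers(0, 256, size=(48, 64, 3))
prob_map = hand_likelihood(model, frame)   # threshold, then refine with
                                           # superpixel region growing
```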
We introduce a new model for building conditional generative models in a semi-supervised setting, adapting the GAN framework to conditionally generate data given attributes. The proposed semi-supervised GAN (SS-GAN) model uses a pair of stacked discriminators to learn the marginal distribution of the data and the conditional distribution of the attributes given the data, respectively. In the semi-supervised setting, the marginal distribution (which is often harder to learn) is learned from both labeled and unlabeled data, while the conditional distribution is learned purely from the labeled data. Our experimental results demonstrate that this model performs significantly better than existing semi-supervised conditional GAN models.
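A minimal sketch of the stacked-discriminator training signal, assuming simple MLP discriminators over vector data: the marginal discriminator sees labeled plus unlabeled samples, while the conditional discriminator sees labeled (x, y) pairs only. The module names and the BCE loss form are assumptions, not the paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MarginalD(nn.Module):
    """Scores realism of x alone; trained on labeled + unlabeled data."""
    def __init__(self, x_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))

    def forward(self, x):
        return self.net(x)

class ConditionalD(nn.Module):
    """Scores whether attributes y match x; trained on labeled data only."""
    def __init__(self, x_dim, y_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim + y_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=1))

def discriminator_loss(d_m, d_c, x_unlab, x_lab, y_lab, x_fake, y_fake):
    bce = F.binary_cross_entropy_with_logits
    # Marginal term: all real data (labeled + unlabeled) vs. generated fakes.
    real = torch.cat([x_unlab, x_lab])
    s_real, s_fake = d_m(real), d_m(x_fake)
    loss_m = bce(s_real, torch.ones_like(s_real)) + bce(s_fake, torch.zeros_like(s_fake))
    # Conditional term: labeled (x, y) pairs only vs. generated pairs.
    c_real, c_fake = d_c(x_lab, y_lab), d_c(x_fake, y_fake)
    loss_c = bce(c_real, torch.ones_like(c_real)) + bce(c_fake, torch.zeros_like(c_fake))
    return loss_m + loss_c

# Toy shapes: 64-D data, 10-D attribute vectors.
d_m, d_c = MarginalD(64), ConditionalD(64, 10)
x_u, x_l, y_l = torch.randn(32, 64), torch.randn(8, 64), torch.randn(8, 10)
x_f, y_f = torch.randn(16, 64), torch.randn(16, 10)
loss = discriminator_loss(d_m, d_c, x_u, x_l, y_l, x_f, y_f)
```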