We introduce a novel Deep Network architecture that implements the full feature point handling pipeline, that is, detection, orientation estimation, and feature description. While previous works have successfully tackled each of these problems individually, we show how to learn all three in a unified manner while preserving end-to-end differentiability. We then demonstrate that our Deep pipeline outperforms state-of-the-art methods on a number of benchmark datasets, without the need for retraining.
The Visual Object Tracking challenge 2014, VOT2014, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 38 trackers are presented. The number of tested trackers makes VOT2014 the largest benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the appendix. Features of the VOT2014 challenge that go beyond its VOT2013 predecessor are introduced: (i) a new VOT2014 dataset with full annotation of targets by rotated bounding boxes and per-frame attributes, (ii) extensions of the VOT2013 evaluation methodology, (iii) a new unit for tracking speed assessment that is less dependent on the hardware and (iv) the VOT2014 evaluation toolkit that significantly speeds up execution of experiments. The dataset, the evaluation kit as well as the results are publicly available at the challenge website (http://votchallenge.net).
We develop a deep architecture to learn to find good correspondences for wide-baseline stereo. Given a set of putative sparse matches and the camera intrinsics, we train our network in an end-to-end fashion to label the correspondences as inliers or outliers, while simultaneously using them to recover the relative pose, as encoded by the essential matrix. Our architecture is based on a multi-layer perceptron operating on pixel coordinates rather than directly on the image, and is thus simple and small. We introduce a novel normalization technique, called Context Normalization, which allows us to process each data point separately while embedding global information in it, and also makes the network invariant to the order of the correspondences. Our experiments on multiple challenging datasets demonstrate that our method is able to drastically improve the state of the art with little training data.
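The Context Normalization described above can be sketched as a per-channel normalization over the set of correspondences; the function name and array shapes below are illustrative, not the authors' code:

```python
import numpy as np

def context_norm(x, eps=1e-5):
    """Normalize each feature channel across the set of correspondences
    (axis 0). Each point keeps its own value but is expressed relative
    to the statistics of the whole set, embedding global context while
    processing points separately. Because the mean and standard deviation
    are computed over the set, the result is invariant to the order of
    the correspondences.

    x: array of shape (num_correspondences, num_channels)
    """
    mean = x.mean(axis=0, keepdims=True)
    std = x.std(axis=0, keepdims=True)
    return (x - mean) / (std + eps)
```

Permuting the input rows permutes the output rows identically, which is why a network interleaving such layers with per-point perceptrons can treat the putative matches as an unordered set.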
We introduce a learning-based approach to detect repeatable keypoints under drastic imaging changes of weather and lighting conditions, to which state-of-the-art keypoint detectors are surprisingly sensitive. We first identify good keypoint candidates in multiple training images taken from the same viewpoint. We then train a regressor to predict a score map whose maxima are those points, so that they can be found by simple non-maximum suppression. As there are no standard datasets to test the influence of these kinds of changes, we created our own, which we will make publicly available. We show that our method significantly outperforms state-of-the-art methods in such challenging conditions, while still achieving state-of-the-art performance on the standard Oxford dataset, on which it was not trained.
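The detection step described above, extracting the maxima of a predicted score map by simple non-maximum suppression, can be sketched as follows; the window size and score threshold are illustrative assumptions, not values from the paper:

```python
import numpy as np

def nms_keypoints(score_map, window=3, threshold=0.5):
    """Return (row, col) coordinates of pixels that are the maximum of
    their local window and exceed a score threshold. This is the 'simple
    non-maximum suppression' step applied to the regressor's score map."""
    h, w = score_map.shape
    r = window // 2
    # Pad with -inf so border pixels compare only against real scores.
    padded = np.pad(score_map, r, mode="constant", constant_values=-np.inf)
    keypoints = []
    for y in range(h):
        for x in range(w):
            patch = padded[y:y + window, x:x + window]
            if score_map[y, x] >= threshold and score_map[y, x] == patch.max():
                keypoints.append((y, x))
    return keypoints
```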
Detecting moving objects with a mobile camera in real time is a challenging problem due to computational limits and the motion of the camera. In this paper, we propose a method for moving object detection on non-stationary cameras that runs within 5.8 milliseconds (ms) on a PC, and in real time on mobile devices. To achieve real-time capability with satisfactory performance, the proposed method models the background through a dual-mode single Gaussian model (SGM) with age, and compensates for the motion of the camera by mixing neighboring models. Modeling through the dual-mode SGM prevents the background model from being contaminated by foreground pixels, while still allowing the model to adapt to changes in the background. Mixing neighboring models reduces the errors arising from motion compensation, and their influence is further reduced by keeping the age of the model. To further decrease the computational load, the proposed method applies one dual-mode SGM to multiple pixels without performance degradation. Experimental results show the computational lightness and real-time capability of our method on a smartphone, with robust detection performance.
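The dual-mode SGM with age can be sketched for a single pixel (or block) as below. The update rule, the 2.5-sigma gating, and the swap logic are a hedged reconstruction of the idea, not the authors' implementation:

```python
class DualModeSGM:
    """Sketch of a dual-mode single Gaussian background model with age.
    The 'apparent' model (index 0) classifies pixels; observations that
    do not fit it train a 'candidate' model (index 1) instead, so
    foreground pixels cannot contaminate the background model. When the
    candidate outlives the apparent model, the two are swapped, letting
    the model adapt to genuine background changes."""

    INIT_VAR = 20.0 ** 2  # illustrative initial variance

    def __init__(self, init_value):
        self.mean = [float(init_value), float(init_value)]  # [apparent, candidate]
        self.var = [self.INIT_VAR, self.INIT_VAR]
        self.age = [1.0, 0.0]

    def _update(self, i, obs):
        alpha = 1.0 / (self.age[i] + 1.0)   # learning rate decays with age
        diff = obs - self.mean[i]
        self.mean[i] += alpha * diff
        self.var[i] += alpha * (diff * diff - self.var[i])
        self.age[i] += 1.0

    def observe(self, obs, k=2.5):
        """Return True if obs is classified as foreground."""
        if (obs - self.mean[0]) ** 2 < k * k * self.var[0]:
            self._update(0, obs)            # fits the apparent background
            return False
        # Does not fit: train the candidate model with it instead.
        if (obs - self.mean[1]) ** 2 < k * k * self.var[1]:
            self._update(1, obs)
        else:
            self.mean[1], self.var[1], self.age[1] = float(obs), self.INIT_VAR, 1.0
        if self.age[1] > self.age[0]:       # candidate becomes the background
            self.mean[0], self.var[0], self.age[0] = self.mean[1], self.var[1], self.age[1]
            self.mean[1], self.var[1], self.age[1] = float(obs), self.INIT_VAR, 0.0
        return True
```

A stable background value keeps refining the apparent model (its age grows, so its learning rate shrinks), while a sudden foreground value is flagged without disturbing it.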
The Canada-France Imaging Survey (CFIS) will consist of deep, high-resolution r-band imaging over ~5000 square degrees of the sky, representing a first-rate opportunity to identify recently-merged galaxies. Due to the large number of galaxies in CFIS, we investigate the use of a convolutional neural network (CNN) for automated merger classification. Training samples of post-merger and isolated galaxy images are generated from the IllustrisTNG simulation processed with the observational realism code RealSim. The CNN’s overall classification accuracy is 88 percent, remaining stable over a wide range of intrinsic and environmental parameters. We generate a mock galaxy survey from IllustrisTNG in order to explore the expected purity of post-merger samples identified by the CNN. Despite the CNN’s good performance in training, the intrinsic rarity of post-mergers leads to a sample that is only ~6 percent pure when the default decision threshold is used. We investigate trade-offs in purity and completeness with a variable decision threshold and find that we recover the statistical distribution of merger-induced star formation rate enhancements. Finally, the performance of the CNN is compared with both traditional automated methods and human classifiers. The CNN is shown to outperform Gini-M20 and asymmetry methods by an order of magnitude in post-merger sample purity on the mock survey data. Although the CNN outperforms the human classifiers on sample completeness, the purity of the post-merger sample identified by humans is frequently higher, indicating that a hybrid approach to classifications may be an effective solution to merger classifications in large surveys.
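The purity/completeness trade-off with a variable decision threshold can be illustrated with a small helper; the function name and example scores are illustrative, not the paper's code:

```python
import numpy as np

def purity_completeness(scores, labels, threshold):
    """Purity (precision) and completeness (recall) of the post-merger
    sample selected by CNN score >= threshold.

    scores: array of classifier scores in [0, 1]
    labels: array with 1 for true post-mergers, 0 otherwise
    """
    selected = scores >= threshold
    true_pos = np.sum(selected & (labels == 1))
    purity = true_pos / max(np.sum(selected), 1)
    completeness = true_pos / max(np.sum(labels == 1), 1)
    return purity, completeness
```

Sweeping the threshold upward trades completeness for purity, which is why a classifier that is accurate on a balanced training set can still yield a low-purity sample when, as here, the positive class is intrinsically rare.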