3D pose estimation is a key component of many important computer vision tasks such as autonomous navigation and 3D scene understanding. Most state-of-the-art approaches to 3D pose estimation solve this problem as a pose-classification problem in which the pose space is discretized into bins and a CNN classifier is used to predict a pose bin. We argue that the 3D pose space is continuous and propose to solve the pose estimation problem in a CNN regression framework with a suitable representation, data augmentation, and loss function that capture the geometry of the pose space. Experiments on PASCAL3D+ show that the proposed 3D pose regression approach achieves competitive performance compared to the state of the art.
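A loss that "captures the geometry of the pose space", as this abstract describes, can be illustrated with the geodesic distance on the rotation group SO(3). This is a minimal numpy sketch of that idea, not the paper's exact loss; `rot_z` is a hypothetical helper introduced only for the example:

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z-axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def geodesic_distance(R1, R2):
    """Geodesic distance on SO(3): the rotation angle of R1^T R2.
    Unlike a Euclidean loss on angles, it respects the geometry of
    the continuous pose space (e.g. 359 deg is close to 0 deg)."""
    cos_angle = (np.trace(R1.T @ R2) - 1.0) / 2.0
    return np.arccos(np.clip(cos_angle, -1.0, 1.0))  # clip guards rounding

# A 30-degree relative rotation has geodesic distance pi/6.
d = geodesic_distance(rot_z(0.0), rot_z(np.pi / 6))
print(round(d, 4))  # 0.5236
```

In a regression framework, this distance (or a smooth surrogate of it) between the predicted and ground-truth rotations serves as the training loss.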
We study the problem of understanding objects in detail, understood as recognizing a wide array of fine-grained object attributes. To this end, we introduce a dataset of 7,413 airplanes annotated in detail with parts and their attributes, leveraging images donated by airplane spotters and crowdsourcing both the design and the collection of the detailed annotations. We provide a number of insights that should help researchers interested in designing fine-grained datasets for other basic-level categories. We show that the collected data can be used to study the relation between part detection and attribute prediction by diagnosing the performance of classifiers that pool information from different parts of an object. We note that the prediction of certain attributes can benefit substantially from accurate part detection. We also show that, unlike previous results in object detection, employing a large number of part templates can improve detection accuracy at the expense of detection speed. Finally, we propose a coarse-to-fine approach that speeds up detection through a hierarchical cascade algorithm. 1. We already introduced a superset of these aircraft images for FGcomp 2013 [28], but without detailed annotations.
Fracture detection based on image classification has been a challenging research area for several decades, and it has gained renewed attention due to the challenges posed by voluminous image databases. In this work, fusion-based classifiers are constructed that extract features from the images and use these features to train and test the classifiers for detecting fractures in X-ray images. The features extracted are contrast, homogeneity, energy, entropy, mean, variance, standard deviation, correlation, Gabor orientation (GO), Markov random field (MRF), and intensity gradient direction (IGD). Three classifiers are used: a back-propagation neural network (BPNN), a support vector machine (SVM), and a naïve Bayes (NB) classifier. From these features and classifiers, three single classifiers and four multiple (fused) classifiers were developed. All classifiers were tested rigorously on the test dataset to determine the winning combination of classifiers and features that correctly identifies fractures in a bone image. The performance metrics used are sensitivity, specificity, positive predictive value, negative predictive value, accuracy, and execution time. The experimental results show that fusion classifiers enhance detection capability and that the combination of SVM and BPNN produces the best result.
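Several of the texture features this abstract lists (contrast, homogeneity, energy, entropy) are commonly computed from a gray-level co-occurrence matrix (GLCM); the abstract does not give its exact feature definitions, so the following is a sketch under that common assumption, with a made-up 4x4 `patch` and level count for illustration:

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for pixel offset (dx, dy)."""
    P = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            P[img[y, x], img[y + dy, x + dx]] += 1
    return P / P.sum()

def texture_features(P):
    """Contrast, homogeneity, energy, and entropy of a GLCM."""
    i, j = np.indices(P.shape)
    contrast = np.sum(P * (i - j) ** 2)
    homogeneity = np.sum(P / (1.0 + (i - j) ** 2))
    energy = np.sum(P ** 2)
    entropy = -np.sum(P[P > 0] * np.log2(P[P > 0]))
    return contrast, homogeneity, energy, entropy

patch = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [2, 2, 3, 3],
                  [2, 2, 3, 3]], dtype=int)
contrast, homogeneity, energy, entropy = texture_features(glcm(patch, levels=4))
print(round(contrast, 3), round(energy, 3))  # 0.333 0.167
```

In a pipeline like the one described, such feature vectors (possibly concatenated with Gabor, MRF, and gradient-direction features) would be fed to the SVM, BPNN, and NB classifiers.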
Current CNN-based algorithms for recovering the 3D pose of an object in an image assume knowledge about both the object category and its 2D localization in the image. In this paper, we relax one of these constraints and propose to solve the task of joint object category and 3D pose estimation from an image assuming known 2D localization. We design a new architecture for this task composed of a feature network that is shared between subtasks, an object categorization network built on top of the feature network, and a collection of category dependent pose regression networks. We also introduce suitable loss functions and a training method for the new architecture. Experiments on the challenging PASCAL3D+ dataset show state-of-the-art performance in the joint categorization and pose estimation task. Moreover, our performance on the joint task is comparable to the performance of state-of-the-art methods on the simpler 3D pose estimation with known object category task.
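The architecture this abstract describes can be caricatured in plain numpy: a shared feature feeds a categorization head, and the predicted category selects one of several category-dependent pose regression heads. All dimensions and weights below are made-up placeholders (a real feature network would be a CNN producing, e.g., a 2048-d feature), not the paper's actual networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny stand-in dimensions: shared feature size, category count, pose size.
FEAT, N_CAT, POSE_DIM = 8, 3, 3

W_cat = rng.normal(size=(N_CAT, FEAT))             # categorization head
W_pose = rng.normal(size=(N_CAT, POSE_DIM, FEAT))  # one pose head per category

def forward(feature):
    """Shared feature -> category logits; the argmax category then selects
    which category-dependent head regresses the 3D pose."""
    logits = W_cat @ feature
    category = int(np.argmax(logits))
    pose = W_pose[category] @ feature
    return category, pose

category, pose = forward(rng.normal(size=FEAT))
print(category, pose.shape)
```

Training such a model needs a loss that couples both subtasks, e.g. a classification loss on the logits plus a pose regression loss applied to the head of the ground-truth category.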
Sentiment analysis is mainly used to understand the judgment expressed in a text. It faces several major challenges, and irony detection is considered one of the hardest among them. Irony is an unusual way of conveying information in which the intended meaning disagrees with the literal statement, leading to ambiguity. One primary task performed by most developers is data preprocessing, which includes techniques such as lemmatization, tokenization, and stemming. Much research has been done on irony detection using a variety of feature extraction techniques. Machine learning classifiers used in this research include support vector machines (SVM), linear regression, naïve Bayes, random forests, and many more. The results of these works report accuracy, precision, recall, and F-score, which can be used to identify the best-suited model. In this paper, the various methodologies used for irony detection in sentiment analysis are discussed.
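As a toy illustration of the pipeline such surveys describe (tokenization as preprocessing, then a classical classifier such as naïve Bayes), here is a self-contained multinomial naïve Bayes with Laplace smoothing; the four-sentence corpus and its labels are made up for the example:

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Minimal preprocessing: lowercase and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.counts = defaultdict(Counter)  # label -> word frequencies
        self.priors = Counter(labels)       # label -> document count
        for doc, y in zip(docs, labels):
            self.counts[y].update(tokenize(doc))
        self.vocab = {w for c in self.counts.values() for w in c}
        return self

    def predict(self, doc):
        def log_score(y):
            total = sum(self.counts[y].values())
            s = math.log(self.priors[y] / sum(self.priors.values()))
            for w in tokenize(doc):
                s += math.log((self.counts[y][w] + 1) / (total + len(self.vocab)))
            return s
        return max(self.priors, key=log_score)

train = ["oh great another monday just what i needed",
         "wow i love waiting in line for hours",
         "the weather is sunny and pleasant today",
         "this tutorial explains the api clearly"]
labels = ["irony", "irony", "literal", "literal"]
clf = NaiveBayes().fit(train, labels)
print(clf.predict("oh great more waiting just what i needed"))  # irony
```

Real irony detectors add richer features (punctuation, emoji, sentiment contrast) on top of this bag-of-words baseline, which is what makes the task one of the harder problems in sentiment analysis.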
We consider the task of estimating the 3D orientation of an object of known category given an image of the object and a bounding box around it. Recently, CNN-based regression and classification methods have shown significant performance improvements for this task. This paper proposes a new CNN-based approach to monocular orientation estimation that advances the state of the art in four different directions. First, we take into account the Riemannian structure of the orientation space when designing regression losses and nonlinear activation functions. Second, we propose a mixed Riemannian regression and classification framework that better handles the challenging case of nearly symmetric objects. Third, we propose a data augmentation strategy that is specifically designed to capture changes in 3D orientation. Fourth, our approach leads to state-of-the-art results on the PASCAL3D+ dataset.
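One way to "take into account the Riemannian structure of the orientation space", as this abstract puts it, is to map an unconstrained 3-vector to a valid rotation with the SO(3) exponential map (the Rodrigues formula), so the network's output always lies on the rotation manifold. This is a sketch of that general idea, not necessarily the paper's exact activation:

```python
import numpy as np

def expm_so3(v):
    """SO(3) exponential map via the Rodrigues formula: maps an
    unconstrained axis-angle 3-vector to a valid rotation matrix, so it
    can serve as the final nonlinear activation of a regression head."""
    v = np.asarray(v, dtype=float)
    theta = np.linalg.norm(v)      # rotation angle
    if theta < 1e-12:
        return np.eye(3)           # exp(0) is the identity rotation
    k = v / theta                  # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])  # skew-symmetric cross-product matrix
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

R = expm_so3([0.0, 0.0, np.pi / 2])  # 90-degree rotation about the z-axis
print(np.round(R, 3))
```

Pairing such an activation with a geodesic (rotation-angle) loss keeps both the prediction and the training objective on the manifold, which is the regression half of the mixed regression/classification framework the abstract mentions.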