Evolutionary algorithms (EAs) and swarm algorithms (SAs) have shown their usefulness in solving combinatorial and NP-hard optimization problems in various research fields. However, in the field of computer vision, related surveys have not been updated during the last decade. In this study, inspired by the recent development of deep neural networks in computer vision, which embed large-scale optimization problems, we first describe a literature survey conducted to compensate for the lack of relevant research in this area. Specifically, applications related to the genetic algorithm and differential evolution from EAs, as well as particle swarm optimization and ant colony optimization from SAs and their variants, are mainly considered in this survey.
Over the past decade, the development of 3D scanners has led to rapid growth in the number of 3D models, which in turn has increased the demand for effective 3D point cloud retrieval methods that use only unorganized point clouds instead of mesh data. In this paper, we propose a meshing-free framework for point cloud retrieval that exploits a bidirectional similarity measurement on local features. Specifically, we first introduce an effective pipeline for keypoint selection that applies principal component analysis for pose normalization and thresholds the local similarity of normals. Then, a point-cloud-based feature descriptor is employed to compute local feature descriptors directly from point clouds. Finally, we propose a bidirectional feature matching strategy to handle the similarity measure. Experimental evaluation on a publicly available benchmark demonstrates the effectiveness of our framework and shows that it can outperform alternatives involving state-of-the-art techniques. INDEX TERMS Point cloud retrieval, 3D shape retrieval, bidirectional feature match.
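The abstract does not spell out the exact form of the bidirectional similarity measurement, so the following is only a minimal sketch of one common way to build such a measure, assuming each point cloud has already been reduced to a list of local feature descriptors. The function names (`one_way_distance`, `bidirectional_distance`) and the choice of averaging nearest-neighbour Euclidean distances in both directions are illustrative assumptions, not the authors' formulation.

```python
import math
from typing import Sequence

Descriptor = Sequence[float]


def euclidean(u: Descriptor, v: Descriptor) -> float:
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def one_way_distance(src: Sequence[Descriptor], dst: Sequence[Descriptor]) -> float:
    """Mean distance from each descriptor in src to its nearest neighbour in dst."""
    if not src or not dst:
        raise ValueError("both descriptor sets must be non-empty")
    return sum(min(euclidean(s, d) for d in dst) for s in src) / len(src)


def bidirectional_distance(a: Sequence[Descriptor], b: Sequence[Descriptor]) -> float:
    """Average of the two one-way distances (assumed combination rule).

    Matching in both directions makes the score symmetric, so a small
    query that matches only part of a large model is still penalised.
    """
    return 0.5 * (one_way_distance(a, b) + one_way_distance(b, a))


if __name__ == "__main__":
    query = [(0.0, 0.0), (1.0, 0.0)]
    model = [(0.0, 0.0)]
    print(bidirectional_distance(query, model))
```

A retrieval system built on this idea would rank database models by increasing `bidirectional_distance` to the query. The brute-force nearest-neighbour search here is quadratic in the number of descriptors; a spatial index would be the usual replacement at scale.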
In the recent decade, many state-of-the-art algorithms for image classification and audio classification have achieved notable success with the development of deep convolutional neural networks (CNNs). However, most of these works exploit only a single type of training data. In this paper, we present a study on classifying bird species by combining visual (image) and audio (sound) data using CNNs, a topic that has been sparsely treated so far. Specifically, we propose CNN-based multimodal learning models with three types of fusion strategies (early, middle, and late) to address the issues of combining training data across domains. The advantage of our proposed method lies in the fact that we can utilize a CNN not only to extract features from image and audio data (spectrograms) but also to combine the features across modalities. In the experiments, we train and evaluate the network structure on the comprehensive CUB-200-2011 standard data set combined with our originally collected audio data set for the corresponding species. We observe that a model that utilizes the combination of both types of data outperforms models trained with only one type of data. We also show that transfer learning can significantly increase classification performance.
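To make the fusion terminology concrete, here is a minimal sketch of the late-fusion idea only, assuming two already-trained classifiers that each output one logit per species. The function names, the weighted averaging of softmax probabilities, and the `image_weight` parameter are assumptions chosen for illustration; the abstract does not state how the authors' models combine the two branches.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(image_logits: Sequence[float],
                audio_logits: Sequence[float],
                image_weight: float = 0.5) -> List[float]:
    """Combine per-modality predictions after each model has run.

    image_weight is a hypothetical mixing coefficient in [0, 1];
    the audio branch receives the remaining weight.
    """
    if len(image_logits) != len(audio_logits):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")
    p_image = softmax(image_logits)
    p_audio = softmax(audio_logits)
    return [image_weight * pi + (1.0 - image_weight) * pa
            for pi, pa in zip(p_image, p_audio)]


def predict(probabilities: Sequence[float]) -> int:
    """Index of the most probable class."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


if __name__ == "__main__":
    fused = late_fusion([2.0, 0.0, 0.0], [0.0, 3.0, 0.0])
    print(predict(fused), fused)
```

By contrast, early fusion would merge the raw inputs before any network sees them, and middle fusion would merge intermediate feature vectors inside the network; both require joint training, whereas the late-fusion sketch above can reuse independently trained models.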
Constructing an ontology is, in general, a considerably time-consuming process. Since a vast number of thesauri are currently available, exploiting them may be a feasible way to construct an ontology in a short period of time. This paper designs and implements an XTM (XML Topic Maps) code converter that generates an XTM-coded ontology from an object-based thesaurus. The object-based thesaurus is an extended thesaurus that enriches conventional thesauri with user-defined associations, a notion of instances, and occurrences associated with them. We adopt XTM because it is a verified and practical methodology for semantically reorganizing the conceptual structure of extant web applications with minimal effort. Moreover, since XTM is conceptually similar to our object-based thesauri, the recommendation and inference mechanisms already developed in our system can be easily applied to the generated XTM ontology. To show that the XTM ontology is correct, we also verify it with Ontopia Omnigator and Vizigator, components of the Ontopia Knowledge Suite (OKS) tool.
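As a rough illustration of what a thesaurus-to-XTM conversion can look like, the sketch below maps terms to XTM 1.0 `topic` elements and broader/narrower links to `association` elements. The input shape (a dict of term ids to labels plus a list of narrower/broader pairs), the function name `thesaurus_to_xtm`, and the role topic ids `role-narrower` and `role-broader` are all hypothetical; the authors' converter, which also handles instances, occurrences, and user-defined associations, is not described in enough detail here to reproduce.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def _tag(name: str) -> str:
    """Qualified XTM element name."""
    return f"{{{XTM_NS}}}{name}"


def _add_topic(root: ET.Element, topic_id: str, label: str) -> None:
    """Append a topic with a single base name."""
    topic = ET.SubElement(root, _tag("topic"), {"id": topic_id})
    base = ET.SubElement(topic, _tag("baseName"))
    ET.SubElement(base, _tag("baseNameString")).text = label


def _add_topic_ref(parent: ET.Element, topic_id: str) -> None:
    """Append a topicRef pointing at a topic in the same document."""
    ET.SubElement(parent, _tag("topicRef"), {f"{{{XLINK_NS}}}href": f"#{topic_id}"})


def thesaurus_to_xtm(terms: Dict[str, str],
                     broader: List[Tuple[str, str]]) -> str:
    """Serialize terms and (narrower_id, broader_id) pairs as an XTM string."""
    ET.register_namespace("", XTM_NS)
    ET.register_namespace("xlink", XLINK_NS)
    root = ET.Element(_tag("topicMap"))

    # Hypothetical role topics so that every roleSpec reference resolves.
    _add_topic(root, "role-narrower", "narrower term")
    _add_topic(root, "role-broader", "broader term")

    for topic_id, label in terms.items():
        _add_topic(root, topic_id, label)

    for narrower_id, broader_id in broader:
        assoc = ET.SubElement(root, _tag("association"))
        for role, player in (("role-narrower", narrower_id),
                             ("role-broader", broader_id)):
            member = ET.SubElement(assoc, _tag("member"))
            _add_topic_ref(ET.SubElement(member, _tag("roleSpec")), role)
            _add_topic_ref(member, player)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(thesaurus_to_xtm({"bird": "Bird", "sparrow": "Sparrow"},
                           [("sparrow", "bird")]))
```

The output of a sketch like this would still need to be checked in a topic map tool before being treated as valid, which is the role the abstract assigns to Omnigator and Vizigator.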
Authors of the bird species classification study: Bold NARANCHIMEG, Chao ZHANG, and Takuya AKASHI.