Image classification has always been a hot research direction in the world, and the emergence of deep learning has promoted the development of this field. Convolutional neural networks (CNNs) have gradually become the mainstream algorithm for image classification since 2012, and the CNN architecture applied to other visual recognition tasks (such as object detection, object localization, and semantic segmentation) is generally derived from the network architecture in image classification. In the wake of these successes, CNN-based methods have emerged in remote sensing image scene classification and achieved advanced classification accuracy. In this review, which focuses on the application of CNNs to image classification tasks, we cover their development, from their predecessors up to recent state-of-the-art (SOAT) network architectures. Along the way, we analyze (1) the basic structure of artificial neural networks (ANNs) and the basic network layers of CNNs, (2) the classic predecessor network models, (3) the recent SOAT network algorithms, (4) comprehensive comparison of various image classification methods mentioned in this article. Finally, we have also summarized the main analysis and discussion in this article, as well as introduce some of the current trends.
With the rapid development of machine learning, its powerful function in the machine vision field is increasingly reflected. The combination of machine vision and robotics to achieve the same precise and fast grasping as that of humans requires high-precision target detection and recognition, location and reasonable grasp strategy generation, which is the ultimate goal of global researchers and one of the prerequisites for the large-scale application of robots. Traditional machine learning has a long history and good achievements in the field of image processing and robot control. The CNN (convolutional neural network) algorithm realizes training of large-scale image datasets, solves the disadvantages of traditional machine learning in large datasets, and greatly improves accuracy, thereby positioning CNNs as a global research hotspot. However, the increasing difficulty of labeled data acquisition limits their development. Therefore, unsupervised learning, self-supervised learning and reinforcement learning, which are less dependent on labeled data, have also undergone rapid development and achieved good performance in the fields of image processing and robot capture. According to the inherent defects of vision, this paper summarizes the research achievements of tactile feedback in the fields of target recognition and robot grasping and finds that the combination of vision and tactile feedback can improve the success rate and robustness of robot grasping. This paper provides a systematic summary and analysis of the research status of machine vision and tactile feedback in the field of robot grasping and establishes a reasonable reference for future research.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.