2022
DOI: 10.3390/electronics11050696

A Multiview Recognition Method of Predefined Objects for Robot Assembly Using Deep Learning and Its Implementation on an FPGA

Abstract: The process of recognizing manufacturing parts in real time requires fast, accurate, small, and low-power-consumption sensors. Here, we describe a method to extract descriptors from several objects observed from a wide range of angles in a three-dimensional space. These descriptors define the dataset, which allows for the training and further validation of a convolutional neural network. The classification is implemented in reconfigurable hardware in an embedded system with an RGB sensor and the processing uni…

Citations: cited by 4 publications (3 citation statements)
References: 26 publications (33 reference statements)

“…A neural network, fuzzy ARTMAP, conducted the classification stage of the pieces, and the results were highly precise for all combinations. In a more recent application [12], it was applied in a technique to identify objects from several viewing perspectives. A condensed convolutional neural network model, inspired by LENET-5, was employed for the classification phase.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…The proposed RTB-MAXP engine and CMB-MAXP engine were implemented for employment in an FPGA-based CNN accelerator. The target model for the CNN was YOLOv4-CSP-S-Leaky designed for object detection [8]. It consists of 108 layers, including 3 × 3 convolution layers, 1 × 1 convolution layers, residual addition layers, concatenation layers, max-pooling layers, and up-sampling layers.…”
Section: Implementations
Citation type: mentioning
confidence: 99%
“…To minimize computational costs and simplify the model, reducing the size of these feature maps is necessary. The max-pooling technique is employed to achieve this while preserving spatial invariance of distinct features within the feature maps [8,9]. Typically, a window of size 2 × 2 is used in max-pooling operations, ensuring spatial overlap of the maximum values and sampling values along the horizontal and vertical axes every two positions [9].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
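
The 2 × 2 max-pooling step described in the last citation statement can be illustrated with a minimal sketch. It is not taken from the cited paper or from the FPGA engines mentioned above. The function name `max_pool_2x2` and the sample feature map are hypothetical, and the sketch assumes a single-channel map stored as a list of rows, with a window that advances two positions along each axis.

```python
def max_pool_2x2(feature_map):
    """Downsample a 2-D feature map with a 2x2 window and a stride of 2.

    Illustrative assumption: feature_map is a non-empty list of equal-length rows.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    # Step two positions at a time along both axes.
    # A trailing odd row or column is dropped.
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            window = (
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            # Keep only the strongest response inside the window.
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 feature map, reduced to 2x2.
fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 4],
]
print(max_pool_2x2(fmap))  # [[4, 5], [6, 7]]
```

Each output value is the maximum of one 2 × 2 block, so the map shrinks by a factor of two along each axis while the strongest response in each block is kept.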