2021
DOI: 10.3390/electronics10212622

Environmental Sound Recognition on Embedded Systems: From FPGAs to TPUs

Abstract: In recent years, Environmental Sound Recognition (ESR) has become a relevant capability for urban monitoring applications. The techniques for automated sound recognition often rely on machine learning approaches, which have increased in complexity in order to achieve higher accuracy. Nonetheless, such machine learning techniques often have to be deployed on resource and power-constrained embedded devices, which has become a challenge with the adoption of deep learning approaches based on Convolutional Neural N…

Cited by 15 publications (7 citation statements)
References: 35 publications
“…Usually, we first apply knowledge distillation to improve the performance, then apply network pruning [132], [135] or quantization. Or first, apply network pruning, then quantization [88], [90], [95], [96], [100]. In summary, knowledge distillation should be applied first to ensure performance; then network pruning for model compression; quantization should be placed in the last step because the training of the quantized network is challenging.…”
Section: F Relationship Among Three Model Compression Methods (mentioning)
confidence: 99%
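
The ordering described in the statement above (pruning before quantization, with quantization last) can be illustrated with a minimal pure-Python sketch. It is not the cited authors' implementation: the function names `prune_by_magnitude`, `quantize_int8`, `dequantize`, and `compress` are hypothetical, magnitude-based pruning and symmetric 8-bit quantization are assumed as representative techniques, and the knowledge distillation step is omitted because it requires a training loop.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    # Indices sorted by ascending absolute value; the first k are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric uniform quantization to the range [-127, 127].

    Returns the integer codes and the scale needed to dequantize them.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


def compress(weights, sparsity=0.5):
    """Assumed pipeline order: prune first, quantize last."""
    pruned = prune_by_magnitude(weights, sparsity)
    return quantize_int8(pruned)


if __name__ == "__main__":
    codes, scale = compress([0.25, -0.01, 0.02, -1.0], sparsity=0.5)
    print(codes, scale)
    print(dequantize(codes, scale))
```

Pruning first means the quantization scale is computed only from the weights that survive, which is one plausible reason for placing quantization at the end of the pipeline; in practice both steps are usually followed by fine-tuning, which this sketch does not model.
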
“…Various embedded systems have been utilized for this purpose, using both real-time and non-real-time architectures. Relevant examples concern environmental monitoring [159], ecoacoustics (e.g., birds monitoring) [160]-[162] and urban sounds [31]. Smartphones are also increasingly used for similar purposes [163] along with drones [164], [165].…”
Section: B Relevant Work in the IoAuT (mentioning)
confidence: 99%
“…Figure 6 shows the main steps of the YOLOv4-CF + FPGA object recognition model migration and deployment process, which comprise three parts: model format conversion in a TensorFlow environment on the server, the compilation and quantification of a model in a Vitis AI environment, and programming processing of the ZCU104 platform on the embedded FPGA side. In the first part, the Xilinx Vitis AI tool is affected by the technical update iteration speed and version limitation [30] because it does not support the direct quantification of Keras weight files created in the TensorFlow environment. Thus, this study transforms the weight of the Keras model and network architecture file into a binary protobuf file.…”
Section: FPGA Embedded Platform and Recognition Model Migration And D... (mentioning)
confidence: 99%