Accelerating deep learning networks on power-efficient, highly parallel FPGA platforms is an important goal in edge computing. Drawing on deep learning theory, we propose an accelerator design method based on the Winograd algorithm for the YOLO object detection model under the PYNQ architecture. A Zynq FPGA is used to build the hardware acceleration platform for a YOLO network. The Winograd algorithm improves on traditional convolution: on the FPGA, many of the multiplication operations in the YOLO network are replaced with additions, reducing the computational complexity of the model. The original model's data are quantized to a low-bit fixed-point representation, reducing FPGA resource consumption. To optimize memory, a buffer pipeline method is proposed, which further improves the efficiency of the designed accelerator. Experiments show that, compared with accelerating the YOLO model on GPUs and on other FPGA platforms, the proposed method not only optimizes FPGA resource usage but also reduces power consumption to 2.7 W, while the loss in detection accuracy is less than 3%.
Index Terms: FPGA, deep learning, Winograd, YOLO, buffer pipeline.
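As a concrete illustration of how Winograd convolution trades multiplications for additions, the minimal NumPy sketch below implements the 1-D case F(2,3) on toy data (this is an illustration of the technique, not the authors' FPGA implementation): two outputs of a 3-tap convolution are computed with four multiplications instead of six, with all remaining work done as additions inside fixed transform matrices.

```python
import numpy as np

# Winograd F(2,3): two 1-D convolution outputs from a 3-tap filter
# using 4 elementwise multiplications instead of 6 (toy data).
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)   # input transform
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]], dtype=float)   # filter transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)    # output transform

d = np.array([1.0, 2.0, 3.0, 4.0])  # input tile of 4 samples (toy values)
g = np.array([0.5, 1.0, -0.5])      # 3-tap filter (toy values)

U = G @ g    # transformed filter (4 values)
V = BT @ d   # transformed input (4 values)
M = U * V    # the only 4 multiplications
y = AT @ M   # 2 convolution outputs

# Check against the direct sliding-window computation.
ref = np.array([d[i:i + 3] @ g for i in range(2)])
assert np.allclose(y, ref)
print(y)  # [1. 2.]
```

On hardware, the transform matrices contain only 0, ±1, and ±0.5, so applying them costs additions and shifts rather than general multiplications, which is the saving the abstract refers to.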
We propose and demonstrate a compact tunable lens with high transmittance, based on a dielectric elastomer sandwiched between transparent conductive liquids. The transparent conductive liquid not only serves as the refractive material of the tunable lens but also acts as the compliant electrode of the dielectric elastomer. The overall dimensions of the proposed tunable lens are 16 mm in diameter and 10 mm in height, and the optical transmittance is as high as 92.2% over 380–760 nm. The focal power variation of the tunable lens is −23.71 D at an actuation voltage of 3.0 kV. The rise and fall times are 60 ms and 185 ms, respectively. The fabrication process is free of the deposition of opaque compliant electrodes. Such a tunable lens is a promising candidate for various compact imaging systems.
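For context on the unit used here (a standard optics convention, not specific to this paper): focal power in diopters is the reciprocal of focal length in meters, so a power variation is a change in reciprocal focal length (the subscripts below are illustrative labels):

```latex
D = \frac{1}{f}, \qquad
\Delta D = \frac{1}{f_{\mathrm{actuated}}} - \frac{1}{f_{\mathrm{rest}}} .
```

Under this convention, the negative variation reported above means the lens becomes less convergent under actuation.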
With the development of infrared detection technology and growing military remote sensing needs, infrared object detection networks with low false alarm rates and high detection accuracy have become a research focus. However, because infrared images lack texture information, the false detection rate of infrared object detection is high, which reduces detection accuracy. To address these problems, we propose an infrared object detection network named Dual-YOLO, which integrates visible image features. To preserve detection speed, we adopt You Only Look Once v7 (YOLOv7) as the base framework and design dual feature extraction channels for infrared and visible images. In addition, we develop attention fusion and fusion shuffle modules to reduce the detection error caused by redundant fused feature information. Moreover, we introduce Inception and SE modules to enhance the complementary characteristics of infrared and visible images. Furthermore, we design a fusion loss function that makes the network converge quickly during training. The experimental results show that the proposed Dual-YOLO network reaches 71.8% mean Average Precision (mAP) on the DroneVehicle remote sensing dataset and 73.2% mAP on the KAIST pedestrian dataset. The detection accuracy reaches 84.5% on the FLIR dataset. The proposed architecture is expected to find applications in military reconnaissance, unmanned driving, and public safety.
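The abstract does not detail the attention fusion module; purely as a hypothetical sketch of the general pattern it names (squeeze-and-excitation-style channel attention over concatenated infrared and visible features, written in PyTorch; all class and variable names here are illustrative, not from the paper), one might write:

```python
import torch
import torch.nn as nn

class SEFusion(nn.Module):
    """Hypothetical SE-style channel attention fusing two feature maps;
    the actual Dual-YOLO fusion modules may differ."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        fused = 2 * channels
        self.fc = nn.Sequential(                 # excitation MLP
            nn.Linear(fused, fused // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(fused // reduction, fused),
            nn.Sigmoid(),
        )
        self.proj = nn.Conv2d(fused, channels, kernel_size=1)

    def forward(self, ir_feat, vis_feat):
        x = torch.cat([ir_feat, vis_feat], dim=1)   # (N, 2C, H, W)
        w = x.mean(dim=(2, 3))                      # squeeze: global average pool
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)  # per-channel weights
        return self.proj(x * w)                     # reweight, project back to C

# Toy usage with random feature maps.
ir = torch.randn(1, 64, 32, 32)
vis = torch.randn(1, 64, 32, 32)
print(SEFusion(64)(ir, vis).shape)  # torch.Size([1, 64, 32, 32])
```

The channel weights let the network downplay redundant channels from one modality before fusion, which is the stated motivation for the paper's attention fusion design.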
The detection of rotated objects is a meaningful and challenging research problem. Although state-of-the-art deep learning models, especially convolutional neural networks (CNNs), exhibit some feature invariance, their architectures were not specifically designed for rotation invariance; pooling layers compensate for it only slightly. In this study, we propose a novel network, named LPNet, to address object rotation. LPNet improves detection accuracy by incorporating a retina-like log-polar transformation. Furthermore, LPNet is a plug-and-play architecture for object detection and recognition. It consists of two parts, an encoder and a decoder: the encoder extracts image features in log-polar coordinates, while the decoder suppresses image noise in Cartesian coordinates. Moreover, depending on the movement of the center point, LPNet operates in either a stable or a sliding mode. LPNet takes the single-shot multibox detector (SSD) network as the baseline and the visual geometry group network (VGG16) as the feature extraction backbone. The experimental results show that, compared with the conventional SSD network, the mean average precision (mAP) of LPNet increases by 3.4% for regular objects and by 17.6% for rotated objects.
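The retina-like log-polar mapping at the heart of the encoder is a standard image transform; a minimal sketch using OpenCV's warpPolar (an illustration of the transform itself, not the authors' LPNet code) is:

```python
import cv2
import numpy as np

# A rotation about the chosen center in the Cartesian image becomes a
# simple vertical shift in the log-polar image, which is what makes
# downstream CNN features more tolerant to rotation.
img = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)  # stand-in image
center = (img.shape[1] / 2, img.shape[0] / 2)
max_radius = min(center)

# "Encoder" direction: Cartesian -> log-polar.
log_polar = cv2.warpPolar(
    img, (256, 256), center, max_radius,
    cv2.INTER_LINEAR + cv2.WARP_POLAR_LOG)

# "Decoder" direction: log-polar -> Cartesian.
cartesian = cv2.warpPolar(
    log_polar, img.shape[1::-1], center, max_radius,
    cv2.INTER_LINEAR + cv2.WARP_POLAR_LOG + cv2.WARP_INVERSE_MAP)
```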
A varifocal lens is an important part of optical systems, with applications in biomedicine, photography, smartphones, and virtual reality. In this paper, we propose and demonstrate a varifocal liquid lens driven by a conical dielectric elastomer actuator. When an actuation voltage is applied, the conical dielectric elastomer works as an out-of-plane actuator and increases the surface curvature of the liquid droplet, thereby changing the focal length of the proposed varifocal liquid lens. The overall dimensions of the proposed varifocal liquid lens are 9.4 mm in diameter and 12.5 mm in height. The focal length tuning range is from 15.07 mm to 9.50 mm as the actuation voltage increases from 0 kV to 5.0 kV. The focal power variation of the proposed varifocal liquid lens is 35.5 D. The rise and fall times of the proposed varifocal liquid lens are 215 ms and 293 ms, respectively. The ability of the lens to focus on objects at different distances without any moving parts is demonstrated. The compact varifocal liquid lens driven by a conical dielectric elastomer actuator has the potential to be used in various compact imaging systems.
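The curvature-to-power link the abstract describes can be summarized with the standard single refracting surface relation (a textbook approximation, not taken from the paper), where the refractive indices of the liquid and the surrounding air and the droplet's radius of curvature R are the assumed symbols:

```latex
P = \frac{n_{\mathrm{liquid}} - n_{\mathrm{air}}}{R}, \qquad f = \frac{1}{P} .
```

Raising the voltage increases the curvature (smaller R), so the power P grows and the focal length f shortens, consistent with the reported tuning from 15.07 mm down to 9.50 mm.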