Pavement damage is the main factor affecting road performance. Pavement cracking, a common type of road damage, is a key challenge in road maintenance. To achieve accurate crack classification, segmentation, and geometric parameter calculation, this paper proposes a pavement crack identification method based on a deep convolutional neural network fusion model, which combines the advantages of the multitarget single-shot multibox detector (SSD) convolutional neural network model and the U-Net model. First, the crack classification and detection model is applied to classify the cracks and obtain the detection confidence. Next, the crack segmentation network is applied to accurately segment the pavement cracks. Improving the feature extraction structure and optimizing the hyperparameters of the model increased the accuracy of pavement crack classification and segmentation. Finally, the length and width (for linear cracks) and the area (for alligator cracks) are calculated from the segmentation results. Test results show that the recognition accuracy of the method for transverse, longitudinal, and alligator cracks is 86.8%, 87.6%, and 85.5%, respectively. The proposed method provides category information for pavement cracks as well as accurate positioning and geometric parameter information, which can be used directly for evaluating the pavement condition.
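The final step of the abstract above (geometric parameters from a segmentation result) can be illustrated with a minimal sketch. The abstract does not say how length and width are measured, so everything here is an assumption: the function name `crack_geometry`, the `mm_per_pixel` scale, taking the longer side of the bounding box as the crack length, and taking area divided by length as the mean width. A skeleton-based measurement would likely be more accurate for curved cracks.

```python
def crack_geometry(mask, mm_per_pixel=1.0):
    """Estimate area, length, and mean width of one crack from a binary mask.

    mask: list of rows, each a list of 0/1 values (1 = crack pixel).
    mm_per_pixel: hypothetical ground resolution of the image.
    """
    pixels = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    if not pixels:
        return {"area": 0.0, "length": 0.0, "mean_width": 0.0}

    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1   # bounding-box extent in pixels
    width = max(cols) - min(cols) + 1

    # Area: pixel count scaled to physical units (relevant for alligator cracks).
    area = len(pixels) * mm_per_pixel ** 2
    # Length: crude assumption that a linear crack runs along the longer box side.
    length = max(height, width) * mm_per_pixel
    # Mean width: area spread evenly along the assumed length.
    return {"area": area, "length": length, "mean_width": area / length}


if __name__ == "__main__":
    # A one-pixel-thick transverse crack spanning five columns.
    demo = [[0, 0, 0, 0, 0],
            [1, 1, 1, 1, 1],
            [0, 0, 0, 0, 0]]
    print(crack_geometry(demo, mm_per_pixel=2.0))
```

For an alligator crack only the `area` entry would normally be reported; for transverse and longitudinal cracks, `length` and `mean_width` are the relevant values.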
Sorting gangue from raw coal is an essential concern in coal mining engineering. Prior to separation, the location and shape of the gangue should be extracted from the raw coal image. Several approaches to automatic gangue detection have been proposed to date; however, none of them is satisfactory. Therefore, this paper performs gangue segmentation using a U-shaped fully convolutional neural network (U-Net). The proposed network is trained to segment gangue from raw coal images collected under complex environmental conditions. The probability map output by the network is used to obtain the location and shape of the gangue. The network was trained on a dataset of 54 shortwave infrared (SWIR) raw coal images collected from Datong Coalfield. Its performance was tested on six previously unseen images, achieving an average area under the receiver operating characteristic curve (AUROC) of 0.96 and an average intersection over union (IoU) of 0.86. The results show the potential of deep learning methods for gangue segmentation under various conditions.
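As an illustration of the IoU metric reported above, the following sketch thresholds a probability map and compares it with a ground-truth mask. The function name `mask_iou`, the 0.5 threshold, and the convention of returning 1.0 when both masks are empty are assumptions made for this example; the abstract does not state how the probability map was binarized.

```python
def mask_iou(prob_map, truth, threshold=0.5):
    """Intersection over union between a thresholded probability map
    and a binary ground-truth mask (both given as lists of rows)."""
    intersection = 0
    union = 0
    for prob_row, truth_row in zip(prob_map, truth):
        for p, t in zip(prob_row, truth_row):
            predicted = p >= threshold   # assumed binarization rule
            actual = bool(t)
            intersection += predicted and actual
            union += predicted or actual
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


if __name__ == "__main__":
    probs = [[0.9, 0.2],
             [0.7, 0.6]]
    labels = [[1, 0],
              [0, 1]]
    print(mask_iou(probs, labels))
```

Averaging this value over the test images gives a per-dataset figure of the kind quoted in the abstract.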
It is critical for intelligent vehicles to be capable of continuously monitoring the health and well-being of the drivers they transport. This is especially true in the case of autonomous vehicles. To address the issue, an automatic driver’s real emotion recognizer (DRER) is developed using deep learning. The emotional values of drivers inside vehicles are symmetrically mapped to image design in order to investigate the characteristics of abstract expressions and expression design principles, and an experimental evaluation is conducted based on existing research on the design of driver facial expressions for intelligent products. By substituting a custom-created CNN feature-learning block with the base 11-layer CNN model, an improved Faster R-CNN face detector is developed that detects the driver’s face at a high number of frames per second (FPS). Transfer learning is performed on the NasNet-Large CNN model in order to recognize the driver’s various emotions. Additionally, a custom driver emotion recognition image dataset is developed as part of this research. The proposed model, which combines the improved Faster R-CNN with transfer learning on the NasNet-Large CNN architecture, achieves greater accuracy than previously possible for DER based on facial images and outperforms some recent state-of-the-art techniques in terms of accuracy. The proposed model achieved the following accuracy on various benchmark datasets: JAFFE 98.48%, CK+ 99.73%, FER-2013 99.95%, AffectNet 95.28%, and 99.15% on the custom-developed dataset.
Car crashes are among the top ten leading causes of death and can be attributed mainly to distracted drivers. An advanced driver-assistance technique (ADAT) is a procedure that can notify the driver about a dangerous scenario, reduce traffic crashes, and improve road safety. The main contribution of this work is the use of the driver’s attention to build an efficient ADAT. To obtain this “attention value”, a gaze tracking method is proposed. The gaze direction of the driver is critical for discerning fatal distractions and for deciding when the driver must be notified about risks on the road. A real-time gaze tracking system is proposed in this paper for the development of an ADAT that obtains and communicates the gaze information of the driver. The developed ADAT system detects various head poses of the driver and estimates eye gaze directions, which play important roles in assisting the driver and avoiding unwanted circumstances. The first (and more significant) task in this research was the development of a benchmark image dataset consisting of head poses and horizontal and vertical gaze directions of the driver’s eyes. To detect the driver’s face accurately and efficiently, the You Only Look Once (YOLO-V4) face detector was used, modified with the Inception-v3 CNN model for robust feature learning and improved face detection. Finally, transfer learning was performed on the InceptionResNet-v2 CNN model, where the CNN was used as a classification model for head pose detection and, for eye gaze angle estimation, a regression layer was added to the InceptionResNet-v2 CNN in place of the softmax and classification output layers. The proposed model detects and estimates head pose directions and eye directions with higher accuracy. The average accuracy achieved by the head pose detection system was 91%; the model achieved an RMSE of 2.68 for vertical and 3.61 for horizontal eye gaze estimation.
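The RMSE figures quoted above summarize how far the regressed gaze angles fall from the reference angles. A minimal sketch of the metric follows; the function name `rmse` and the sample values are hypothetical, and the abstract does not state the unit of the reported errors.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences of values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical vertical gaze angles: model output vs. annotated reference.
    estimated = [10.0, -4.5, 2.0]
    annotated = [12.0, -5.0, 0.0]
    print(rmse(estimated, annotated))
```

Computing the metric separately for the vertical and horizontal gaze components yields two values, matching the way the abstract reports them.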