Mel-spectrogram features for acoustic vehicle detection and speed estimation

Bulatovic, Nikola; Djukanović, Slobodan

doi:10.1109/it54280.2022.9743540

Cited by 14 publications

(4 citation statements)

References 21 publications


“…The resulting dataset consisted of 22 The neural network architecture is inspired by the 1-dimensional Convolutional Neural Network (1D-CNN) [74] with different audio features extracted from audio clips added to enhance classification [75]. These features include mean Mel Frequency Cepstral Coefficients (MFCCs) [41], Mean Chromagram [44], Mean Mel Spectrogram [40], Mean Spectral Contrast [43] and Mean Tonal Centroid [42]. The features are concatenated into a one-dimensional vector by taking a mean along the time axis for each of the 10-second segments.…”

Section: Audio Neural Network (mentioning)

confidence: 99%
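The concatenation step described in the statement above can be sketched as follows. This is only an illustration under stated assumptions, not the citing authors' code: the feature matrices are synthetic placeholders, the row counts (13, 12, 8, 7, 6) are arbitrary, and the helper names `mean_over_time`, `build_feature_vector` and `fake_matrix` are hypothetical. A real pipeline would compute the matrices from audio with a signal-processing library.

```python
from statistics import fmean

# Assumed layout: each feature is a matrix with one row per coefficient
# and one column per time frame of a single 10-second segment.
FEATURE_ORDER = ("mfcc", "chroma", "mel", "contrast", "tonnetz")


def mean_over_time(matrix):
    """Average every row across its time frames."""
    return [fmean(row) for row in matrix]


def build_feature_vector(features):
    """Concatenate the time-averaged features into one flat vector."""
    vector = []
    for name in FEATURE_ORDER:
        vector.extend(mean_over_time(features[name]))
    return vector


def fake_matrix(n_rows, n_frames):
    """Hypothetical stand-in for a real feature matrix."""
    return [[float(r + t) for t in range(n_frames)] for r in range(n_rows)]


N_FRAMES = 5  # placeholder; real segments have many more frames
segment_features = {
    "mfcc": fake_matrix(13, N_FRAMES),
    "chroma": fake_matrix(12, N_FRAMES),
    "mel": fake_matrix(8, N_FRAMES),
    "contrast": fake_matrix(7, N_FRAMES),
    "tonnetz": fake_matrix(6, N_FRAMES),
}

vector = build_feature_vector(segment_features)
print(len(vector))  # one entry per feature row: 13 + 12 + 8 + 7 + 6
```

Averaging over time discards the temporal ordering inside a segment, which is what makes the result a fixed-length vector regardless of the number of frames.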

“…We then explored the potential of accurately predicting autism from the full audio band of the ADOS assessment. After normalizing the audio recordings and splitting them into 10-second segments, we extracted several acoustic features [40][41][42][43][44] that were then passed through a convolutional neural network (see Figure 2 and Methods). Following standard model hyperparameter tuning procedure, we deployed the 80-20 training validation split.…”

Section: Autism Prediction Using Audio Features (mentioning)

confidence: 99%


Video-Audio Neural Network Ensemble For Comprehensive Screening Of Autism Spectrum Disorder in Young Children

Natraj, Kojovic, Maillart, et al. 2023

Preprint


A timely diagnosis of autism is paramount to allow early therapeutic intervention in preschoolers. Deep Learning (DL) tools have been increasingly used to identify specific autistic symptoms, and offer promises for automated detection of autism at an early age. Here, we leverage a multi-modal approach by combining two neural networks trained on video and audio features of semi-standardized social interactions in a sample of 160 children aged 1 to 5 years old. Our ensemble model performs with an accuracy of 82.5% (F1 score: 0.816, Precision: 0.775, Recall: 0.861) for ASD screening. Additional combinations of our model were developed to achieve higher specificity (92.5%, i.e., few false negatives) or sensitivity (90%, i.e. few false positives). Finally, we found a relationship between the neural network modalities and specific audio versus video ASD characteristics, bringing evidence that our neural network implementation was effective in taking into account different features that are currently standardized under the gold standard ASD assessment.
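The F1 score reported in the abstract can be checked against the reported precision and recall, since F1 is their harmonic mean. A quick sketch (the function name `f1_score` is illustrative, and only the three figures quoted in the abstract are used):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract.
f1 = f1_score(0.775, 0.861)
print(round(f1, 3))  # → 0.816
```

The result agrees with the abstract's F1 value to three decimal places.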


“…MA is predicted in a supervised fashion from the logmel spectrogram (LMS) of input audio. LMS represents a very popular feature in acoustic classification applications [32] and it proved very reliable in vehicle detection and speed estimation [3], [4], [8].…”

Section: B Acoustic Features (mentioning)

confidence: 99%

“…Large volumes of traffic data enable significant improvements in the performance of transportation, traffic safety and automatic traffic monitoring (TM) [1]. The TM data are used for valuable information extraction [2], which may include vehicle count [3], [4], shape [5], [6], speed [7], [8], acceleration [9], type [10], [11], plate number [12] and may be used to predict road accidents [13], [14].…”

Section: Introduction (mentioning)

confidence: 99%

An approach to improving sound-based vehicle speed estimation

Bulatovic, Djukanović 2022

Preprint

Self Cite


We consider improving the performance of a recently proposed sound-based vehicle speed estimation method. In the original method, an intermediate feature, referred to as the modified attenuation (MA), has been proposed for both vehicle detection and speed estimation. The MA feature maximizes at the instant of the vehicle's closest point of approach, which represents a training label extracted from video recording of the vehicle's pass by. In this paper, we show that the original labeling approach is suboptimal and propose a method for label correction. The method is tested on the VS10 dataset, which contains 304 audio-video recordings of ten different vehicles. The results show that the proposed label correction method reduces average speed estimation error from 7.39 km/h to 6.92 km/h. If the speed is discretized into 10 km/h classes, the accuracy of correct class prediction is improved from 53.2% to 53.8%, whereas when tolerance of one class offset is allowed, accuracy is improved from 93.4% to 94.3%.
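The three metrics named in the abstract (average speed error, exact class accuracy, and accuracy with a one-class tolerance) can be illustrated with a small sketch. The speeds below are made-up values, not VS10 data, and the class boundaries at multiples of 10 km/h are an assumption, since the abstract does not state where the classes start. All function names are hypothetical.

```python
def speed_class(speed_kmh, width=10.0):
    """Map a speed to a class index; assumes classes start at multiples of `width`."""
    return int(speed_kmh // width)


def class_accuracy(true_speeds, predicted_speeds, tolerance=0):
    """Fraction of predictions whose class is within `tolerance` classes of the truth."""
    hits = sum(
        abs(speed_class(t) - speed_class(p)) <= tolerance
        for t, p in zip(true_speeds, predicted_speeds)
    )
    return hits / len(true_speeds)


def mean_abs_error(true_speeds, predicted_speeds):
    """Average absolute speed error in km/h."""
    errors = [abs(t - p) for t, p in zip(true_speeds, predicted_speeds)]
    return sum(errors) / len(errors)


# Hypothetical ground-truth and estimated speeds in km/h.
true_speeds = [42.0, 58.0, 71.0, 95.0]
predicted_speeds = [47.5, 61.0, 69.0, 70.0]

mae = mean_abs_error(true_speeds, predicted_speeds)
exact = class_accuracy(true_speeds, predicted_speeds)
within_one = class_accuracy(true_speeds, predicted_speeds, tolerance=1)
print(mae, exact, within_one)  # → 8.875 0.25 0.75
```

In this toy data, an estimate that is off by only 2 or 3 km/h can still land in a neighbouring class, which is why the one-class tolerance figure is much higher than the exact-class figure.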
