Environment Recognition for Digital Audio Forensics Using MPEG-7 and MEL Cepstral Features

Muhammad, Ghulam; Alghathbar, Khalid

doi:10.2478/v10187-011-0032-0

Cited by 13 publications

(4 citation statements)

References 10 publications


“…MFCC coefficients together with prosodic parameters are often used in speaker recognition systems [18] which can also be used to check the de-identification performance. However, because of our previous good experience, different types of speech features comprising basic and supplementary spectral properties complemented with supra-segmental parameters were used in this experiment for GMM creation, training, and classification.…”

Section: Applied Methods Of Voice De-identification (mentioning)

confidence: 99%

Evaluation of speaker de-identification based on voice gender and age conversion

Přibil, Přibilová, Matoušek (2018). Journal of Electrical Engineering


Two basic tasks are covered in this paper. The first consists of the design and practical testing of a new method for voice de-identification that changes the apparent age and/or gender of a speaker by multi-segmental frequency scale transformation combined with prosody modification. The second task is aimed at verifying the applicability of a classifier based on Gaussian mixture models (GMM) to detect the original Czech and Slovak speakers after voice de-identification has been applied. The performed experiments confirm the functionality of the developed gender and age conversion for all selected types of de-identification, which can be objectively evaluated by the GMM-based open-set classifier. The original speaker detection accuracy was also compared for sentences uttered by German and English speakers, showing the language independence of the proposed method.


“…This section investigates the feasibility of MPEG-7 low level audio descriptors (features) in the field of emotion recognition. The main reason of using these descriptors is that they were found to be more efficient than the traditional speech features such as Mel frequency cepstral coefficients (MFCC) and linear predictive cepstral coefficients (LPCC) in many applications such as speaker recognition, environment recognition [32], and musical instrument classification. a) MPEG-7 low level audio descriptor: MPEG-7 features are originally developed for multimedia indexing, which contains both video and audio parts [33].…”

Section: Audio-based Emotion Recognition (mentioning)

confidence: 99%

Audio–Visual Emotion-Aware Cloud Gaming Framework

Hossain, Muhammad, Song, et al. (2015). IEEE Trans. Circuits Syst. Video Technol.

Self Cite


The promising potential and emerging applications of cloud gaming have drawn increasing interest from academia, industry, and the general public. However, providing a high-quality gaming experience in the cloud gaming framework is a challenging task because of the tradeoff between resource consumption and player emotion, which is affected by the game screen. We tackle this problem by leveraging emotion-aware screen effects in the cloud gaming framework and combining them with remote display technology. The first stage in the framework is the learning or training stage, which establishes a relationship between screen features and emotions using Gaussian mixture model (GMM) based classifiers. In the operating stage, a linear programming (LP) model provides appropriate screen changes based on the real-time user emotion obtained in the first stage. Our experiments demonstrate the effectiveness of the proposed framework. The results show that our proposed framework can provide a high-quality gaming experience while generating an acceptable amount of workload for the cloud server in terms of resource consumption.


“…These approaches include: (a) environment-based techniques in which the frequency spectra are forced through the recording environment, (b) device-based techniques in which the frequency spectra are produced by a recording device, and (c) ENF-based techniques in which the frequency spectra are generated by the power source of the recording device [3]. Although advanced research has been conducted on ENF-based techniques [4], [5] and environment-based techniques [6], [7], few have explored the application of device-based techniques in real-time forensics [1], [8]. Device-based techniques are based on blind source camera identification in image forensics [9], [10], [11].…”

Section: Introduction (mentioning)

confidence: 99%

Blind identification of source mobile devices using VoIP calls

Jahanirad, Wahab, Anuar, et al. (2014). 2014 IEEE Region 10 Symposium


Sources such as speakers and environments from different communication devices produce signal variations that result in interference generated by different communication devices. Despite these convolutions, signal variations produced by different mobile devices leave intrinsic fingerprints on recorded calls, thus allowing the tracking of the models and brands of engaged mobile devices. This study aims to investigate the use of recorded Voice over Internet Protocol calls in the blind identification of source mobile devices. The proposed scheme employs a combination of entropy and mel-frequency cepstrum coefficients to extract the intrinsic features of mobile devices and analyzes these features with a multi-class support vector machine classifier. The experimental results lead to an accurate identification of 10 source mobile devices with an average accuracy of 99.72%.

