Investigation of Spoken-Language Detection and Classification in Broadcasted Audio Content

Kotsakis, Rigas; Matsiola, María; Kalliris, George; Dimoulas, Charalampos

doi:10.3390/info11040211

Cited by 8 publications

(8 citation statements)

References 31 publications

(68 reference statements)


“…How to use free and open-source software from and through the Internet to quickly and easily create sound/audio media spots (see Kotsakis et al 2020; Vryzas et al 2020; Kalliris et al 2019; Matsiola et al 2019; Nicolaou et al 2019; Tsipas, Nikolaos, Lazaros Vrysis, Charalampos Dimoulas, and George Papanikolaou, 2015; Kotsakis et al 2012a; Kotsakis et al 2012b; Aguayo Gonzalez et al 2009; Salmon et al 2008; Dimoulas et al 2000).…”

Section: Lesson Plan (mentioning)

confidence: 99%

Media Studies, Audiovisual Media Communications, and Generations: The Case of Budding Journalists in Radio Courses in Greece

Nicolaou, Matsiola, Karypidou, et al. 2021

Journalism and Media

Self Cite


In this article, the quality of media studies education through effective teaching utilizing audiovisual media technologies and audiovisual content (audiovisual media communications) to budding journalists as adult learners (18 years and older) is researched, with results primarily intended for application in radio lessons at all educational levels and disciplines (including adult education). Nowadays, audiovisual media communications play an important role in the modern and visual-centric way of our life, while they require all of us to possess multiple-multimodal skills to have a successful professional practice and career, and especially those who study media studies, such as tomorrow’s new journalists. Data were collected after three interactive teachings with emphasis on educational effectiveness in technology-enhanced learning, through a specially designed written questionnaire with a qualitative and quantitative form (evaluation form), as case study experiments that applied qualitative action research with quasi-experiments. The results (a) confirmed (i) the theory of audiovisual media in education, as well as (ii) the genealogical characteristics and habits of budding journalists as highlighted in basic generational theory, something which appears to be in agreement with findings of previous studies and research; and (b) showed that (i) teaching methodology and educational techniques aimed primarily at adult learners in adult education kept the interest and attention of the budding journalists through the use of such specific educational communication tools as audiovisual media technologies, as well as (ii) sound/audio media, as audiovisual content may hold a significant part in a lecture.


“…Motivated by the results of a previous research on program-adaptive pattern analysis for Voice/Music/Phone [1] and Language Discrimination taxonomies [6], the presented methodology functions as an add-on module towards the formulation of a dynamic Generic Audio Classification Repository. Hence, following already adopted hierarchical classification strategies, new schemes were adapted based on clustering techniques, but also their combination with supervised training methods.…”

Section: Background Work and Problem Definition (mentioning)

confidence: 99%

“…However, there are issues regarding the inhomogeneity of labeling meta-data, while in some cases, ground-truth training pairs are difficult to obtain (or are even completely unavailable). Hence, combinations of supervised, semi-supervised and unsupervised data mining algorithms are utilized to serve the specific necessities of various real-world multimedia semantics [1,4–9].…”

Section: Introduction (mentioning)

confidence: 99%

Extending Radio Broadcasting Semantics through Adaptive Audio Segmentation Automations

Kotsakis and Dimoulas 2022

Knowledge

Self Cite


The present paper focuses on adaptive audio detection, segmentation and classification techniques in audio broadcasting content, dedicated mainly to voice data. The suggested framework addresses a real case scenario encountered in media services and especially radio streams, aiming to fulfill diverse (semi-) automated indexing/annotation and management necessities. In this context, aggregated radio content is collected, featuring small input datasets, which are utilized for adaptive classification experiments, without searching, at this point, for a generic pattern recognition solution. Hierarchical and hybrid taxonomies are proposed, firstly to discriminate voice data in radio streams and thereafter to detect single speaker voices, and when this is the case, the experiments proceed into a final layer of gender classification. It is worth mentioning that stand-alone and combined supervised and clustering techniques are tested along with multivariate window tuning, towards the extraction of meaningful results based on overall and partial performance rates. Furthermore, the current work via data augmentation mechanisms contributes to the formulation of a dynamic Generic Audio Classification Repository to be subjected, in the future, to adaptive multilabel experimentation with more sophisticated techniques, such as deep architectures.


“…The first LID was designed and explored with acoustic features to build a dictionary of words using speech sound. Prosody features also play a vital role in recognizing language from speech signal [1,2]. Prosodic features like pitch, energy, stress are different in tonal language compared to non-tonal languages [3,4].…”

Section: Literature Survey (mentioning)

confidence: 99%

“…It is the task to recognize the language of utterance without knowing the details of speaker and language content. It identifies the languages of speech utterance based on only raw signal of speech utterance [1].…”

Section: Introduction (mentioning)

confidence: 99%

HMM Based Language Identification from Speech Utterances of Popular Indic Languages Using Spectral and Prosodic Features

Sadanandam 2021


A language identification (LID) system automatically recognises the language of a short, unknown human utterance. It extracts discriminative features and reports the language to which the utterance belongs. In this paper, we consider concatenated feature vectors of Mel Frequency Cepstral Coefficients (MFCC) and pitch for designing the LID system. We build one reference model per language from 14-dimensional feature vectors using a Hidden Markov Model (HMM), then evaluate each test sample against the reference models of all listed languages. The likelihood of the test sample's feature vectors under each model decides the language of the unknown utterance. We consider seven Indian languages in the experimental setup and evaluate the performance of the system. The average performance of the system is 89.31% and 90.63% for three-state and four-state HMMs, respectively, on 3 s test utterances; the system also gives significant results with 3 s test speech for the four-state HMM even though the procedure is simple.
