Automatic Speech Recognition: Systematic Literature Review

Alharbi, Sadeen; Al-Razgan, Muna; Alrashed, Alanoud; Alnomasi, Turkiayh; Almojel, Raghad; Alharbi, Rimah; Alharbi, Saja; Alturki, Sahar; Alshehri, Fatimah; Almojil, Maha

doi:10.1109/access.2021.3112535

Cited by 60 publications

(34 citation statements)

References 91 publications

(129 reference statements)


“…If participants are collecting language using at-home methods, there will likely be additional sources of noise (e.g., additional voices, background TV, movement) ( 131 ), which affects transcription accuracy and acoustic feature calculation. These methods will require participant training to minimize these effects, as well as various pre-processing software to filter out these interferences ( 132 – 134 ). In the absence of paid-for transcription services, researchers can also utilize Automatic-Speech-Recognition (ASR) to transcribe spoken language into text for easier data processing ( 135 ).…”

Section: State-of-the-art In Pain Methods (mentioning)

confidence: 99%

Assessing Pain Research: A Narrative Review of Emerging Pain Methods, Their Technosocial Implications, and Opportunities for Multidisciplinary Approaches

Berger

Baria

2022

Front. Pain Res.


Pain research traverses many disciplines and methodologies. Yet, despite our understanding and field-wide acceptance of the multifactorial essence of pain as a sensory perception, emotional experience, and biopsychosocial condition, pain scientists and practitioners often remain siloed within their domain expertise and associated techniques. The context in which the field finds itself today—with increasing reliance on digital technologies, an on-going pandemic, and continued disparities in pain care—requires new collaborations and different approaches to measuring pain. Here, we review the state-of-the-art in human pain research, summarizing emerging practices and cutting-edge techniques across multiple methods and technologies. For each, we outline foreseeable technosocial considerations, reflecting on implications for standards of care, pain management, research, and societal impact. Through overviewing alternative data sources and varied ways of measuring pain and by reflecting on the concerns, limitations, and challenges facing the field, we hope to create critical dialogues, inspire more collaborations, and foster new ideas for future pain research methods.


“…Vocabulary, pronunciation, and dialect are the main techniques on which NLP focuses, so it emphasizes the number of words that should be included in the vocabulary, the problems that mispronounced words can cause, and the problem of recognizing the dialects of the different regions where the implemented language is spoken. Finally, the use of the microphone is also analyzed, since it is the device that captures the voice and has a radical effect when training and testing on data with and without it (Alharbi et al., 2021).…”

Section: Related Work (unclassified)

“…Automatic speech recognition is one of the applications in the field of NLP (Ankit et al., 2016). Its fundamental goal is the transcription of speech, based on sequences of words represented by audio waveforms (Alharbi et al., 2021). A conversation can commonly take place between human actors and artificial agents, where the nature of the discourse, the vocabulary size, and the bandwidth are relevant and essential aspects when training an Automatic Speech Recognition (ASR) system (Alharbi et al., 2021). In addition, ASR considers aspects of natural language such as semantics, syntax, grammar, and phonetics, given the variety of speech sounds that humans can produce, including rhythm, accent, dialectal pronunciation, the particular intonations that give a word one meaning or another, and even the various mispronunciations of certain phonemes, such as rhotacism (Aguiar de Lima and Da Costa-Abreu, 2020).…”

unclassified

Reconocimiento del habla con acento español basado en un modelo acústico

et al. 2022


The aim of the article was to build an automatic speech recognition (ASR) model based on translating the human voice into text, which is considered one of the branches of artificial intelligence. Speech analysis makes it possible to identify information about the acoustics, phonetics, syntax, and semantics of words, among other elements that can reveal ambiguous terms, pronunciation errors, and similar syntax with different semantics, all of which are characteristic of human language. The model focused on the acoustic analysis of words, proposing a methodology for acoustic recognition from speech transcriptions of audio recordings containing human voice, and the word error rate was used to assess the model's accuracy. The recordings are emergency calls logged by the Servicio Integrado de Seguridad ECU911. The model was trained offline with the CMUSphinx toolkit for Spanish. The results showed that the word error rate varies with the number of recordings; that is, the more recordings, the fewer erroneous words and the more accurate the model. The study concluded by emphasizing the duration of each recording as a variable that affects the model's accuracy.
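The word error rate mentioned in this abstract is conventionally computed as the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The sketch below illustrates that standard definition only; it is not the evaluation code used in the cited study, and the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is a plain
    whitespace split, which real evaluations usually refine.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, comparing the reference `"a b c d"` with the hypothesis `"a x c"` counts one substitution and one deletion over four reference words, giving a rate of 0.5. Because insertions are counted, the rate can exceed 1.0 when the hypothesis is much longer than the reference.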


“…Speech is the major communication technique between people, it is necessary to comprehend, learn to read or write. Speech technology now enables robots to react quickly and correctly by using the voices of people instead of keyboards [2], [4].…”

Section: Introduction (mentioning)

confidence: 99%

Towards developing impairments arabic speech dataset using deep learning

Shareef

Al-Irhayim

2022

IJEECS


The effective and efficient recognition of speech sounds errors for impaired children is important if a defective phonological process is early detected and corrected. This study deals with the topic of classification of speech sound errors in Arabic impairments children when Arabic letters and numbers are incorrectly pronounced. For 18 standard Arabic isolated numerals and characters, we created an impaired children speech recognition system. We utilized the Mel frequency cepstral coefficients throughout the feature extraction step. then deep long short-term memory network recognition phase. We used the developed model with the developed dataset and the classification accuracy was 97.99% and lose 0.18%, additionally, the results have been compared and yielded extremely intriguing results with previously existing recognition rates models.
