Interspeech 2018
DOI: 10.21437/interspeech.2018-1096
Punctuation Prediction Model for Conversational Speech

Abstract: Avaya Conversational Intelligence™ (ACI) is an end-to-end, cloud-based solution for real-time Spoken Language Understanding for call centers. It combines large-vocabulary, real-time speech recognition, transcript refinement, and entity and intent recognition in order to convert live audio into a rich, actionable stream of structured events. These events can be further leveraged with a business rules engine, thus serving as a foundation for real-time supervision and assistance applications. After the ingestion, …
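As a rough illustration of the "stream of structured events" mentioned in the abstract, one such event could be modeled as the record below; the field names and types are hypothetical assumptions and not the actual ACI schema:

```python
# Hypothetical shape of a single structured event in the stream described above.
# Field names and types are illustrative assumptions, not the actual ACI schema.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TranscriptEvent:
    call_id: str                 # which live call the event belongs to
    start_ms: int                # audio offset where the utterance begins
    end_ms: int                  # audio offset where it ends
    speaker: str                 # e.g. "agent" or "customer"
    text: str                    # punctuated, refined transcript segment
    entities: List[str] = field(default_factory=list)  # recognized entities
    intent: Optional[str] = None                        # recognized intent, if any
```

A business rules engine could then match on fields such as `intent` or `entities` to trigger real-time supervision or assistance actions.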

Cited by 39 publications (23 citation statements)
References 12 publications
“…In terms of machine learning models, the conditional random field (CRF) has been widely used in earlier studies (Lu and Ng, 2010; Zhang et al., 2013). Lately, deep learning models such as Long Short-Term Memory (LSTM) networks, Convolutional Neural Networks (CNN), and transformers have also been used for this task (Che et al., 2016b; Gale and Parthasarathy, 2017; Zelasko et al., 2018; Wang et al., 2018).…”
Section: Introduction
confidence: 99%
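The survey above frames punctuation restoration as word-level sequence labeling; a minimal sketch of such a tagger, assuming PyTorch and a BiLSTM with illustrative vocabulary, label set, and dimensions (the cited papers use their own architectures and features):

```python
# Minimal BiLSTM sequence-labeling sketch for punctuation prediction.
# Vocabulary size, label set, and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

PUNCT_LABELS = ["O", "COMMA", "PERIOD", "QUESTION_MARK"]  # assumed label set

class BiLSTMPunctuator(nn.Module):
    def __init__(self, vocab_size=30000, emb_dim=128, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, len(PUNCT_LABELS))

    def forward(self, token_ids):            # token_ids: (batch, seq_len)
        hidden_states, _ = self.lstm(self.emb(token_ids))
        return self.proj(hidden_states)      # (batch, seq_len, num_labels)

# Usage: one punctuation decision per input word.
# logits = BiLSTMPunctuator()(torch.randint(1, 30000, (2, 20)))
```

A CRF layer or a transformer encoder can replace the BiLSTM without changing the overall tag-per-word framing.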
“…The ones with plain text outperform the ones with encoded text. To explore the impact of the min_words_cut value on the quality of the result, we performed the experiment on the sequence-to-sequence LSTM model with an overlap of 15 words and min_words_cut ranging from 0 to 15. The outcome shown in Figure 5 indicates that F1-scores peak in the middle range of chunk size (4–10). This demonstrates that predictions of uppercase and lowercase are stable and independent of min_words_cut.…”
Section: Evaluation on Plain-Text Model and Encoded-Text Model
confidence: 82%
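The quoted experiment slides overlapping windows over the transcript and uses min_words_cut when stitching per-chunk predictions back together; the sketch below shows one plausible chunk-and-merge scheme, with the window size and the exact semantics of overlap and min_words_cut assumed rather than taken from the cited paper:

```python
# Hypothetical chunk-and-merge inference over overlapping windows.
# Window size, overlap, and the exact meaning of min_words_cut are assumptions.
def chunked_predict(tokens, predict_fn, window=50, overlap=15, min_words_cut=7):
    """predict_fn maps a list of tokens to one label per token."""
    assert min_words_cut <= overlap < window
    step = window - overlap
    merged, start = [], 0
    while True:
        chunk = tokens[start:start + window]
        labels = predict_fn(chunk)
        last = start + window >= len(tokens)
        # Each window "owns" the region starting min_words_cut words into it
        # (except the first) and ending min_words_cut words into the next
        # window (except the last), so edge predictions with little context
        # are discarded and every token is labeled exactly once.
        lo = 0 if start == 0 else min_words_cut
        hi = len(chunk) if last else step + min_words_cut
        merged.extend(zip(chunk[lo:hi], labels[lo:hi]))
        if last:
            return merged
        start += step
```

Sweeping min_words_cut from 0 up to the overlap reproduces the kind of hyper-parameter study described in the quote.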
“…As we can see with the help of Figure 5 (F1-score for different values of min_words_cut), the score peaks in the middle range of overlap size (4–10). Predictions of uppercase and lowercase are stable and independent of min_words_cut, whereas the question mark is quite sensitive to this hyper-parameter.…”
Section: Evaluation Metric
confidence: 99%
“…Although it was originally developed for the alignment of amino acid sequences in proteins, the fact that the type of symbol being aligned is irrelevant has allowed it to be used in multiple fields, among others the location of similarities between sequences of words. The Needleman–Wunsch algorithm is used today in different domains: as a tool in a punctuation prediction model for conversational speech [33], as a support technique for large-scale computerized text analysis in political science [34], and to help with automatic corpus creation for Wikipedia [35].…”
Section: Literature Review
confidence: 99%
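For context, the Needleman–Wunsch algorithm referenced above is a global dynamic-programming alignment; a minimal word-level sketch follows (the match/mismatch/gap scores are illustrative defaults, not values from any of the cited papers):

```python
# Minimal Needleman–Wunsch global alignment over two word sequences.
# Scoring values are illustrative, not taken from the cited work.
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Traceback: recover one optimal alignment as (word_or_None, word_or_None) pairs.
    aligned, i, j = [], n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            aligned.append((a[i - 1], b[j - 1])); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned.append((a[i - 1], None)); i -= 1
        else:
            aligned.append((None, b[j - 1])); j -= 1
    return list(reversed(aligned))

# Usage, e.g. aligning two word sequences:
# needleman_wunsch("so how are you".split(), "how old are you".split())
```

This kind of word-level alignment can, for example, be used to match an automatic transcript against a punctuated reference text.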
“…Under those conditions, all programs should be delayed by the 20 s that the authors have proposed. In [33] a delay of 15 s was proposed, but it has been increased to 20 s in order to cover most cases without delaying the broadcast excessively. This solution would allow standard tuners to be used, without the adaptation described in the following paragraph, but there is a certain reticence on the part of broadcasters to implement this type of audio-visual manipulation.…”
Section: Erase Time Calculation
confidence: 99%