2016
DOI: 10.3390/info7020032

Speech Compression

Abstract: Speech compression is a key technology underlying digital cellular communications, VoIP, voicemail, and voice response systems. We trace the evolution of speech coding based on the linear prediction model, highlight the key milestones in speech coding, and outline the structures of the most important speech coding standards. Current challenges, future research directions, fundamental limits on performance, and the critical open problem of speech coding for emergency first responders are all discussed.

Cited by 29 publications (28 citation statements)
References 39 publications
“…AMR is also the codec (more specifically, its wideband version) that is planned for use in U.S. next generation emergency first responder communication systems [22]. For more information on these codecs, the speech coding techniques, and the cellular applications, see Gibson [23].…”
Section: Results
Citation type: mentioning
confidence: 99%
“…The quality of the spectral match in Figure 3 would appear to be unsatisfactory and so the statistically significant threshold is also somewhat validated; however, it would be useful if something more interpretive or carrying more of a physical implication could be concluded for this D value. Motivated by the variation in the values of the log likelihood ratio across frames, we calculate the percentage of frames that fall below the statistically significant threshold, in between the statistically significant and perceptually significant thresholds, and above the perceptually significant threshold for each G.726 bit rate for the sentence "We were away a year ago" and for the sentence "A lathe is a big tool", and list these values in Tables 1 and 2, respectively, along with the corresponding signal-to-noise ratios in dB and the PESQ-MOS values [23].…”
Section: G726: Adaptive Differential Pulse Code Modulation
Citation type: mentioning
confidence: 99%