1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258) 1999
DOI: 10.1109/icassp.1999.759820

Improving perceptual coding of narrowband audio signals at low rates

Abstract: New applications such as Internet broadcast and communications, consumer multimedia products, digital AM broadcast and satellite networks are emerging. Those applications require moderate audio quality without annoying artifacts at bit rates below 16 kbit/s. Although speech coders provide high speech quality at bit rates around 8 kbit/s, they perform poorly when encoding audio signals. In this thesis, we present a novel transform coding paradigm based on the characteristics of the human hearing system. The pro…

Citations: cited by 18 publications (13 citation statements)
References: 45 publications
“…The psychoacoustic model is taken from [2] and [9] with minor modifications and simplifications. The spreading function and the prediction to find the tonality factor were derived from [17] and applied to the MDCT coefficients as described in the cited reference. For the test set, eight audio files of sampling rate 44.1 kHz were taken from the EBU SQAM [32] database, which included tonal signals, castanets, two singing files and two speech files.…”
Section: Simulation Results (mentioning)
confidence: 99%
“…Hence, the NMR values obtained from the various critical bands are combined into a scalar distortion metric. Two common metrics are: ANMR, which is the NMR averaged over all the critical bands in the frame, and MNMR, which is the maximum NMR of all the critical bands in a frame [17], [18].…”
Section: A. Objective Measures in Audio Coding (mentioning)
confidence: 99%
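
The two scalar metrics named in the citation statement above can be illustrated with a minimal Python sketch. It is not taken from the cited paper: the function names, the input layout (one noise energy and one masking threshold per critical band, both in linear power units) and the choice to average the linear ratios before converting to dB are all assumptions made for illustration. Some definitions average the per-band dB values instead.

```python
import math


def band_nmr(noise_energy, mask_threshold):
    """Return the linear noise-to-mask ratio for each critical band.

    Both arguments are hypothetical inputs: sequences of positive
    per-band values for a single frame, in linear power units.
    """
    if len(noise_energy) != len(mask_threshold):
        raise ValueError("inputs must have one value per critical band")
    if not noise_energy:
        raise ValueError("at least one critical band is required")
    if any(n <= 0 or m <= 0 for n, m in zip(noise_energy, mask_threshold)):
        raise ValueError("energies and thresholds must be positive")
    return [n / m for n, m in zip(noise_energy, mask_threshold)]


def anmr_db(noise_energy, mask_threshold):
    """ANMR: NMR averaged over all critical bands in the frame, in dB.

    Assumption: the average is taken over linear ratios, then
    converted to decibels.
    """
    ratios = band_nmr(noise_energy, mask_threshold)
    return 10.0 * math.log10(sum(ratios) / len(ratios))


def mnmr_db(noise_energy, mask_threshold):
    """MNMR: maximum NMR over all critical bands in the frame, in dB."""
    return 10.0 * math.log10(max(band_nmr(noise_energy, mask_threshold)))


if __name__ == "__main__":
    # Made-up values for a three-band frame, for illustration only.
    noise = [1.0, 0.1, 0.01]
    mask = [1.0, 1.0, 1.0]
    print("ANMR (dB):", anmr_db(noise, mask))
    print("MNMR (dB):", mnmr_db(noise, mask))
```

Because a mean never exceeds a maximum, ANMR is at most MNMR for any frame under this definition. A negative value in dB means the noise lies below the masking threshold on that measure.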