Automatically deriving categories for translation

Kawahara, Hideki; Katayose, Haruhiro; Cheveigné, Alain de; Patterson, Roy D.

doi:10.21437/eurospeech.1999-613

Cited by 100 publications

(15 citation statements)

References 3 publications


“…First, the code-books for average log F0 and log energy were created. All log F0 measurements of the training set were extracted using the TEMPO method of Kawahara et al (1999) using 5 ms frame…”

Section: Generation Of Signal-based Labels

mentioning

confidence: 99%

Stress and accent transmission in HMM-based syllable-context very low bit rate speech coding

Cerňak, Lazaridis, Garner et al. 2014

Interspeech 2014


In this paper, we propose a solution to reconstruct stress and accent contextual factors at the receiver of a very low bit-rate speech codec built on recognition/synthesis architecture. In speech synthesis, accent and stress symbols are predicted from the text, which is not available at the receiver side of the speech codec. Therefore, speech signal-based symbols, generated as syllable-level log average F0 and energy acoustic measures, quantized using a scalar quantization, are used instead of accentual and stress symbols for HMM-based speech synthesis. Results from incremental real-time speech synthesis confirmed, that a combination of F0 and energy signal-based symbols can replace their counterparts of text-based binary accent and stress symbols developed for text-to-speech systems. The estimated transmission bit-rate overhead is about 14 bits/second per acoustic measure.
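The abstract above describes quantizing syllable-level log average F0 with a scalar quantizer. A minimal sketch of that idea follows, assuming a uniform codebook spanning the range of the training values; the codebook design, the 2-bit depth, the F0 values, and the function names `build_codebook` and `quantize` are all hypothetical and are not taken from the paper.

```python
import math


def build_codebook(train_values, bits):
    """Build a uniform scalar codebook over the training range.

    Assumption: uniform spacing between the minimum and maximum
    training value; the cited paper does not state its codebook design.
    """
    lo, hi = min(train_values), max(train_values)
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    return [lo + i * step for i in range(levels)]


def quantize(value, codebook):
    """Return the index of the codeword nearest to value."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - value))


# Hypothetical syllable-level average F0 values in Hz (illustration only).
train_f0 = [90.0, 110.0, 140.0, 180.0, 220.0]
log_f0 = [math.log(f) for f in train_f0]

codebook = build_codebook(log_f0, bits=2)   # 4 codewords in the log domain
index = quantize(math.log(150.0), codebook)  # symbol sent in place of an accent label
decoded_f0 = math.exp(codebook[index])       # value recovered at the receiver
```

Only the index would need to be transmitted per syllable, so the overhead in this sketch is `bits` per syllable for each acoustic measure.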


“…The second F0 estimation of the proposed method needs accuracy for noise-reduced speech. F0 estimation using instantaneous frequency is used as the second, because the F0 estimation based on stability of instantaneous frequency, for example, TEMPO2 proposed by Kawahara et al [3], can estimate accurate F0s. In this paper, TEMPO2 is used.…”

Section: F0 Estimation Based On Instantaneous Frequency

mentioning

confidence: 99%

“…Various F0 Estimation methods have been proposed, but the most of these methods have the drawbacks for estimating accurate F0s of target speech in noisy environments. Kawahara et al proposed an F0 estimation method based on stability of instantaneous frequencies [3]. This method can estimate F0s for clean speech accurately, but it has difficulties in noisy environments, especially those below 10 dB signal-to-noise ratio (SNR).…”

Section: Introduction

mentioning

confidence: 99%

A fundamental frequency estimation method for noisy speech based on instantaneous amplitude and frequency

Ishimoto, Unoki, Akagi 2001

7th European Conference on Speech Communication and Technology (Eurospeech 2001)


This paper proposes a robust and accurate F0 estimation method for noisy speech. This method uses two different principles:(1) an F0 estimation based on periodicity and harmonicity of instantaneous amplitude for a robust estimation in noisy environments, and (2) an F0 estimation based on stability of instantaneous frequency as an accurate estimation method. The proposed method also uses a comb filter with controllable passbands to combine the two estimation methods. Simulation results showed that: (1) the proposed method can estimate F0s for clean speech as accurate as the method using only instantaneous frequency, (2) the proposed method can robustly estimate F0s for speech with aperiodic noise in comparison with the other methods such as the cepstrum method, and (3) the proposed method had the capability of estimating F0s for speech with periodic noise.


“…1. For the well-known algorithms RAPT, YIN, SWIPE', SHS, AC-P, AC-S, ANAL, CC, CEP, ESRPD, SHR, TEMPO [4][5][6][7][23][24][25][26][27][28][29] and singular pitch estimation (SEPT, Singular Estimation Pitch Tracking), the percentage of gross pitch errors (GPE) was examined [9]. GPE is the ratio of the number of analyzed frames whose pitch estimate deviates from the true pitch by more than ±20% to the total number of voiced frames:…”

unclassified
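The gross pitch error (GPE) measure described in the statement above can be sketched as follows. This is a minimal illustration, assuming that a frame counts as voiced when its reference F0 is greater than zero and that both tracks are already frame-aligned; the function name `gross_pitch_error` and the sample values are hypothetical.

```python
def gross_pitch_error(estimated_f0, reference_f0, tolerance=0.20):
    """Return GPE as a percentage of voiced frames.

    A voiced frame is a gross error when the estimate deviates from the
    reference by more than `tolerance` (20% by default) of the reference.
    Assumption: reference F0 > 0 marks a voiced frame.
    """
    voiced = [(est, ref) for est, ref in zip(estimated_f0, reference_f0) if ref > 0]
    if not voiced:
        raise ValueError("reference contains no voiced frames")
    gross = sum(1 for est, ref in voiced if abs(est - ref) > tolerance * ref)
    return 100.0 * gross / len(voiced)


# Hypothetical frame-level F0 tracks in Hz; 0.0 marks an unvoiced frame.
reference = [100.0, 120.0, 0.0, 200.0, 150.0]
estimate = [101.0, 150.0, 0.0, 95.0, 149.0]

gpe = gross_pitch_error(estimate, reference)
print(f"GPE = {gpe:.1f}%")
```

In this example two of the four voiced frames exceed the 20% threshold: 150 Hz against 120 Hz, and an octave-type error of 95 Hz against 200 Hz.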

“…If we assume that all algorithms are implemented according to their original descriptions [4][5][6][7][23][24][25][26][27][28][29], then, given identical input data (speech fragments from the selected databases) and the same hardware (a PC based on an Intel i5 3.1GHz), the algorithms can be considered to have been compared under identical conditions. At first glance, GPE indicates the robustness of pitch estimation, since it essentially shows the percentage of errors each algorithm makes during estimation; on the other hand, the same value can also be used to judge the accuracy of the pitch estimate.…”

unclassified

Software Implementation of a Singular Meter of the Pitch Frequency of a Speech Signal

Вольф, Мещеряков 2015

Тр. СПИИРАН


A model and software implementation of singular estimation of the pitch frequency of a speech signal. Abstract. The article considers a singular model for estimating the pitch frequency of a speech signal, together with its software implementation. Applying the singular pitch estimation model reduces the computational complexity of speech analysis algorithms by approximating the edge of the singular spectrum, and yields fewer pitch estimation errors by using a singular model of the voiced speech segment that accounts for non-stationary pitch parameters by means of eigenvalues. The software implementation of the model is used in the computation module of a software suite for the speech rehabilitation of cancer patients after laryngeal resection. Keywords: pitch frequency estimation of a speech signal, singular spectrum analysis of speech, model, software implementation.
