Interspeech 2019
DOI: 10.21437/interspeech.2019-1235

Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-Level Embedding Features

Abstract: This paper describes a conditional neural network architecture for Mandarin Chinese polyphone disambiguation. The system is composed of a bidirectional recurrent neural network component acting as a sentence encoder to accumulate the context correlations, followed by a prediction network that maps the polyphonic character embeddings along with the conditions to corresponding pronunciations. We obtain the word-level condition from a pre-trained word-to-vector lookup table. One goal of polyphone disambiguation i…
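
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain tanh recurrent cell, random untrained weights, tiny dimensions, and a softmax restricted to the candidate pinyin of the polyphonic character. All function names, parameter names, and the example candidates are hypothetical.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative, untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def rnn_pass(inputs, w_in, w_rec, reverse=False):
    """Run a simple tanh RNN over `inputs`; return one hidden state per position."""
    hidden_size = len(w_rec)
    h = [0.0] * hidden_size
    n = len(inputs)
    order = range(n - 1, -1, -1) if reverse else range(n)
    states = [None] * n
    for t in order:
        x = inputs[t]
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(hidden_size))
            )
            for j in range(hidden_size)
        ]
        states[t] = h
    return states


def predict_pinyin(char_embs, position, word_cond, params, candidates):
    """Score candidate pronunciations of the character at `position`.

    Features: forward and backward encoder states at that position,
    the character embedding itself, and the word-level condition vector.
    """
    fwd = rnn_pass(char_embs, params["w_in_f"], params["w_rec_f"])
    bwd = rnn_pass(char_embs, params["w_in_b"], params["w_rec_b"], reverse=True)
    features = fwd[position] + bwd[position] + char_embs[position] + word_cond
    scores = {
        p: sum(w * f for w, f in zip(params["w_out"][p], features))
        for p in candidates
    }
    # Numerically stable softmax over the candidate set only (an assumption).
    top = max(scores.values())
    exps = {p: math.exp(s - top) for p, s in scores.items()}
    total = sum(exps.values())
    return {p: e / total for p, e in exps.items()}


rng = random.Random(0)
emb_dim, hid_dim, cond_dim = 3, 2, 2
candidates = ["xing2", "hang2"]  # hypothetical candidate readings of one character
params = {
    "w_in_f": make_matrix(hid_dim, emb_dim, rng),
    "w_rec_f": make_matrix(hid_dim, hid_dim, rng),
    "w_in_b": make_matrix(hid_dim, emb_dim, rng),
    "w_rec_b": make_matrix(hid_dim, hid_dim, rng),
    "w_out": {
        p: [rng.uniform(-0.5, 0.5) for _ in range(2 * hid_dim + emb_dim + cond_dim)]
        for p in candidates
    },
}
# Four-character sentence; the polyphonic character sits at index 2.
sentence = make_matrix(4, emb_dim, rng)
word_cond = [rng.uniform(-0.5, 0.5) for _ in range(cond_dim)]
probs = predict_pinyin(sentence, 2, word_cond, params, candidates)
print(probs)
```

In a trained system the character embeddings and the word-level condition would come from learned or pre-trained lookup tables rather than random vectors, and the recurrent cell would likely be a gated variant.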

Cited by 17 publications (16 citation statements)
References: 17 publications
“…Text Normalization Rule-based [311], Neural-based [310,223,406,430], Hybrid [432] Word Segmentation [394,444,261] POS Tagging [292,323,221,444,135] Prosody Prediction [50,405,312,186,137,322,277,62,440,210,212,3] Grapheme to Phoneme N-gram [41,24], Neural-based [403,283,33, 320] --Polyphone Disambiguation [441,392,224,295,321,29,257] and then neural networks are leveraged to model text normalization as a sequence to sequence task where the source and target sequences are non-standard words and spoken-form words respectively [310,223,430]. Recently, some works [432] propose to combine the advantages of both rule-based and neural-based models to further improve the performance of text normalization.…”
Section: Task Research Work (mentioning)
confidence: 99%
“…Some studies have explored solving PD by regarding pronunciation estimation (including non-polyphonic words) as a sequence-to-sequence problem, and by applying machine translation approaches [13,14]. For Mandarin, on the other hand, some studies adopt a classification approach that estimates the correct pinyin of the polyphonic character [15,16,17]. Because polyphonic words appear only in certain parts of the sentence, we regard PD as a classification problem, similar to the approach for Mandarin.…”
Section: Polyphone Disambiguation (PD) (mentioning)
confidence: 99%
“…Homograph. Some characters could be pronounced in multiple ways depending on the textual context they resides [20]. 2.…”
Section: The AISHELL-3 Dataset (mentioning)
confidence: 99%