2022
DOI: 10.1109/taslp.2021.3129334

Unsupervised Character Embedding Correction and Candidate Word Denoising

Abstract: In this paper, we study Indonesian and propose a multiple filter correction framework (MFCF). The main idea of MFCF is to remove noise from the candidate words, which increases the probability that the correct word is selected. In MFCF, we use a window search algorithm (WSA) to filter the candidate words in the dictionary. When searching for candidate words whose Levenshtein distance is 1, WSA reduces the candidate word search space by an average of 71%. When searching for candidate words whose Leven…
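To make the idea of shrinking the candidate search space concrete, here is a minimal Python sketch of dictionary lookup at a bounded Levenshtein distance. It is not the paper's WSA, whose details are not given in the abstract. The length-based pre-filter, the function names `levenshtein` and `candidates`, and the toy word list `TOY_DICTIONARY` are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def candidates(word: str, dictionary: list[str], max_dist: int = 1) -> list[str]:
    """Return dictionary words within max_dist edits of word.

    Assumed pre-filter (not the paper's WSA): two words whose lengths
    differ by more than max_dist cannot be within max_dist edits, so
    they are skipped before the costlier distance computation.
    """
    pool = [w for w in dictionary if abs(len(w) - len(word)) <= max_dist]
    return sorted(w for w in pool if levenshtein(word, w) <= max_dist)


# Hypothetical toy dictionary of Indonesian words, for illustration only.
TOY_DICTIONARY = ["makan", "makanan", "minum", "main", "malam", "makin"]
```

Calling `candidates("makn", TOY_DICTIONARY)` skips `makanan` on length alone and computes the full distance only for the remaining words. Any cheap filter that never discards a true candidate can be used in the same position.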

Citations: cited by 0 publications
References: 22 publications (24 reference statements)