2003
DOI: 10.1007/978-3-540-24594-0_52

Segmenting Chinese Unknown Words by Heuristic Method

citations
Cited by 8 publications
(6 citation statements)
references
References 6 publications
“…In our previous work, we developed the boundary detection (Yang, Luk, Yung, & Yen, 2000) and the heuristic techniques to segment Chinese sentences based on mutual information and significant estimation (Chien, 1997). Our accuracy is over 90% (Yang & Li, 2003c).…”
Section: A Corpus‐based Approach: Automatic Crosslingual Concept Spac… (mentioning)
confidence: 90%
“…Thresholding and abrupt changes of the values of mutual information are utilized for the detection of segmentation points. The heuristic method utilizes five rules to segment Chinese text based on the mutual information of bi-grams and significance estimation of tri-grams [4].…”
Section: Boundary Detection and Heuristic Methods (mentioning)
confidence: 99%
“…Two statistical based Chinese text segmentation techniques have been developed by Yang et al, namely, boundary detection [3] and heuristic method [4]. Due to the limitation of the lexical statistics collected from the Chinese corpus, errors may occur in segmentation.…”
Section: Introduction (mentioning)
confidence: 99%
“…Statistics‐based approaches or hybrid approaches are proposed to solve the problem of unknown words (Banko et al, 2002; Li et al, 2004; Sproat & Shih, 1990; Yang & Li, 2003; Yang & Li, 2005). Given a large corpus of Chinese texts, the statistics‐based approaches measure the statistical association of characters in the corpus.…”
Section: Traditional Segmentation (mentioning)
confidence: 99%
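
Several of the statements above describe segmentation driven by the mutual information of character bi-grams, with thresholding used to detect segmentation points. The following is a minimal sketch of that general idea only, not the five-rule heuristic method of the cited paper: it omits the abrupt-change test and the tri-gram significance estimation. The function names, the toy corpus, and the threshold value are all assumptions made for illustration.

```python
import math
from collections import Counter


def bigram_mutual_information(corpus):
    """Map each adjacent character pair in the corpus to its pointwise mutual information.

    Hypothetical helper: probabilities are plain relative frequencies with no smoothing.
    """
    char_counts = Counter()
    pair_counts = Counter()
    for sentence in corpus:
        char_counts.update(sentence)
        pair_counts.update(zip(sentence, sentence[1:]))

    total_chars = sum(char_counts.values())
    total_pairs = sum(pair_counts.values())

    mi = {}
    for (a, b), n in pair_counts.items():
        p_ab = n / total_pairs
        p_a = char_counts[a] / total_chars
        p_b = char_counts[b] / total_chars
        mi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return mi


def segment(sentence, mi, threshold=0.0):
    """Insert a boundary wherever the bi-gram score falls below the threshold.

    The threshold is an assumed tuning parameter; pairs never seen in the
    corpus are treated as boundaries.
    """
    if not sentence:
        return []
    words = [sentence[0]]
    for a, b in zip(sentence, sentence[1:]):
        if mi.get((a, b), float("-inf")) < threshold:
            words.append(b)   # weak association: start a new word
        else:
            words[-1] += b    # strong association: extend the current word
    return words


# Toy corpus of Latin letters standing in for Chinese characters (assumption).
toy_corpus = ["ab", "ab", "cd", "cd", "abcd"]
scores = bigram_mutual_information(toy_corpus)
print(segment("abcd", scores, threshold=2.0))  # ['ab', 'cd']
```

In this toy corpus the pairs "ab" and "cd" co-occur far more often than "bc", so only the "bc" score falls below the assumed threshold of 2.0 and a boundary is placed there. As one of the citing statements notes, lexical statistics collected from a limited corpus can lead to segmentation errors, which is why a real system would need a much larger corpus and additional rules.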