2000
DOI: 10.1002/(sici)1097-4571(2000)51:4<340::aid-asi4>3.0.co;2-i

Combination and boundary detection approaches on Chinese indexing

Abstract: Digital libraries store materials in electronic format. Research and development in digital libraries includes content creation, conversion, indexing, organization, and dissemination. The key technological issues are how to search and display desired selections from and across large collections effectively [Schatz & Chen, 1996]. Digital library research projects (DLI-1) sponsored by NSF/DARPA/NASA have a common theme of bringing search to the net, which is the flagship research effort for the National Informa…

Cited by 33 publications (30 citation statements)
References 11 publications
“…Add a prefix tag judgments, and the maximum matching algorithm as being a dynamic value to return (3). The principle of priority based on long term, the maximum term and a maximum separation of the word, better solve the ambiguity problem, make up the largest sub-word segmentation algorithm and the algorithm.…”
Section: B Improved Patricia Dictionary Mechanisms (mentioning)
confidence: 99%
“…Check dictionary, and it returns whether the word is the biggest word. If Is the word the word, the word is not the biggest offset offset +1, return to (3). Is the largest word is the starting position start +1, return to (2); (5).…”
Section: B Improved Patricia Dictionary Mechanisms (mentioning)
confidence: 99%
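
The dictionary-lookup loop quoted in the two statements above resembles forward maximum matching, a common dictionary-based approach to Chinese word segmentation. The sketch below is only an illustration of that general technique under stated assumptions: it uses a plain Python `set` in place of the Patricia-tree dictionary the citing paper mentions, and the function name `forward_max_match`, the `max_len` parameter, and the sample dictionary are hypothetical, not taken from any of the cited papers.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Greedy forward maximum matching (illustrative sketch).

    At each position, take the longest dictionary word that starts there;
    if no dictionary word matches, emit a single character and move on.
    `dictionary` is assumed to be a set of known words.
    """
    if max_len is None:
        # Longest dictionary entry bounds how far ahead we need to look.
        max_len = max((len(word) for word in dictionary), default=1)

    tokens = []
    start = 0
    while start < len(text):
        end = min(len(text), start + max_len)
        # Shrink the candidate from the right until it is a known word
        # or only one character remains.
        while end > start + 1 and text[start:end] not in dictionary:
            end -= 1
        tokens.append(text[start:end])
        start = end
    return tokens


# Hypothetical toy dictionary, chosen to show a segmentation ambiguity.
sample_dictionary = {"研究", "研究生", "生命", "起源"}
print(forward_max_match("研究生命起源", sample_dictionary))
# → ['研究生', '命', '起源']
```

The greedy choice picks 研究生 ("graduate student") and strands 命, although 研究 / 生命 / 起源 ("research / life / origin") is the intended reading. This kind of ambiguity is what longest-term priority rules and statistical methods such as boundary detection try to resolve.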
“…Word boundary identification is even harder in Asian languages such as Chinese (Chen and Liu 1992; Yang et al 2000; Foo and Li 2004), for example, since words are not delimited by blanks. Foo and Li (2004) conducted experiments to study the impact of Chinese word segmentation and its effect on IR.…”
Section: Pre Processing and Text Segmentation (mentioning)
confidence: 99%
“…Two statistical based Chinese text segmentation techniques have been developed by Yang et al, namely, boundary detection [3] and heuristic method [4]. Due to the limitation of the lexical statistics collected from the Chinese corpus, errors may occur in segmentation.…”
Section: Introduction (mentioning)
confidence: 99%