1998
DOI: 10.5715/jnlp.5.4_35

Analysis of Japanese Compound Nouns by Direct Text Scanning

Abstract: This paper aims to analyze word dependency structure in compound nouns appearing in Japanese newspaper articles. The analysis is a difficult problem because such compound nouns can be quite long, have no word boundaries between contained nouns, and often contain unregistered words such as abbreviations. The nonsegmentation property and unregistered words cause initial segmentation errors which result in erroneous analysis. This paper presents a corpus-based approach which scans a corpus with a set of pattern…

Cited by 2 publications (2 citation statements)
References 2 publications

“…To extract promising term candidates of compound noun and at the same time to exclude undesirable strings such as is a or of the, the most frequently used method is to filter out the words being members of the stop-list. In these days, more complex structures like noun phrases, collocations consisting of nouns, verbs, prepositions, determiners, and so on, become focused on (Smadja and McKeown 1990:252-259; Ananiadou 1996:41-46; Zhai and Evans 1996:17-23; Hisamitsu and Nitta 1996:550-555; Shimohata et al 1997:476-481). All of these are good term candidates in a document or a specific domain because all of them have a strong unithood.…”
Section: Term Candidates Extraction Subsystem (mentioning)
confidence: 99%
“…To analyze the structure of compound nouns in Japanese, one of several methods uses collocation information and a thesaurus Tanaka 1994, 1995). Another method is a corpus-based approach that scans a corpus with a set of pattern matchers and gathers co-occurrence examples to analyze compound nouns (Hisamitsu and Nitta 1996). For analyzing compound nouns in English, one method compares the adjacency model and the dependency model and concludes that the dependency model provides a substantial advantage over the adjacency model (Lauer 1995).…”
Section: Related Work (mentioning)
confidence: 99%
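
The adjacency and dependency models contrasted in the statement above can be sketched for a three-noun compound as follows. This is only an illustrative sketch, not the method of any of the cited papers: the co-occurrence counts are invented, the names `assoc`, `bracket_adjacency`, and `bracket_dependency` are hypothetical, raw counts stand in for whatever association measure a real system would use, and breaking ties toward the left bracketing is an assumption.

```python
from collections import Counter

# Hypothetical (modifier, head) co-occurrence counts, invented for illustration.
# A real system would gather such counts from a corpus.
PAIR_COUNTS = Counter({
    ("computer", "science"): 8,
    ("science", "department"): 5,
    ("computer", "department"): 1,
})


def assoc(modifier, head, counts):
    """Association score for a modifier-head pair (here: a raw count, 0 if unseen)."""
    return counts.get((modifier, head), 0)


def bracket_adjacency(n1, n2, n3, counts=PAIR_COUNTS):
    """Adjacency model: compare the two adjacent pairs (n1, n2) and (n2, n3)."""
    left = assoc(n1, n2, counts)
    right = assoc(n2, n3, counts)
    # Assumption: ties go to the left bracketing [[n1 n2] n3].
    return "left" if left >= right else "right"


def bracket_dependency(n1, n2, n3, counts=PAIR_COUNTS):
    """Dependency model: ask which noun n1 modifies, n2 or n3."""
    left = assoc(n1, n2, counts)
    right = assoc(n1, n3, counts)
    # Assumption: ties go to the left bracketing [[n1 n2] n3].
    return "left" if left >= right else "right"


if __name__ == "__main__":
    print(bracket_adjacency("computer", "science", "department"))
    print(bracket_dependency("computer", "science", "department"))
```

The two models differ only in which pair competes with (n1, n2): the adjacency model uses (n2, n3), while the dependency model uses (n1, n3), so they can disagree whenever those two scores fall on opposite sides of the (n1, n2) score.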