2005
DOI: 10.1007/11562214_60

Automatic Acquisition of Basic Katakana Lexicon from a Given Corpus

Abstract: Katakana, the Japanese phonogram script mainly used for loan words, is a troublemaker in Japanese word segmentation. Since Katakana words are heavily domain-dependent and there are many Katakana neologisms, it is almost impossible to construct and maintain a Katakana word dictionary by hand. This paper proposes an automatic segmentation method for Japanese Katakana compounds, which makes it possible to construct a precise and concise Katakana word dictionary automatically, given only a medium or large size of Japa…

Cited by 2 publications (9 citation statements)
References 3 publications
“…Although the frequency-based method generally achieves high recall, its level of precision is not satisfactory (Koehn and Knight 2003; Nakazawa et al. 2005). Our experiments empirically compared our method with frequency-based methods, and the results demonstrated the advantages of our method.…”
Section: Compound Splitting (mentioning)
confidence: 84%
“…A common approach to splitting compounds without expensive linguistic resources is an unsupervised method based on word and string frequencies estimated from unlabeled text (Koehn and Knight 2003; Ando and Lee 2003; Schiller 2005; Nakazawa et al. 2005; Holz and Biemann 2008). Nakazawa et al. (2005) also investigated methods of splitting katakana noun compounds.…”
Section: Compound Splitting (mentioning)
confidence: 99%
“…(compounding) (Tsujimura 2006) 2 (Koehn and Knight 2003) Braschler (Braschler and Ripplinger 2004) (Schwartz and Hearst 2003;Okazaki, Ananiadou, and Tsujii 2008) Alfonseca (2008) (Brown 2002;Koehn and Knight 2003;Nakazawa, Kawahara, and Kurohashi 2005) (Brill, Kacmarcik, and Brockett 2001;Nakazawa et al 2005;Breen 2009) Breen 200920% (Nakazawa et al 2005) 2.1 (Koehn and Knight 2003;Ando and Lee 2003;Schiller 2005;Nakazawa et al 2005;Holz and Biemann 2008) (Nakazawa et al 2005) (Koehn and Knight 2003;Nakazawa et al 2005 (Nakazawa et al 2005) 5 (Brill et al 2001;Cao, Gao, and Nie 2007;Oh and Isahara 2008;Wu, Okazaki, and Tsujii 2009) 3 Brill et al 2001) x y 4.2 (Schiller 2005;Alfonseca et al 2008): (Koehn and Knight 2003):…”
mentioning
confidence: 99%