While large-scale corpora and various corpus query tools have long been recognized as essential language resources, the value of word association norms as language resources has been largely overlooked. This paper conducts some initial comparisons of the lexical relationships observed within Japanese collocation data extracted from a large corpus using the Japanese language version of the Sketch Engine (SkE) tool (Srdanović et al., 2008) and the relationships found within Japanese word association sets taken from the large-scale Japanese Word Association Database (JWAD) under ongoing construction by Joyce (2005, 2007). The comparison results indicate that while some relationships are common to both linguistic resources, many lexical relationships are only observed in one resource. These findings suggest that both resources are necessary in order to more adequately cover the diverse range of lexical relationships. Finally, the paper reflects briefly on the implementation of association-based word-search strategies into electronic dictionaries proposed by Zock and Bilac (2004) and Zock (2006).
In this paper, we explore learner production of adjectives using the Japanese language learner's corpus C-JAS (Corpus of Japanese As a Second language). Firstly, we describe the overall usage of adjectives in the corpus and discuss the distribution of the adjectives among learners including their correct and incorrect usages. Then, we take the frequently used adjective takai "high/tall/expensive" as an example and show how the learners' production of adjectives develops in terms of form, correct/incorrect usages, and lexico-semantic coverage.
In this paper, we present the results of an evaluation of Japanese word sketches and discuss in detail the issues observed by the evaluators. A word sketch is a list of a word's salient collocates, organized by the grammatical relations holding between the word and each collocate. The word sketch functionality is built into the Sketch Engine corpus query system and has so far been created for more than twenty languages, including Japanese. The issues discovered in the evaluation of Japanese word sketches need to be addressed in order to further enhance the word sketch functionality. The other tools and resources that are used in combination with word sketches, and that influence their performance, should also be reviewed. We divide the issues into three groups: 1) the lemmatizer and tagger in use, 2) the sketch grammar written specifically for Japanese, and 3) the corpus and statistical methods.
Keywords: word sketches, Japanese collocations, evaluation, corpus, language technologies
Irena SRDANOVIĆ, Naomi IDA, …