2018
DOI: 10.12921/cmst.2018.0000005

Korpusomat – a Tool for Creating Searchable Morphosyntactically Tagged Corpora

Abstract: The paper presents Korpusomat, a web application for building annotated corpora for corpus linguistic studies. Korpusomat combines existing tools, such as a morphological analyser, a tagger and a corpus search engine, and provides an easy-to-use environment for building corpora technically compatible with the National Corpus of Polish from almost any text, including texts in binary formats. In the paper we present the current state of the project, its features and functionalities, as well as some…

Citations: cited by 21 publications (5 citation statements)
References: 2 publications
“…To map the characteristics of written style of the data set, we used automated text processing tool, Korpusomat (Kieraś et al., 2018). Corpus linguistics offers specific algorithms and tools to obtain information about language patterns based on frequency of occurrences and co-occurrences of individual words (segments) and their quantification.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…19th and early 20th c.). Both corpora have been annotated grammatically in an automated manner and subsequently searched with the aid of the search engine Korpusomat (https://korpusomat.pl) (Kieraś, Kobyliński & Ogrodniczuk 2018). An important advantage of this application is its ability to work on individual sources of texts, the use of an annotation system that is compatible with the system employed in the National Corpus of the Polish Language, and, consequently, a similar syntax of searches, which allows for trans-corpus comparisons of the obtained results.…”
Section: Empirical Research; citation type: mentioning
confidence: 99%
“…For the purposes of this study, linguistic material was collected, consisting of 76 internet texts containing the lexemes covidianin or covidianie (and their inflected forms). The texts were collected from 21 October 2020 to 24 January 2021 and were stored in the Korpusomat tool (Kieraś et al. 2018). The collected … (footnote 3: https://wsjp.pl/, accessed 25.10.2021)…”
Section: Linguistic Material (Materiał Językowy); citation type: unclassified
“…An analysis of the contexts in which the lexeme covidianin is used, carried out with Korpusomat (Kieraś et al. 2018), made it possible to establish that the word under study appears in the texts in the meaning described in the definitions cited earlier. The texts contain references to the covidianin wearing face masks, for example:…”
Section: Covidianin: Corpus Analysis (Covidianin - Analiza Korpusu); citation type: unclassified