Ondřej Pražák scite author profile

Ondřej Pražák

5Publications

31Citation Statements Received

85Citation Statements Given

How they've been cited

How they cite others

Affiliations

University of West Bohemia

Publications

Order By: Most citations

Cross-Lingual SRL Based upon Universal Dependencies

Pražák

Konopík

2017

View full text Add to dashboard Cite

In this paper, we introduce a cross-lingual Semantic Role Labeling (SRL) system with language independent features based upon Universal Dependencies. We propose two methods to convert SRL annotations from monolingual dependency trees into universal dependency trees. Our SRL system is based upon cross-lingual features derived from universal dependency trees and supervised learning that utilizes a maximum entropy classifier. We design experiments to verify whether the Universal Dependencies are suitable for the cross-lingual SRL. The results are very promising and they open new interesting research paths for the future.

show abstract

Czert – Czech BERT-like Model for Language Representation

Sido

Pražák

Přibáň

et al. 2021

View full text Add to dashboard Cite

This paper describes the training process of the first Czech monolingual language representation models based on BERT and ALBERT architectures. We pre-train our models on more than 340K of sentences, which is 50 times more than multilingual models that include Czech data. We outperform the multilingual models on 9 out of 11 datasets. In addition, we establish the new state-of-the-art results on nine datasets. At the end, we discuss properties of monolingual and multilingual models based upon our results. We publish all the pretrained and fine-tuned models freely for the research community.

show abstract

UWB at SemEval-2020 Task 1: Lexical Semantic Change Detection

Pražák¹,

Přibáň²,

Taylor³

et al. 2020

View full text Add to dashboard Cite

In this paper, we describe our method for detection of lexical semantic change, i.e., word sense changes over time. We examine semantic differences between specific words in two corpora, chosen from different time periods, for English, German, Latin, and Swedish. Our method was created for the SemEval 2020 Task 1: Unsupervised Lexical Semantic Change Detection. We ranked 1 st in Sub-task 1: binary change detection, and 4 th in Sub-task 2: ranked change detection. Our method is fully unsupervised and language independent. It consists of preparing a semantic vector space for each corpus, earlier and later; computing a linear transformation between earlier and later spaces, using Canonical Correlation Analysis and Orthogonal Transformation; and measuring the cosines between the transformed vector for the target word from the earlier corpus and the vector for the target word in the later corpus.

show abstract

UWB at SemEval-2016 Task 2: Interpretable Semantic Textual Similarity with Distributional Semantics for Chunks

Konopík

Pražák

Steinberger³

et al. 2016

View full text Add to dashboard Cite

We have built a simple corpus-based system to estimate words similarity in multiple languages with a count-based approach. After training on Wikipedia corpora, our system was evaluated on the multilingual subtask of SemEval-2017 Task 2 and achieved a good level of performance, despite its great simplicity. Our results tend to demonstrate the power of the dis-tributional approach in semantic similarity tasks, even without knowledge of the underlying language. We also show that di-mensionality reduction has a considerable impact on the results.

show abstract

Czert -- Czech BERT-like Model for Language Representation

Sido¹,

Pražák²,

Přibáň³

et al. 2021

Preprint

View full text Add to dashboard Cite

This paper describes the training process of the first Czech monolingual language representation models based on BERT and ALBERT architectures. We pre-train our models on more than 340K of sentences, which is 50 times more than multilingual models that include Czech data. We outperform the multilingual models on 7 out of 10 datasets. In addition, we establish the new state-of-the-art results on seven datasets. At the end, we discuss properties of monolingual and multilingual models based upon our results. We publish all the pretrained and fine-tuned models freely for the research community.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.