Aivaras Rokas scite author profile

Aivaras Rokas

5 Publications

7 Citation Statements Received

37 Citation Statements Given

Affiliations

Vytautas Magnus University

Publications

Automatic Extraction of Lithuanian Cybersecurity Terms Using Deep Learning Approaches

Rokas¹, Rackevičienė², Utka³

2020

The paper presents research on deep learning methods, aiming to determine the most effective one for automatic extraction of Lithuanian terms from a specialized domain (cybersecurity) with very restricted resources. A semi-supervised approach to deep learning was chosen because Lithuanian is a less-resourced language and the large amounts of data necessary for unsupervised methods are not available in the selected domain. The findings show that a Bi-LSTM network with Bidirectional Encoder Representations from Transformers (BERT) can achieve close to state-of-the-art results.

Lithuanian corpus of the EU primary and secondary law acts of the period 2015-2017

Rokas¹, Rackevičienė²

2018

Methodological Framework for the Development of an English-Lithuanian Cybersecurity Termbase

Rackevičienė, Mockienė, Utka et al.

2021

StALan

The aim of the paper is to present a methodological framework for the development of an English-Lithuanian bilingual termbase in the cybersecurity domain, which can be applied as a model for other language pairs and other specialised domains. It is argued that the presented methodological approach can ensure the creation of high-quality bilingual termbases even with limited available resources. The paper touches upon the methods and problems of dataset (corpora) compilation, terminology annotation, automatic bilingual term extraction (BiTE) and alignment, knowledge-rich context extraction, and linguistic linked open data (LLOD) technologies. The paper presents theoretical considerations as well as arguments on the effectiveness of the described methods. The theoretical analysis and a pilot study suggest that: 1) a combination of parallel and comparable corpora makes it possible to considerably expand the amount and variety of data sources that can be used for terminology extraction; this methodology is especially important for less-resourced languages, which often lack parallel data; 2) deep learning systems trained on manually annotated data (gold standard corpora) allow effective automation of the extraction of terminological data and metadata, which makes it possible to update termbases regularly with minimised manual input; 3) LLOD technologies make it possible to integrate the terminological data into the global linguistic data ecosystem and make it reusable, searchable and discoverable across the Web.

Building of Parallel and Comparable Cybersecurity Corpora for Bilingual Terminology Extraction

Utka¹, Rackevičienė², Mockienė³ et al.

2022

English-Lithuanian parallel cybersecurity corpus - DVITAS

Utka¹, Rackevičienė², Rokas³ et al.

2022
