The ParlaMint corpora of parliamentary proceedings

Erjavec, Tomaž; Ogrodniczuk, Maciej; Osenova, Petya; Ljubešić, Nikola; Simov, Kiril; Pančur, Andrej; Rudolf, Michał; Kopp, Matyáš; Barkarson, Starkaður; Steingrímsson, Steinþór; Çöltekin, Çağrı; Does, Jesse de; Depuydt, Katrien; Agnoloni, Tommaso; Venturi, Giulia; Pérez, María Calzada; Macedo, Luciana D. de; Navarretta, Costanza; Luxardo, Giancarlo; Coole, Matthew; Rayson, Paul; Morkevičius, Vaidas; Krilavičius, Tomas; Darģis, Roberts; Ring, Orsolya; Heusden, Ruben van; Marx, Maarten; Fišer, Darja

doi:10.1007/s10579-021-09574-0

Cited by 30 publications

(17 citation statements)

References 28 publications

“…This approach contrasts with previous work for similar user groups (e.g. [11,18], http://zoek.openraadsinformatie.nl), which typically focus on the data- or technology-driven innovations.…”

Section: Domain Specific Task Models

mentioning

confidence: 91%

Adapting a Faceted Search Task Model for the Development of a Domain-Specific Council Information Search Engine

Schoegje

Vries

Pieters

2022

Lecture Notes in Computer Science

Domain specialists such as council members may benefit from specialised search functionality, but it is unclear how to formalise the search requirements when developing a search system. We adapt a faceted task model for the purpose of characterising the tasks of a target user group. We first identify which task facets council members use to describe their tasks, then characterise council member tasks based on those facets. Finally, we discuss the design implications of these tasks for the development of a search engine. Based on two studies at the same municipality we identified a set of task facets and used these to characterise the tasks of council members. By coding how council members describe their tasks we identified five task facets: the task objective, topic aspect, information source, retrieval unit, and task specificity. We then performed a third study at a second municipality where we found our results were consistent. We then discuss design implications of these tasks because the task model has implications for 1) how information should be modelled, and 2) how information can be presented in context, and it provides implicit suggestions for 3) how users want to interact with information. Our work is a step towards better understanding the search requirements of target user groups within an organisation. A task model enables organisations developing search systems to better prioritise where they should invest in new technology.

“…The Turkish parliamentary corpus released as part of the ParlaMint project (Erjavec et al, 2021; Erjavec et al, 2022) contains the transcripts of the Turkish parliament (2011–2021), including approximately 43M words from 303 505 speeches delivered at the main proceedings of the parliament. The data also contains speaker information (name, gender, party affiliation) and automatic annotations including morphology, dependency parsing and named entities.…”

Section: Large-scale (Unannotated) Linguistic Data Collections

mentioning

confidence: 99%

Resources for Turkish Natural Language Processing: A critical survey

Çöltekin

Doğruöz

Çetinoğlu

2022

Preprint

Self Cite

This paper presents a comprehensive survey of corpora and lexical resources available for Turkish. We review a broad range of resources, focusing on the ones that are publicly available. In addition to providing information about the available linguistic resources, we present a set of recommendations, and identify gaps in the data available for conducting research and building applications in Turkish Linguistics and Natural Language Processing.

“…The speeches also contain marked-up transcriber comments, such as gaps in the transcription, interruptions, applause, etc. More information about the creation of the corpora, the common standard, and specifics of each national corpus can be found in (Erjavec et al 2022).…”

Section: Parlamint Project Background

mentioning

confidence: 99%

“…Processing these heterogeneous records is challenging. However, the recent ParlaMint project has produced unified corpora of parliamentary debates in 17 European parliaments, making them widely accessible (Erjavec et al 2022). This broadens the possible scope of analysis from individual countries to joint issues and differences.…”

Section: Introduction

mentioning

confidence: 99%

Multi-aspect Multilingual and Cross-lingual Parliamentary Speech Analysis

Miok

Tenorio

Osenova

et al. 2022

Preprint

Self Cite

Parliamentary and legislative debate transcripts provide an exciting insight into elected politicians' opinions, positions, and policy preferences. They are interesting for political and social sciences as well as linguistics and natural language processing (NLP). Existing research covers discussions within individual parliaments. In contrast, we apply advanced NLP methods to a joint and comparative analysis of six national parliaments (Bulgarian, Czech, French, Slovene, Spanish, and United Kingdom) between 2017 and 2020, whose transcripts are a part of the ParlaMint dataset collection. Using a uniform methodology, we analyze topics discussed, emotions, and sentiment. We assess if the age, gender, and political orientation of speakers can be detected from speeches. The results show some commonalities and many surprising differences among the analyzed countries.
