Pawel Kamocki scite author profile

17 Publications

7 Citation Statements Received

51 Citation Statements Given

Affiliations

Leibniz Institute for the German Language

Publications

Personal data protection and academia: GDPR issues and multi-modal data-collections "in the wild"

Siegert, Silber-Varod, Carmi, et al. 2020

OJAKM

The European Union (EU) General Data Protection Regulation (GDPR) has a direct impact on research activities, as it raises awareness of personal rights not only among scientists but also among the data subjects whose information they process. This paper presents the dilemma related to the privacy of audio and video data, compliance with the EU GDPR, and techniques to anonymize and pseudonymize such data. We further discuss issues of “in the wild” personal data collection by focusing on multi-modal collections, mainly of audio and video data. Throughout this paper we define relevant core issues and highlight two challenges of “in the wild” data collection: Internet crawling and public data collecting. In the last section, some exemplary use cases demonstrate the raised issues, illuminating how GDPR affects the collection of publicly available data, how privacy concerns influence participant behavior, and which de-anonymization levels can be reached with what kind of data. The key point we present is that the identity of the participants is revealed in the voice or video signal, while the latter is at the same time the object of the research. One implication is that the research community has to actively disconnect the data from the personal information on the participants. Hence the importance of a process of anonymization or omission of data for research activity. This entails the development of an infrastructure for data access control to enable data sharing among researchers.

"All Your Data Are Belong to us". European Perspectives on Privacy Issues in ‘Free’ Online Machine Translation Services

Kamocki, O'Regan, Stauch 2016

The English language has taken advantage of the Digital Revolution to establish itself as the global language; however, only 28.6% of Internet users speak English as their native language. Machine Translation (MT) is a powerful technology that can bridge this gap. In development since the mid-20th century, MT has become available to every Internet user in the last decade, due to free online MT services. This paper aims to discuss the implications that these tools may have for the privacy of their users and how they are addressed by EU data protection law. It examines the data-flows in respect of the initial processing (both from the perspective of the user and the MT service provider) and potential further processing that may be undertaken by the MT service provider.

The CLARIN infrastructure as an interoperable language technology platform for SSH and beyond

Branco, Eskevich, Frontini, et al. 2023

Lang Resources & Evaluation

CLARIN is a European Research Infrastructure Consortium developing and providing a federated and interoperable platform to support scientists in the Social Sciences and Humanities in carrying out language-related research. This contribution provides an overview of the entire infrastructure with a particular focus on tool interoperability, ease of access to research data, tools and services, the importance of sharing knowledge within and across (national) communities, and community building. By taking FAIR principles into account from the very beginning, CLARIN has become a successful example of a research infrastructure that is actively used by its members. The benefits CLARIN members reap from their infrastructure secure a future for their common good that is both sustainable and attractive to partners beyond the original target groups.

Rechtliche Bedingungen für die Bereitstellung eines Chat-Korpus in CLARIN-D: Ergebnisse eines Rechtsgutachtens

Beißwenger, Lüngen, Schallaböck, et al. 2017

This volume presents results of research conducted in connection with the scientific network "Empirische Erforschung internetbasierter Kommunikation" (Empirikom), which was funded by the Deutsche Forschungsgemeinschaft (DFG) from 2010 to 2014. The network comprised 15 members from linguistics, computational linguistics, computer science, and psychology, as well as 23 associated members and cooperation partners, who shared an interest in questions concerning the empirical, resource-based analysis of language data from forms of internet-based communication (Internetbasierte Kommunikation, IBK). The term covered forms of language use that are organized dialogically and interactionally and that depend on computer networks as their infrastructural prerequisite. Prominent forms of IBK include chats, newsgroups and online forums, weblog comments, tweets, Wikipedia discussions, SMS, WhatsApp and instant-messaging interactions, Skype, and the corresponding communication functions in social networks, online computer games, and 'virtual worlds'. Forms of IBK are an important component of many social media applications and, particularly in recent years, have increasingly been used on mobile devices.

Footnote 1: The research literature offers various proposals for the terminological conceptualization of the subject. The oldest and still most widespread label is computer-mediated communication (CMC, e.g. Herring 1996), loan-translated into German as Computervermittelte Kommunikation. The term Internetbasierte Kommunikation (IBK, e.g. Beißwenger et al. 2004) was coined around the turn of the millennium as a more up-to-date alternative to CMC; it distinguishes communication based on TCP/IP from other forms of computer-mediated communication (letters and telephone calls are nowadays also transmitted with the involvement of computers). Jucker/Dürscheid (2012) propose the term keyboard-to-screen communication, which focuses on the specifics of the input/output dimension. The contributions to this volume also use varying terms. The choice of the term Internetbasierte Kommunikation for the naming of the network

Datenschutz in der wissenschaftlichen Praxis - Der DARIAH-EU ELDAH Consent Form Wizard

Scholger, Hannesschläger, Kamocki, et al. 2022

The CLARIN Committee for Legal and Ethical Issues and the Normative Layer of the CLARIN Infrastructure

Kamocki, Kelli, Lindén 2022

Trust is Good, Control is Better? The GDPR and Control Over Personal Data in Digital Humanities Research

Kamocki 2021

The argument for ‘non-consumptive use’ in the EU: how copyright could be redefined to allow text and data mining

Kamocki 2018
