Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2014
DOI: 10.3115/v1/p14-2099

Unsupervised Alignment of Privacy Policies using Hidden Markov Models

Abstract: To support empirical study of online privacy policies, as well as tools for users with privacy concerns, we consider the problem of aligning sections of a thousand policy documents, based on the issues they address. We apply an unsupervised HMM; in two new (and reusable) evaluations, we find the approach more effective than clustering and topic models.

Citations: cited by 45 publications (26 citation statements)
References: 10 publications
“…Chapter III of the GDPR describes the rights of the data subjects; the first (Article 12) is the right to be informed about the service provider's privacy practices "in a concise, transparent, intelligible and easily accessible form, using clear and plain language." The service provider has to communicate its practices regarding data collection and sharing (Articles 13 and 14) as well as the rights of users associated with data collection and processing (Articles 15 to 22).…”
Section: GDPR Background (mentioning)
confidence: 99%
“…The data used to train the classifier was composed of (1) a set of 1,000 privacy policies labeled as valid from the ACL/COLING 2014 privacy policies' dataset released by Ramanath et al [21] and (2) an invalid set consisting of the text from 1,000 web pages, fetched from random links within the homepages of the top 500 Alexa websites. We ensured that the latter pages do not have any of the keywords associated with privacy policies in their URL or title.…”
Section: A Policy Classifier Architecture (mentioning)
confidence: 99%
“…Montemagni et al (2010) investigate the peculiarities of the language in legal text with respect to that in ordinary text by applying shallow parsing. Ramanath et al (2014) introduce an unsupervised model for the automatic alignment of privacy policies and show that Hidden Markov Models are more effective than clustering and topic models. Liu et al (2016a) modelled the language of vagueness in privacy policies using deep neural networks.…”
Section: Related Work (mentioning)
confidence: 99%
“…Costante et al [7] use text classification to estimate a policy's completeness based on topic coverage. Other approaches have applied topic modeling to privacy policies [6,29] and have automatically grouped related sections and paragraphs of privacy policies [15,24]. Since the complexity and vagueness of privacy policy language makes it difficult to automatically extract complex data practices from privacy policies, we propose to use relevance models to select paragraphs that pertain to a specific data practice and to highlight those paragraphs for annotators.…”
Section: Related Work (mentioning)
confidence: 99%