2014
DOI: 10.15394/jdfsl.2014.1179

“Time for Some Traffic Problems”: Enhancing E-Discovery and Big Data Processing Tools with Linguistic Methods for Deception Detection

Abstract: Linguistic deception theory provides methods to discover potentially deceptive texts to make them accessible to clerical review. This paper proposes the integration of these linguistic methods with traditional e-discovery techniques to identify deceptive texts within a given author's larger body of written work, such as their sent email box. First, a set of linguistic features associated with deception is identified and a prototype classifier is constructed to analyze texts and describe the features' distribu…

Cited by 5 publications (11 citation statements)
References 11 publications
“…IoT: Internet of things; PB: petabytes; ZB: zettabyte; EB: exabyte; IDC: international data corporation; AI: artificial intelligence; ML: machine learning; NLP: natural language processing; CI: computational intelligence; FSVM: fuzzy support vector machines; SVM: support vector machines; POS: part-of-speech; ICA: IBM content analytics; EAs: evolutionary algorithms; ANN: artificial neural networks. [65,66], Deep learning [15,63], Fuzzy sets [67], Feature selection [9,60,61] Learning from unlabeled data Active learning [65,66] Scalability Distributed learning [12,63] Deep learning [56] Natural language processing Keyword search Fuzzy, Bayesian [68,70,71] Ambiguity of words in POS ICA [73], LIBLINEAR and MNB algorithm [68] Classification (simplifying language assumption)…”
Section: Discussion (mentioning)
confidence: 99%
“…For example, a keyword search usually matches exact strings and ignores words with spelling errors that may still be relevant. Boolean operators and fuzzy search technologies permit greater flexibility in that they can be used to search for words similar to the desired spelling [70]. Although…”
Section: Natural Language Processing and Big Data (mentioning)
confidence: 99%
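The contrast drawn in the statement above, between exact keyword matching and fuzzy matching that tolerates spelling errors, can be illustrated with a minimal sketch. It uses Python's standard-library difflib as a stand-in for a fuzzy search engine. The function names, the sample sentence, and the 0.8 similarity threshold are assumptions made for illustration and do not come from the cited papers.

```python
from difflib import SequenceMatcher


def exact_hits(keyword, tokens):
    """Return tokens that equal the keyword (case-insensitive exact match)."""
    kw = keyword.lower()
    return [t for t in tokens if t.lower() == kw]


def fuzzy_hits(keyword, tokens, threshold=0.8):
    """Return tokens whose similarity ratio to the keyword meets the threshold.

    The threshold of 0.8 is an illustrative assumption, not a recommended value.
    """
    kw = keyword.lower()
    return [
        t for t in tokens
        if SequenceMatcher(None, kw, t.lower()).ratio() >= threshold
    ]


# Hypothetical sample text containing a misspelling of "traffic".
tokens = "The trafic report was sent by email".split()

print(exact_hits("traffic", tokens))  # exact search misses the misspelled word
print(fuzzy_hits("traffic", tokens))  # fuzzy search recovers it
```

Here the exact search returns nothing, while the fuzzy search returns the misspelled token. A production e-discovery tool would typically rely on an indexed search engine rather than a linear scan like this one.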
“…Deception discrimination approaching 90% has been reported when this methodology is complemented with robust statistical analyses (Hernández and Calvo, 2017). In line with this, and together with the speed with which it can be implemented, it becomes a useful alternative for application in real forensic contexts (Crabb, 2014; Vrij et al., 2007).…”
Section: Linguistic Style (unclassified)
“…However, this alternative requires training to be used correctly and is not fast enough to be widely used in real contexts that demand speed in reviewing the large volume of evidentiary material (Fuller et al., 2009; Kleinberg et al., 2017; Masip et al., 2012). In this sense, linguistic style offers an objective alternative for covering a large amount of information in short periods, which makes it a viable option in forensic contexts, one that could address not only the credibility of an account but also the psychological processes underlying behavior (Crabb, 2014; Lee, 2017; Vrij et al., 2007).…”
Section: Conclusions (unclassified)
“…As a scientific endeavour it dates back at least to the 19th century [104,112], and was formulated as a computational task in the 1960s [118,163]. In contemporary work, the traditional focus on literary documents has largely been overshadowed by the increased use of online datasets, such as blog posts [121], e-mails [32,37], forum discussions [183], SMS messages [138], and tweets [25]. Neal et al [123] comprehensively survey the state-of-the-art in stylometry.…”
Section: Introduction (mentioning)
confidence: 99%