2016
DOI: 10.1162/tacl_a_00084

Decoding Anagrammed Texts Written in an Unknown Language and Script

Abstract: Algorithmic decipherment is a prime example of a truly unsupervised problem. The first step in the decipherment process is the identification of the encrypted language. We propose three methods for determining the source language of a document enciphered with a monoalphabetic substitution cipher. The best method achieves 97% accuracy on 380 languages. We then present an approach to decoding anagrammed substitution ciphers, in which the letters within words have been arbitrarily transposed. It obtains the avera…

Cited by 23 publications (22 citation statements)
References 13 publications
“…Finally, to realize a fully-automatic camera-phone decipherment app, we need to lift several assumptions we made in the paper. These include knowing the plaintext language and cipher system [12], [17], [18], pre-processing images to remove margins and non-cipher text, knowing the cipher alphabet size, and cipher-specific setting of segmentation parameters. Figure 12: 50 restarts of LM-GMM on Borg dataset initialized from 3-stage decipherment with random noise (+).…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Other authors, as most researchers in the past, think that the text is enciphered in some way. Hauer and Kondrak assumed that the manuscript was written in some abjad's alphabet using some transposition of letters or anagramming [38]. These assumptions are very strong and their conclusions of a relation to Hebrew have been widely dismissed.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Other authors, as most researchers in the past, think that the text is enciphered in some way. Hauer and Kondrak assumed that the manuscript is written in some Abjad's alphabet using some transposition of letters or anagramming [22]. These assumptions are very strong and their conclusions of a relation to Hebrew have been widely dismissed.…”
Section: Discussion
Citation type: mentioning
confidence: 99%