2020
DOI: 10.46298/jdmdh.5864

Deep Learning for Period Classification of Historical Hebrew Texts

Abstract: In this study, we address the interesting task of classifying historical texts by their assumed period of writing. This task is useful in digital humanities studies, where many texts have unidentified publication dates. For years, the typical approach for temporal text classification was supervised, using machine-learning algorithms. These algorithms require careful feature engineering and considerable domain expertise to design a feature extractor to transform the raw text into a feature vector from which the clas…

Cited by 5 publications (5 citation statements). References 32 publications.
“…Traditional machine learning methods focus on statistical features and learning models, such as Naïve Bayes (Boldsen and Wahlberg, 2021), SVM (Garcia-Fernandez et al., 2011) and Random Forests (Ciobanu et al., 2013). Recent studies turn to deep learning methods, and the experiments show their superior performance compared to traditional machine learning ones (Kulkarni et al., 2018; Liebeskind and Liebeskind, 2020; Yu and Huangfu, 2019; Ren et al., 2022). Pre-trained models are also leveraged to represent texts for the dating task, such as Sentence-BERT (Massidda, 2020; Tian and Kübler, 2021) and RoBERTa.…”
Section: Related Work
confidence: 99%
“…One is to learn word representations from diachronic documents. Current research on word representation either learns static word embeddings over the whole corpus (Liebeskind and Liebeskind, 2020; Yu and Huangfu, 2019) or learns dynamic word representations using pre-trained models (Tian and Kübler, 2021). However, neither of them takes into account the relation between time and word meaning.…”
Section: Introduction
confidence: 99%
“…Existing NLP studies on historical documents primarily focus on tasks such as spelling normalization [18], [23], machine translation [24], and sequence labelling, including part-of-speech tagging [25] and named entity recognition [19], [26]. Recently, the success of deep neural networks has introduced new applications in this domain, including sentiment analysis [27], information retrieval [28], event extraction [29], [30], and text classification [31]. However, only a limited amount of research has been conducted on historical text summarization.…”
Section: Historical Natural Language Processing Applications
confidence: 99%
“…Refs. [7,43,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73] are cited in the Supplementary Materials.…”
confidence: 99%