2018
DOI: 10.48550/arxiv.1803.09337
Preprint

Text Segmentation as a Supervised Learning Task

Cited by 5 publications (9 citation statements)
References: 0 publications
“…The experimental results for the segmentation experiments are in Table 1a. Our baseline system that does not use any pretraining of representations is similar to the one proposed in Koshorek et al (2018) with the difference that our system uses character information to generate the word embeddings. Note that pretraining results in substantial improvements over the baseline systems in all settings.…”
Section: Results (mentioning)
confidence: 99%
“…We choose LSTM-based architectures with a standard hierarchical structure that has been useful for capturing long-term context in document-level tasks (Serban et al, 2016; Koshorek et al, 2018). We experiment with two related document-level representations.…”
Section: Hierarchical Document-level Representations (mentioning)
confidence: 99%
“…However, ancient document images suffer from critical challenges including varying noise conditions, interfering annotations, typical ancient record artifacts like fading and vanishing texts, and variations in handwriting making it difficult to transcribe [27]. Over the past decade, various approaches have been proposed to solve document analysis and recognition such as optical character recognition (OCR) [26], layout analysis [28], text segmentation [19] and handwriting recognition [34,10,9,13]. Although OCR models have been very successful in recognizing machine print text, they stumble upon handwriting recognition due to aforementioned challenges and connecting characters in the text as compared to machine print ones where the characters are easily separable.…”
Section: Introduction (mentioning)
confidence: 99%