A Probabilistic Formulation of Keyword Spotting

Joan Puigcerver i Pérez

doi:10.4995/thesis/10251/116834

Cited by 8 publications

(10 citation statements)

References 123 publications


“…A possible direction would consist of the use of linguistic statistics [9]. A recent method for using language information is a dual-state word-beam search [10] for decoding the connectionist temporal classification (CTC [11]) layer of neural networks, which has been shown to be effective [10].…”

Section: Introduction (mentioning)

confidence: 99%

A limited-size ensemble of homogeneous CNN/LSTMs for high-performance word classification

Ameryan

Schomaker

2021

Neural Comput & Applic


The strength of long short-term memory neural networks (LSTMs) that have been applied is more located in handling sequences of variable length than in handling geometric variability of the image patterns. In this paper, an end-to-end convolutional LSTM neural network is used to handle both geometric variation and sequence variability. The best results for LSTMs are often based on large-scale training of an ensemble of network instances. We show that high performances can be reached on a common benchmark set by using proper data augmentation for just five such networks using a proper coding scheme and a proper voting scheme. The networks have similar architectures (convolutional neural network (CNN): five layers, bidirectional LSTM (BiLSTM): three layers followed by a connectionist temporal classification (CTC) processing step). The approach assumes differently scaled input images and different feature map sizes. Three datasets are used: the standard benchmark RIMES dataset (French); a historical handwritten dataset KdK (Dutch); the standard benchmark George Washington (GW) dataset (English). Final performance obtained for the word-recognition test of RIMES was 96.6%, a clear improvement over other state-of-the-art approaches which did not use a pre-trained network. On the KdK and GW datasets, our approach also shows good results. The proposed approach is deployed in the Monk search engine for historical-handwriting collections.


“…Nevertheless, of course, our results do still leave significant room for improvement, and we do think that in many cases it might actually come from the use of textual features which, in future works we plan extract using a recent methodology known as ‘probabilistic indexing’ [17,23].…”

Section: Methods (mentioning)

confidence: 97%

Reading order detection on handwritten documents

Quirós

Vidal

2022

Neural Comput & Applic


Recent advances in Handwritten Text Recognition and Document Layout Analysis have made it possible to convert digital images of manuscripts into electronic text. However, providing this text with the correct structure and context is still an open problem that needs to be solved to actually enable extracting the relevant information conveyed by the text. The most important structure needed for a set of text elements is their reading order. Most of the studies on the reading order problem are rule-based approaches and focus on printed documents. Much less attention has been paid so far to handwritten text documents, where the problem becomes particularly important—and challenging. In this work, we propose a new approach to automatically determine the reading order of text regions and lines in handwritten text documents. The task is approached as a sorting problem where the order-relation operator is automatically learned from examples. We experimentally demonstrate the effectiveness of our method on three different datasets at different hierarchical levels.


“…This de facto standard evaluation measures the text line segmentation as per its extraction polygon which incorrectly diminished the importance of the detection subtask. Moreover, the line extraction accuracy results obtained with this measure present little correlation with the transcription accuracy results of the systems using the extracted lines [21].…”

Section: Introduction (mentioning)

confidence: 84%

“…Usage of Conditional Random Fields (CRF) was tried out in different articles but it did not fare well in comparison to Stochastic Context Free Grammars [2,6]. Markov Random Fields have seen minimal use to differentiate between printed text and handwritten text [21].…”

Section: State of the Art (mentioning)

confidence: 99%

“…The calculation of such word probabilistic indexes, was developed as part of approaches to perform word-segmentation-free Key Word Spotting systems [2,14,21,28,29]. With these indexes we now have probabilistic information of the textual contents of the page, without the need to perform a detailed text base line detection, and performing full Handwritten Text Recognition on the detected text lines.…”

Section: Text Content Based Features (mentioning)

confidence: 99%


Advances in Document Layout Analysis

Campos


Handwritten Text Segmentation (HTS) is a task within the Document Layout Analysis field that aims to detect and extract the different page regions of interest found in handwritten documents. HTS remains an active topic, that has gained importance with the years, due to the increasing demand to provide textual access to the myriads of handwritten document collections held by archives and libraries. This thesis considers HTS as a task that must be tackled in two specialized phases: detection and extraction. We see the detection phase fundamentally as a recognition problem that yields the vertical positions of each region of interest as a by-product. The extraction phase consists in calculating the best contour coordinates of the region using the position information provided by the detection phase. Our proposed detection approach allows us to attack both higher level regions: paragraphs, diagrams, etc., and lower level regions like text lines. In the case of text line detection we model the problem to ensure that the system's yielded vertical position approximates the fictitious line that connects the lower part of the grapheme bodies in a text line, commonly known as the baseline.

