Are 2D-LSTM really dead for offline text recognition?

Moysset, Bastien; Messina, Ronaldo

doi:10.1007/s10032-019-00325-0

Cited by 26 publications

(16 citation statements)

References 27 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Training time Prediction time Parameters (min/epoch) (ms/sample) 2D-LSTM [6] 0.8 M 2D-LSTM-X2 [6] 3.3 M CNN + 1D-LSTM [5,6] 11 As we can see among the best models, ours is the one with the lowest number of parameters. Training time and prediction time of the CNN + 1D-LSTM [5,6] and of our model are in the same order of magnitude. This can be explained by the high number of normalization layers used in our model and its depth that counterbalance with the sequential computations of the LSTM layers.…”

Section: Architecturementioning

confidence: 96%

“…In [5], a 1D-LSTM reaches better results than a 2D-LSTM with less training time but more parameters. More recently, a 2D-LSTM presented in [6] showed competitive prediction time and performance over many datasets such as RIMES [7], IAM [8] and more complex ones like MAURDOR [9].…”

Section: A Recurrent Neural Network (Rnn)mentioning

confidence: 99%

“…Training time Prediction time Parameters (min/epoch) (ms/sample) 2D-LSTM [6] 0.8 M 2D-LSTM-X2 [6] 3.3 M CNN + 1D-LSTM [5,6] 11.25 57 9.6 M Ours 13.75 74 1.4 M Table IV: Training time, prediction time and number of parameters of different architectures seen in the previous experiments for the IAM dataset with image height of 128px, preserving the original width.…”

Section: Architecturementioning

confidence: 99%

See 2 more Smart Citations

Recurrence-free unconstrained handwritten text recognition using gated fully convolutional network

Coquenet

Chatelain

Paquet

2020

2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR)

View full text Add to dashboard Cite

Unconstrained handwritten text recognition is a major step in most document analysis tasks. This is generally processed by deep recurrent neural networks and more specifically with the use of Long Short-Term Memory cells. The main drawbacks of these components are the large number of parameters involved and their sequential execution during training and prediction. One alternative solution to using LSTM cells is to compensate the long time memory loss with an heavy use of convolutional layers whose operations can be executed in parallel and which imply fewer parameters. In this paper we present a Gated Fully Convolutional Network architecture that is a recurrence-free alternative to the well-known CNN+LSTM architectures. Our model is trained with the CTC loss and shows competitive results on both the RIMES and IAM datasets. We release all code to enable reproduction of our experiments: https://github.com/FactoDeepLearning/LinePytorchOCR.

show abstract

Section: Architecturementioning

confidence: 96%

Section: A Recurrent Neural Network (Rnn)mentioning

confidence: 99%

Section: Architecturementioning

confidence: 99%

See 1 more Smart Citation

Recurrence-free unconstrained handwritten text recognition using gated fully convolutional network

Coquenet

Chatelain

Paquet

2020

2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR)

View full text Add to dashboard Cite

show abstract

“…Recently, there has been a trend in handwritten text recognition with deep neural networks to replace 2D recurrent layers with 1D, and in some cases even completely remove the recurrent layers, relying on simple feed-forward convolutional only architectures. A more detailed discussion of that can be found in the 2018 paper of Moysset and Messina (2017). On the other hand, those two authors show that 2D-LSTM networks still seem to provide the highest performances.…”

Section: Literature Reviewmentioning

confidence: 99%

Neural networks and arbitrage in the VIX

Osterrieder

Kucharczyk

Rudolf³

et al. 2020

Digit Finance

View full text Add to dashboard Cite

The Chicago Board Options Exchange Volatility Index (VIX) is considered by many market participants as a common measure of market risk and investors’ sentiment, representing the market’s expectation of the 30-day-ahead looking implied volatility obtained from real-time prices of options on the S&P 500 index. While smaller deviations between implied and realized volatility are a well-known stylized fact of financial markets, large, time-varying differences are also frequently observed throughout the day. Furthermore, substantial deviations between the VIX and its futures might lead to arbitrage opportunities on the VIX market. Arbitrage is hard to exploit as the potential strategy to exploit it requires buying several hundred, mostly illiquid, out-of-the-money (put and call) options on the S&P 500 index. This paper discusses a novel approach to predicting the VIX on an intraday scale by using just a subset of the most liquid options. To the best of the authors’ knowledge, this the first paper, that describes a new methodology on how to predict the VIX (to potentially exploit arbitrage opportunities using VIX futures) using most recently developed machine learning models to intraday data of S&P 500 options and the VIX. The presented results are supposed to shed more light on the underlying dynamics in the options markets, help other investors to better understand the market and support regulators to investigate market inefficiencies.

show abstract

“…These models are configured to minimize the connectionist temporal classification (CTC) cost function proposed by Graves in [17]. In some works 2D-LSTM [16] networks are used [18]- [21]. This RNN has two main drawbacks.…”

Section: Related Work and Contributions A Dnn Modelmentioning

confidence: 99%

Boosting Offline Handwritten Text Recognition in Historical Documents With Few Labeled Lines

2021

View full text Add to dashboard Cite

In this paper we address the problem of offline handwritten text recognition (HTR) in historical documents when few labeled samples are available and some of them contain errors in the train set. Our three main contributions are: first, we analyze how to perform transfer learning (TL) from a massive database to a smaller historical database, analyzing which layers of the model need fine-tuning. Second, we analyze methods to efficiently combine TL and data augmentation (DA). Finally, we propose an algorithm to mitigate the effects of incorrect labeling in the training set. The methods are analyzed over the ICFHR 2018 competition database, Washington and Parzival. Combining all these techniques, we demonstrate a remarkable reduction of CER (up to 6 percentage points in some cases) in the test set with little complexity overhead.INDEX TERMS connectionist temporal classification (CTC), convolutional neural networks (CNN), data augmentation (DA), deep neural networks (DNN), historical documents, long-short-term-memory (LSTM), offline handwriting text recognition (HTR), outlier detection; transfer learning.

show abstract

Are 2D-LSTM really dead for offline text recognition?

Cited by 26 publications

References 27 publications

Recurrence-free unconstrained handwritten text recognition using gated fully convolutional network

Recurrence-free unconstrained handwritten text recognition using gated fully convolutional network

Neural networks and arbitrage in the VIX

Boosting Offline Handwritten Text Recognition in Historical Documents With Few Labeled Lines

Contact Info

Product

Resources

About