2021
DOI: 10.1007/978-3-030-86331-9_37
Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer

Cited by 53 publications (40 citation statements)
References 27 publications
Citation types: 1 supporting, 32 mentioning, 0 contrasting
“…To be consistent with the reported inference processes, DWAP-TD [42] and BTTR [46] use beam search, while DWAP [39] does not. Specifically, as shown in Table 3, our method outperforms BTTR [46] by 1.6% on the easy subset. Moreover, as the difficulty of the test subset increases, the leading margin of our method grows to 5.5% on the hard subset.…”
Section: Comparisons With State-of-the-arts
confidence: 99%
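
The statement above contrasts beam-search decoding (DWAP-TD [42], BTTR [46]) with greedy decoding (DWAP [39]). Below is a minimal sketch of the difference, assuming a hypothetical `score_next(prefix)` function that returns log-probabilities over the token vocabulary; the function and all names are illustrative, not drawn from any of the cited implementations:

```python
from typing import Callable, List, Tuple

def beam_search(
    score_next: Callable[[List[int]], List[float]],  # log-probs over the vocab, given a prefix
    bos: int,
    eos: int,
    beam_size: int = 10,
    max_len: int = 200,
) -> List[int]:
    """Keep the beam_size highest-scoring prefixes at each decoding step.

    Greedy decoding is the special case beam_size == 1.
    """
    beams: List[Tuple[float, List[int]]] = [(0.0, [bos])]  # (cumulative log-prob, tokens)
    finished: List[Tuple[float, List[int]]] = []
    for _ in range(max_len):
        candidates: List[Tuple[float, List[int]]] = []
        for logp, prefix in beams:
            for tok, tok_logp in enumerate(score_next(prefix)):
                candidates.append((logp + tok_logp, prefix + [tok]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for logp, seq in candidates[:beam_size]:
            if seq[-1] == eos:
                finished.append((logp / len(seq), seq))  # length-normalized final score
            else:
                beams.append((logp, seq))
        if not beams:
            break
    pool = finished or [(logp / len(seq), seq) for logp, seq in beams]
    return max(pool, key=lambda c: c[0])[1]
```

A larger beam trades decoding time for a better approximation of the highest-probability output sequence, which is one reason reported accuracies depend on whether beam search was used at inference.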
“…Most HMER methods adopt the sequence-to-sequence approach. The authors in [8,15,16,23,27,29,39,40,43,46] proposed attention-based sequence-to-sequence models that convert handwritten mathematical expression images into the representational markup language LaTeX. Recently, Wu et al. [31] designed a graph-to-graph (G2G) model that exploits the structural relationship between the input formula and the output markup, which significantly improves performance.…”
Section: Related Work
confidence: 99%
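
The models summarized above share one mechanism: a convolutional encoder maps the expression image to a feature grid, and a decoder attends over that grid while emitting LaTeX tokens. A minimal PyTorch-style sketch of that mechanism follows; the layer shapes and names are assumptions for illustration, not the architecture of any cited model (DWAP [39], for example, uses a DenseNet encoder with coverage attention):

```python
import torch
import torch.nn as nn

class AttnSeq2Seq(nn.Module):
    """Minimal encoder-decoder with additive attention for image -> LaTeX tokens."""

    def __init__(self, vocab_size: int, feat_dim: int = 256, hid: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(  # toy CNN: grayscale image -> feature grid
            nn.Conv2d(1, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.embed = nn.Embedding(vocab_size, hid)
        self.rnn = nn.GRUCell(hid + feat_dim, hid)
        self.attn_q = nn.Linear(hid, hid)       # query from decoder state
        self.attn_k = nn.Linear(feat_dim, hid)  # keys from image features
        self.attn_v = nn.Linear(hid, 1)         # additive attention score
        self.out = nn.Linear(hid, vocab_size)

    def forward(self, images: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        feats = self.encoder(images).flatten(2).transpose(1, 2)  # (B, H*W, feat_dim)
        h = feats.new_zeros(images.size(0), self.rnn.hidden_size)
        logits = []
        for t in range(tokens.size(1)):  # teacher forcing over the target sequence
            scores = self.attn_v(torch.tanh(
                self.attn_q(h).unsqueeze(1) + self.attn_k(feats))).squeeze(-1)
            ctx = (scores.softmax(-1).unsqueeze(-1) * feats).sum(1)  # attended context
            h = self.rnn(torch.cat([self.embed(tokens[:, t]), ctx], -1), h)
            logits.append(self.out(h))
        return torch.stack(logits, 1)  # (B, T, vocab)
```

Training would apply cross-entropy between the returned per-step logits and the ground-truth LaTeX token sequence.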
“…The transformer architecture [5] was first introduced in the context of machine translation. Since then, transformer-based models have proven their robustness across a wide variety of computer vision tasks, such as mathematical expression recognition from images [6] and image classification [7]. They have also shown promising results for single text line recognition [8], [9] and scene text recognition [10].…”
confidence: 99%
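
The indexed paper (BTTR) combines such a transformer decoder with bidirectional training: each LaTeX target is learned both left-to-right and right-to-left by a single decoder. A minimal sketch of how the paired targets could be constructed, assuming <sos>/<eos> marker tokens (the paper's exact token conventions may differ):

```python
from typing import List, Tuple

SOS, EOS = "<sos>", "<eos>"

def bidirectional_targets(latex_tokens: List[str]) -> Tuple[List[str], List[str]]:
    """Build the two decoder targets used for bidirectional training:
    the same expression read left-to-right and right-to-left."""
    l2r = [SOS] + latex_tokens + [EOS]
    r2l = [EOS] + latex_tokens[::-1] + [SOS]
    return l2r, r2l

# Example: "x ^ { 2 }" produces
#   l2r: ['<sos>', 'x', '^', '{', '2', '}', '<eos>']
#   r2l: ['<eos>', '}', '2', '{', '^', 'x', '<sos>']
l2r, r2l = bidirectional_targets(["x", "^", "{", "2", "}"])
```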