2019
DOI: 10.1007/978-3-030-10925-7_2

Image-to-Markup Generation via Paired Adversarial Learning

Cited by 42 publications (27 citation statements); References 17 publications.
“…Our formula detector is based on graph-theoretic methods for determining the position of multi-character formulae, in combination with statistical and context-recognition-based approaches for detecting single-character mathematical symbols inside text [12]. [1,2] CASIA/NLPR PAL Group: We entered two systems: PAL, and PAL-v2, which extends our previous work [13]. The attention-based encoder-decoder model in PAL-v2 is trained using official data only.…”
Section: Participating Methods (mentioning; confidence: 99%)
“…We augment the training data set using rotations, perspective shift, distortion, and bevel, as well as the decomposition operation introduced by Le et al. [8]. This expanded the training data to 330k images, which were then used for Paired Adversarial Learning [13]. An ensemble of 6 models with different initializations produced the PAL-v2 results.…”
Section: Participating Methods (mentioning; confidence: 99%)
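The augmentation pipeline quoted above (rotations, perspective shift, distortion, bevel) can be approximated with standard image-processing transforms. The following is a minimal sketch using torchvision, with assumed parameter values and a shear transform standing in for the distortion/bevel steps; it is not the actual PAL-v2 pipeline, and the decomposition operation of Le et al. is omitted.

```python
# Hypothetical augmentation sketch for formula images (illustrative only):
# rotation and perspective shift via standard torchvision transforms, with a
# shear as a rough stand-in for the distortion/bevel operations.
from PIL import Image
import torchvision.transforms as T

augment = T.Compose([
    T.RandomRotation(degrees=10, fill=255),                      # small in-plane rotations
    T.RandomPerspective(distortion_scale=0.3, p=0.5, fill=255),  # perspective shift
    T.RandomAffine(degrees=0, shear=10, fill=255),               # shear as a distortion proxy
])

def expand_dataset(images, copies_per_image=3):
    """Return each original image plus several augmented copies of it."""
    out = []
    for img in images:
        out.append(img)
        out.extend(augment(img) for _ in range(copies_per_image))
    return out

# Example: expanded = expand_dataset([Image.open("formula.png").convert("L")])
```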
“…Extending traditional text recognition, some authors transform images of tables (Zhong et al, 2019;Deng et al, 2019) and mathematical formulas (Deng et al, 2017;Wu et al, 2018) into their LaTeX or HTML representations. After applying a convolutional encoder to the input image, they use a forward RNN based decoder to generate tokens in the target language.…”
Section: Structured Language Generation (mentioning; confidence: 99%)
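The pattern described in that statement, a convolutional encoder over the image followed by a recurrent decoder that emits markup tokens, can be sketched generically. This is a simplified illustration with assumed layer sizes and no attention mechanism (which the cited approaches do use), not the architecture of the PAL paper or of the other cited systems.

```python
# Minimal sketch of an image-to-markup encoder-decoder (assumed shapes and
# sizes): a CNN encodes the formula image into a feature vector, and a GRU
# decoder emits markup tokens conditioned on that vector.
import torch
import torch.nn as nn

class ImageToMarkup(nn.Module):
    def __init__(self, vocab_size, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(                      # CNN encoder
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                       # pool to one vector per image
        )
        self.embed = nn.Embedding(vocab_size, hidden)
        self.init_h = nn.Linear(128, hidden)               # image feature -> initial GRU state
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, images, token_ids):
        """Teacher-forced training pass: images (B,1,H,W), token_ids (B,T)."""
        feat = self.encoder(images).flatten(1)                  # (B, 128)
        h0 = torch.tanh(self.init_h(feat)).unsqueeze(0)         # (1, B, hidden)
        dec_out, _ = self.decoder(self.embed(token_ids), h0)    # (B, T, hidden)
        return self.out(dec_out)                                # (B, T, vocab) logits

# Example: logits = ImageToMarkup(vocab_size=500)(
#     torch.rand(2, 1, 64, 256), torch.randint(0, 500, (2, 20)))
```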
“…[12] proposed a coarse-to-fine attention to improve efficiency. In addition, [31] introduced a PAL model and employed an adversarial learning strategy during training.…”
Section: Attention-Based Encoder-Decoder Approaches for HMER (mentioning; confidence: 99%)
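The adversarial strategy mentioned in that statement can be illustrated schematically: a discriminator is trained to separate encoder features coming from two domains (for example handwritten images and paired printed templates), while the encoder is trained to fool it, pushing the two feature distributions together. The sketch below uses assumed module interfaces and a plain BCE objective; it is a generic adversarial feature-alignment step, not the exact formulation of the PAL paper.

```python
# Schematic adversarial feature-alignment step (assumed setup, not the exact
# PAL objective): discriminator D tries to separate features of handwritten
# images from features of paired printed templates; the encoder is then
# updated so that handwritten features fool D.
import torch
import torch.nn as nn

def adversarial_step(encoder, D, opt_enc, opt_d, handwritten, printed):
    """encoder: images -> (B, feat_dim); D: features -> (B, 1) logits."""
    bce = nn.BCEWithLogitsLoss()
    f_hw, f_pr = encoder(handwritten), encoder(printed)

    # 1) Update the discriminator: printed -> 1, handwritten -> 0.
    d_loss = bce(D(f_pr.detach()), torch.ones(f_pr.size(0), 1)) + \
             bce(D(f_hw.detach()), torch.zeros(f_hw.size(0), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Update the encoder so handwritten features are classified as printed.
    g_loss = bce(D(f_hw), torch.ones(f_hw.size(0), 1))
    opt_enc.zero_grad(); g_loss.backward(); opt_enc.step()
    return d_loss.item(), g_loss.item()
```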
“…The system UPV denotes the best system among all submissions to the CROHME 2014 competition, while the system Wiris denotes the best system among all submissions to the CROHME 2016 competition (only using the official training dataset); details can be found in [47,48]. Details of WYGIWYS and PAL can be found in [12] and [31], respectively. Please note that the results of the end-to-end approaches are not exactly comparable with the traditional approaches submitted to the CROHME competitions, as the segmentation error is not explicitly considered.…”
Section: Evaluation of Multi-modal Scan (Q2) (mentioning; confidence: 99%)