2022
DOI: 10.3390/math11010177

An End-to-End Formula Recognition Method Integrated Attention Mechanism

Abstract: Formula recognition is widely used in intelligent document processing and can significantly shorten the time needed to input mathematical formulas, but the accuracy of traditional methods leaves room for improvement. To reduce the complexity of formula input, an end-to-end encoder-decoder framework with an attention mechanism is proposed that converts formulas in images into LaTeX sequences. The Vision Transformer (ViT) is employed as the encoder to convert the original input image into a set of semantic vectors. …

Citations: cited by 4 publications (13 citation statements)
References: 25 publications
“…In recent years, these systems have evolved rapidly through deep learning. Usually, transformer-based approaches [2][3][4] have proven to outperform traditional statistical models [15,16] and convolutional neural networks [5,6,17,18,19]. These neural networks are able to learn and recognize intricate patterns and features within images automatically, making them particularly well-suited for accurately extracting text with subscripts such as mathematical formulas from scanned documents or images [20].…”
Section: Mathematical Expression Recognition (mentioning)
confidence: 99%
“…Figure 1 illustrates the two possible processes for transcribing mathematical formulas. In the first approach, the image of the formula is converted into LaTeX code using mathematical expression recognition (MER) [2][3][4][5][6]. This LaTeX code then serves as input for a natural language processing (NLP) model, which produces the transcription of the formula.…”
Section: Introduction (mentioning)
confidence: 99%
“…Histogram equalization techniques, such as adaptive histogram equalization (AHE) and contrast-limited adaptive histogram equalization (CLAHE), have been widely used. Recent research has explored integrating deep learning models, such as U-Net and Pix2Pix networks, for adaptive contrast enhancement [25]. These approaches have demonstrated their effectiveness in handling varying illumination conditions and improving OCR performance.…”
Section: Preprocessing Techniques For Image Enhancement (mentioning)
confidence: 99%
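
As a rough illustration of the histogram equalization idea mentioned in the statement above, the sketch below applies plain global equalization to a small grayscale image. This is the simpler baseline that AHE and CLAHE refine by working on local tiles (and, for CLAHE, clipping the histogram); it is not the method of any cited paper. The function name `equalize_histogram` and the list-of-rows image representation are assumptions made for the example.

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization for a grayscale image.

    `image` is a list of rows of integer intensities in [0, levels - 1].
    Hypothetical helper for illustration only; AHE/CLAHE apply the same
    remapping per local tile rather than once for the whole image.
    """
    pixels = [p for row in image for p in row]
    total = len(pixels)

    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: nothing to stretch.
        return [row[:] for row in image]

    # Map each intensity so the output CDF is approximately linear.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


low_contrast = [[50, 50], [100, 150]]
stretched = equalize_histogram(low_contrast)
# The darkest value maps to 0 and the brightest to 255.
```

In this toy input the intensities occupy only the range 50 to 150; after equalization they span the full 0 to 255 range, which is the contrast stretch the statement refers to.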
“…Binarization and thresholding techniques are also critical in OCR preprocessing. Binarization converts grayscale or color images into binary representations, separating foreground characters from the background [25]. Various thresholding techniques, including global thresholding, local adaptive thresholding, and hybrid methods, have been proposed to address different image characteristics and challenges.…”
Section: Preprocessing Techniques For Image Enhancement (mentioning)
confidence: 99%
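
To make the global thresholding case from the statement above concrete, here is a minimal sketch using Otsu's method, one common way to pick a single global threshold. The quoted text does not name a specific algorithm, so the choice of Otsu is an assumption, as are the helper names `otsu_threshold` and `binarize` and the convention that bright pixels map to 1.

```python
def otsu_threshold(image, levels=256):
    """Pick a global threshold by maximizing between-class variance (Otsu).

    `image` is a list of rows of integer intensities in [0, levels - 1].
    Hypothetical helper for illustration only.
    """
    pixels = [p for row in image for p in row]
    total = len(pixels)

    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of their intensities

    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no pixels in the lower class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # all pixels are in the lower class

        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t

    return best_t


def binarize(image, threshold):
    """Map pixels above the threshold to 1 and all others to 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


scan = [[10, 12, 200], [11, 210, 205]]
t = otsu_threshold(scan)
mask = binarize(scan, t)
```

Local adaptive thresholding follows the same pattern but computes a separate threshold per neighborhood, which helps when illumination varies across the page. For dark text on a light background the mask polarity would typically be inverted.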