HAMEX - A Handwritten and Audio Dataset of Mathematical Expressions

Quiniou, Solen; Mouchère, Harold; Peña Saldarriaga, Sebastián; Viard-Gaudin, Christian; Morin, Emmanuel; Petitrenaud, Simon; Medjkoune, Sofiane

doi:10.1109/icdar.2011.97

Cited by 25 publications

(15 citation statements)

References 6 publications

“…Some of the combination techniques proposed in our research are adaptations of well known methods such as weighted combination, Borda count [54], and other decision combination techniques [18]. In the context of handwriting and speech recognition, classifier combination techniques have also been used to improve the recognition accuracy of handwriting recognizers [14,41,48,59] as well as speech recognizers [12].…”

Section: Classifier Combination (mentioning)

confidence: 99%

“…However, issues such as ambiguity detection and A/V synchronization are not considered in the aforementioned research. Another relevant research effort, closely tied to [33], relates to the creation of a data set with handwritten and spoken mathematical content [41]. Unfortunately, this data set consists of static image segments containing handwritten content, with the corresponding audio stored in separate files.…”

Section: Audio-video Based Content Recognition (mentioning)

confidence: 99%

Audio-video based character recognition for handwritten mathematical content in classroom videos

Vemulapalli

Hayes

2014

ICA

Recognizing handwritten equations is a challenging problem, and even more so when they are written in a classroom environment. However, since videos of the handwritten text and the accompanying audio refer to the same content, a combination of video and audio based recognition has the potential to significantly improve the recognition accuracy. In this paper, using a combination of video and audio based recognizers, we focus on improving the character recognition accuracy for handwritten mathematical content in videos using audio and propose an end-to-end recognition system. The system includes components for video preprocessing, selecting the characters that may benefit from audio-video based combination, establishing a correspondence between handwritten and the spoken content, and finally combining the recognition results from the audio and video based recognizers. The current implementation of the system makes use of a modified open source text recognizer and a commercially available phonetic word spotter. For evaluation purposes, we use videos recorded in a classroom-like environment and our experiments demonstrate the significant improvements in character recognition accuracy that can be achieved using our techniques.

“…Training and test data for CROHME 2012 along with related tools were available from the International Association of Pattern Recognition (IAPR) 4 . For Part 4, the training data includes thousands of expressions from existing handwritten expression datasets, including (i) MathBrush (University of Waterloo) [18], (ii) HAMEX (University of Nantes) [19], (iii) MfrDB (Czech Technical University) [5], (iv) ExpressMatch (University of Sao Paulo) [20] and (v) the KAIST dataset. Due to differences in legal symbols and layouts, not all expressions in these data sets were consistent with the Part 4 grammar.…”

Section: A. Datasets and Expression Encodings (mentioning)

confidence: 99%

The Problem of Handwritten Mathematical Expression Recognition Evaluation

Awal

Mouchère

Viard-Gaudin

2010

2010 12th International Conference on Frontiers in Handwriting Recognition

Self Cite

We report on the third international Competition on Handwritten Mathematical Expression Recognition (CROHME), in which eight teams from academia and industry took part. For the third CROHME, the training dataset was expanded to over 8000 expressions, and new tools were developed for evaluating performance at the level of strokes as well as expressions and symbols. As an informal measure of progress, the performance of the participating systems on the CROHME 2012 data set is also reported. Data and tools used for the competition will be made publicly available.

“…The presence of several accessible corpora for the recognition enable this domain and it is useful for many fields, such as the field of Latin mathematical formula recognition. This field presents a datasets that facilitates the progress of this domain like the HAMEX [4]. The HAMEX is a public dataset that contains mathematical expressions in their handwritten form and in their audio spoken form.…”

Section: Introduction (mentioning)

confidence: 99%

Database of Handwritten Arabic Mathematical Formula Images

Ali

Mahjoub

2016

2016 13th International Conference on Computer Graphics, Imaging and Visualization (CGiV)

Although publicly available, ground-truthed databases have proven useful for training, evaluating, and comparing recognition systems in many domains, the availability of such databases for handwritten Arabic mathematical formula recognition in particular is currently quite poor. In this paper, we present a new public database that contains off-line handwritten mathematical expressions. We describe the different steps taken to acquire this database, from the collection of the mathematical expression corpora to the transcription of the collected data. Currently, the database contains 4,238 off-line handwritten mathematical expressions written by 66 writers and 20,300 handwritten isolated symbol images. The ground truth is also provided for the handwritten expressions as XML files with the number of symbols and the MathML structure.
