Falk Böschen scite author profile

Abstract. So far, there has not been a comparative evaluation of different approaches for text extraction from scholarly figures. In order to fill this gap, we have defined a generic pipeline for text extraction that abstracts from the existing approaches as documented in the literature. In this paper, we use this generic pipeline to systematically evaluate and compare 32 configurations for text extraction over four datasets of scholarly figures of different origin and characteristics. In total, our experiments have been run over more than 400 manually labeled figures. The experimental results show that the approach BS-4OS results in the best F-measure of 0.67 for the Text Location Detection and the best average Levenshtein Distance of 4.71 between the recognized text and the gold standard on all four datasets using the Ocropy OCR engine.
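The two metrics named in this abstract can be illustrated with a minimal sketch. It uses the textbook definitions of Levenshtein distance and of the balanced F-measure (F1). Treating the abstract's "F-measure" as F1 is an assumption, and the function names `levenshtein` and `f_measure` are hypothetical. The abstract does not say how detected text regions are matched to the gold standard or how distances are averaged, so this is not the authors' evaluation code.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to limit row size.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.
    Assumption: counts come from some region-matching step not shown here."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical OCR output compared with a hypothetical gold label.
    print(levenshtein("Accurcy", "Accuracy"))
    # Hypothetical detection counts: 2 correct, 1 spurious, 1 missed region.
    print(f_measure(2, 1, 1))
```

A lower average Levenshtein distance means the recognized text is closer to the gold standard, while a higher F-measure means better text location detection.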

Text Localization in Scientific Figures using Fully Convolutional Neural Networks on Limited Training Data

Jessen

Böschen

Scherp

2019

What to Read Next? Challenges and Preliminary Results in Selecting Representative Documents

Beck

Böschen

Scherp

2018

Abstract. The vast amount of scientific literature poses a challenge when one is trying to understand a previously unknown topic. Selecting a representative subset of documents that covers most of the desired content can address this challenge by presenting the user with a small subset of documents. We build on existing research on representative subset extraction and apply it in an information retrieval setting. Our document selection process consists of three steps: computation of the document representations, clustering, and selection of documents. We implement and compare two different document representations, two different clustering algorithms, and three different selection methods using a coverage and a redundancy metric. We execute our 36 experiments on two datasets, with 10 sample queries each, from different domains. The results show that there is no clear favorite and that we need to ask whether coverage and redundancy are sufficient for evaluating representative subsets.

Evaluation of the Comprehensiveness of Bar Charts with and without Stacking Functionality using Eye-Tracking

Böschen

Strobel

Goos

et al. 2017

Bar charts are widely used to visualize core results of experiments in research papers or to display statistics in news, media, and other reports. However, visualizations like bar charts are mostly manually designed, static presentations of data without the option of adaptation to a user's needs. So far, it is unknown whether interactivity improves the understanding of charts. In this work, we compare static with dynamic bar charts, which offer an interactive stacking option. We assess the efficiency, effectiveness, and satisfaction when answering questions regarding the content of a bar chart. An eye-tracker is used to measure the efficiency. We have conducted a between-group experiment with 38 participants. While one group had to solve the aggregation tasks using stackable, i.e., interactive bar charts, the other group was limited to static visualizations. Even though new interactive features require familiarization, we found that the stacking feature significantly helps in completing the tasks with respect to efficiency, effectiveness, and satisfaction for bar charts of varying complexity.

Falk Böschen

Multi-oriented Text Extraction from Information Graphics

A Comparison of Approaches for Automated Text Extraction from Scholarly Figures
