Proceedings of the 4th Spanish Conference on Information Retrieval 2016
DOI: 10.1145/2934732.2934739

Automatic classification of web images as UML diagrams

Abstract: Our purpose in this research is to develop a methodology to automatically and efficiently classify web images as UML static diagrams, and to produce a computer tool that implements this function. The tool receives as input a bitmap file (in different formats) and tells whether the image corresponds to a diagram. The tool does not require that the images are explicitly or implicitly tagged as UML diagrams. The tool extracts graphical characteristics from each image (such as grayscale histogram, color histogram …

Citations: cited by 10 publications (10 citation statements)
References: 14 publications
“…First, we obtained a sample of nearly 19,000 images from the web (exactly 18,899 items) that resulted from queries in Google Images involving the terms "uml diagram"; then a team of experts manually classified the images as UML static diagrams (Yes/No). The reader is referred to our previous conference paper [9] for the details regarding the criteria followed in this manual classification. Second, we analyzed the main graphical characteristics of images that represent diagrams, such as grayscale histogram, color histogram, elementary geometric forms detected with image recognition techniques (especially rectangles), and so on.…”
Section: Motivation and Outline Of Methods (citation type: mentioning)
confidence: 99%
“…In our previous conference paper [9], we explained with many examples how a small set of simple graphical features that are very easy to compute can be effectively and efficiently used to classify a given image as a diagram, so that we can avoid the more time consuming identification of other features, such as text or complex geometrical forms. These simple features include the number of different gray tones and colors (histograms), as well as the number of vertical and horizontal straight polylines (solid or dashed), a special case of which is the rectangle, by far the most common geometrical form in UML diagrams.…”
Section: Analysis Of Graphical Characteristics Of Diagrammatic Images (citation type: mentioning)
confidence: 99%
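
The simple features named in the statement above (number of distinct gray tones and colors, number of horizontal and vertical straight segments) can be illustrated with a short sketch. This is not the authors' tool or algorithm: the image representation (nested lists of RGB tuples), the function names, the darkness threshold of 128, and the minimum segment length of 5 pixels are all assumptions chosen for illustration, and solid runs of dark pixels stand in for the polyline detection the statement mentions.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0..255
Image = List[List[Pixel]]      # rows of pixels (hypothetical in-memory format)


def to_gray(p: Pixel) -> int:
    """Convert an RGB pixel to a gray level using integer BT.601 luma weights."""
    r, g, b = p
    return (299 * r + 587 * g + 114 * b) // 1000


def count_colors(img: Image) -> int:
    """Number of distinct RGB colors, i.e. non-empty color histogram bins."""
    return len({p for row in img for p in row})


def count_gray_tones(img: Image) -> int:
    """Number of distinct gray levels, i.e. non-empty grayscale histogram bins."""
    return len({to_gray(p) for row in img for p in row})


def count_runs(lines: List[List[bool]], min_len: int) -> int:
    """Count maximal runs of True values with length >= min_len in each line."""
    runs = 0
    for line in lines:
        length = 0
        for dark in line:
            if dark:
                length += 1
            else:
                if length >= min_len:
                    runs += 1
                length = 0
        if length >= min_len:  # run that reaches the end of the line
            runs += 1
    return runs


def count_straight_segments(
    img: Image, dark_threshold: int = 128, min_len: int = 5
) -> Tuple[int, int]:
    """Return (horizontal, vertical) counts of solid dark straight segments.

    dark_threshold and min_len are illustrative assumptions, not published values.
    """
    mask = [[to_gray(p) < dark_threshold for p in row] for row in img]
    horizontal = count_runs(mask, min_len)
    vertical = count_runs([list(col) for col in zip(*mask)], min_len)
    return horizontal, vertical


# Demo: a 10x10 white image containing one black rectangle outline.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
demo = [[WHITE] * 10 for _ in range(10)]
for i in range(2, 8):
    demo[2][i] = demo[7][i] = BLACK   # top and bottom edges
    demo[i][2] = demo[i][7] = BLACK   # left and right edges

print(count_colors(demo), count_gray_tones(demo), count_straight_segments(demo))
```

A rectangle contributes two horizontal and two vertical segments, which is why segment counts of this kind are cheap indicators of box-heavy diagrams. Any decision rule built on these counts would need thresholds derived from labeled data, which this sketch does not attempt.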
“…Moreno et al conducted a similar study to classify web images as UML and non-UML class diagrams using a rule based approach. By extracting features from the images, in a corpus of 19000 web images, their algorithm reached an accuracy of 95% [9].…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…In 2015 Hjaltason et al [19] utilized support vector machines (SVMs) trained on a corpus of 1300 UML and non-UML images, producing an average classification accuracy of 92.05%. Moreno et al addressed the high processing times in these previous approaches by proposing a rule based approach [20]. These rules were extracted from a corpus of nearly 19,000 web images (UML and non-UML), and used a training set of 715 images to identify UML images with a 95% accuracy.…”
Section: Related Work (citation type: mentioning)
confidence: 99%