Improving Small Object Proposals for Company Logo Detection

Eggert, Christian; Zecha, Dan; Brehm, Stephan; Lienhart, Rainer

doi:10.1145/3078971.3078990

Cited by 66 publications

(38 citation statements)

References 18 publications

“…It is essential to determine what size of the anchor box is suitable for each scale of the network. Inspired by the method of the proposal generation [33], we used mathematical derivation based on IOU to help select the appropriate size of the anchor boxes for each scale.…”

Section: Appropriate Size For Anchor Boxes (mentioning)

confidence: 99%

The Application of Improved YOLO V3 in Multi-Scale Target Detection

Luo, Wang, et al. 2019. Applied Sciences

Target detection is one of the most important research directions in computer vision. Recently, a variety of target detection algorithms have been proposed. Since the targets have varying sizes in a scene, it is essential to be able to detect the targets at different scales. To improve the detection performance of targets with different sizes, a multi-scale target detection algorithm was proposed involving improved YOLO (You Only Look Once) V3. The main contributions of our work include: (1) a mathematical derivation method based on Intersection over Union (IOU) was proposed to select the number and the aspect ratio dimensions of the candidate anchor boxes for each scale of the improved YOLO V3; (2) To further improve the detection performance of the network, the detection scales of YOLO V3 have been extended from 3 to 4 and the feature fusion target detection layer downsampled by 4× is established to detect the small targets; (3) To avoid gradient fading and enhance the reuse of the features, the six convolutional layers in front of the output detection layer are transformed into two residual units. The experimental results upon PASCAL VOC dataset and KITTI dataset show that the proposed method has obtained better performance than other state-of-the-art target detection algorithms.

“…This deletes most of the cobbles and fine boulder information from the data if large scale mosaics were fed directly into the network. Hence, the smaller tiles exported from the backscatter mosaics were upscaled to values between 300 and 1200 pixels, which is the simplest approach to facilitate small object detection [30,31]. The size of anchor boxes used to determine the bounding box of objects were left at their standard settings of 32, 64, 128, For classification and object detection, we use an open source RetinaNet [27] implementation in Python, available on GitHub (https://github.com/fizyr/keras-retinanet, last accessed on 6 February 2019).…”

Section: Preparation Of Train Validation and Test Datasets (mentioning)

confidence: 99%

“…Therefore, it is mandatory to detect objects of the smallest possible size and to consider the minimum object size detectable by the trained models. The minimum size of objects whose detection can be trained by RetinaNet depends on a) the resolution of the input backscatter mosaic and b) the minimum anchor box of the network measured in pixels multiplied by the threshold of areal overlap of 0.5 required for a positive training [30,31]. For a minimum anchor box of 32 pixels, this results in a theoretical minimum threshold for positive training of 23 × 23 pixels.…”

Section: Constraining the Minimum Size Of Detected Boulders (mentioning)

confidence: 99%
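
The 23 × 23 pixel figure quoted above can be reproduced with a short calculation. The sketch below is an illustration, not code from the cited paper: it assumes a square object lying entirely inside a square anchor, so that the IoU reduces to the ratio of the two areas. The function name `min_trainable_side` is hypothetical.

```python
import math


def min_trainable_side(anchor_side: float, iou_threshold: float = 0.5) -> float:
    """Smallest square object side (in pixels) that still reaches
    `iou_threshold` against a square anchor of side `anchor_side`.

    Assumption: the object lies fully inside the anchor, so
    IoU = object_area / anchor_area = (side / anchor_side) ** 2.
    """
    return anchor_side * math.sqrt(iou_threshold)


side = min_trainable_side(32, 0.5)
print(round(side, 1))   # 22.6
print(math.ceil(side))  # 23, i.e. a 23 x 23 pixel object
```

Under these assumptions the minimum object area is 32 × 32 × 0.5 = 512 pixels, and rounding the side length up to a whole pixel gives 23. Objects that are offset from the anchor or have a different aspect ratio would need to be larger to reach the same IoU.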

“…However, the bounding box threshold criterion of 0.5 may still be missed for these size classes in case of different alignment of bounding boxes. In addition, it needs to be considered that manually digitized bounding boxes are necessarily inaccurate at the pixel-level and may not be reproduced by the model, and errors on bounding boxes effect smaller objects more than larger objects [30]. For 7 × 7 pixel bounding boxes (3.1 m² at 0.25 m resolution), a detection appears feasible following the upscaling, as can be observed by the increased detection frequency by the 25 m² model rivaling the human interpreter (Figure 5).…”

Section: Constraining the Minimum Size Of Detected Boulders (mentioning)

confidence: 99%

Detection of Boulders in Side Scan Sonar Mosaics by a Neural Network

Feldens, Darr, Feldens, et al. 2019. Geosciences

Boulders provide ecologically important hard grounds in shelf seas, and form protected habitats under the European Habitats Directive. Boulders on the seafloor can usually be recognized in backscatter mosaics due to a characteristic pattern of high backscatter intensity followed by an acoustic shadow. The manual identification of boulders on mosaics is tedious and subjective, and thus could benefit from automation. In this study, we train an object detection framework, RetinaNet, based on a neural network backbone, ResNet, to detect boulders in backscatter mosaics derived from a sidescan-sonar operating at 384 kHz. A training dataset comprising 4617 boulders and 2005 negative examples similar to boulders was used to train RetinaNet. The trained model was applied to a test area located in the Kriegers Flak area (Baltic Sea), and the results compared to mosaic interpretation by expert analysis. Some misclassification of water column noise and boundaries of artificial plough marks occurs, but the results of the trained model are comparable to the human interpretation. While the trained model correctly identified a higher number of boulders, the human interpreter had an advantage at recognizing smaller objects comprising a bounding box of less than 7 × 7 pixels. Almost identical performance between the best model and expert analysis was found when classifying boulder density into three classes (0, 1-5, more than 5) over 10,000 m² areas, with the best performing model reaching an agreement with the human interpretation of 90%.

“…The AlexNet (Krizhevsky, Sutskever, & Hinton, 2012), VGG (Simonyan, & Zisserman, 2014), GoogLeNet (Szegedy et al, 2015), and ALL-CNN (Springenberg, Dosovitskiy, Brox, & Riedmiller, 2014) incorporated deep structures with variants of layers and have achieved remarkable performance in the classification of a large number of categories. In terms of image detection, the Faster region-based CNN (Faster R-CNN) (Ren, He, Girshick, & Sun, 2015) has been proved as an efficient detection method for small object, such as ship detection from SAR images (Kang, Leng, Lin, & Ji, 2017), company logo detection from real-world images (Eggert, Zecha, Brehm, & Lienhart, 2017), cancer cell detection (Zhang, Hu, Chen, Huang, & Guan, 2016) and gland instance detection (Xu et al, 2017) from microscopic images. Until now, relatively few studies were performed to identify M. tuberculosis from microscopic images using CNNs, except for chest X-ray TB image studies (Cao et al, 2016; Silva, Silva, Pinho, & Costa, 2017).…”

Section: Introduction (mentioning)

confidence: 99%

An effective and accurate identification system of Mycobacterium tuberculosis using convolution neural networks

Kuok, Horng, Liao, et al. 2019. Microscopy Res & Technique

Tuberculosis (TB) remains the leading cause of morbidity and mortality from infectious disease in developing countries. The sputum smear microscopy remains the primary diagnostic laboratory test. However, microscopic examination is always time‐consuming and tedious. Therefore, an effective computer‐aided image identification system is needed to provide timely assistance in diagnosis. The current identification system usually suffers from complex color variations of the images, resulting in plentiful of false object detection. To overcome the dilemma, we propose a two‐stage Mycobacterium tuberculosis identification system, consisting of candidate detection and classification using convolution neural networks (CNNs). The refined Faster region‐based CNN was used to distinguish candidates of M. tuberculosis and the actual ones were classified by utilizing CNN‐based classifier. We first compared three different CNNs, including ensemble CNN, single‐member CNN, and deep CNN. The experimental results showed that both ensemble and deep CNNs were on par with similar identification performance when analyzing more than 19,000 images. A much better recall value was achieved by using our proposed system in comparison with conventional pixel‐based support vector machine method for M. tuberculosis bacilli detection.
