2016
DOI: 10.1109/tpami.2015.2491929

HCP: A Flexible CNN Framework for Multi-Label Image Classification

Abstract: Convolutional Neural Network (CNN) has demonstrated promising performance in single-label image classification tasks. However, how CNN best copes with multi-label images still remains an open problem, mainly due to the complex underlying object layouts and insufficient multi-label training images. In this work, we propose a flexible deep CNN infrastructure, called Hypotheses-CNN-Pooling (HCP), where an arbitrary number of object segment hypotheses are taken as the inputs, then a shared CNN is connected with ea…

Citations: cited by 602 publications (322 citation statements)
References: 34 publications
“…Instead of highly utilize the logistic-regression loss or softmax [23] for CNN-based classification, the squared L2 loss function is used [24], [25], [26] and there are many number of methods are available for regression label in the image. There is one direct method is present to measure the total count of pixel of ILD pattern per disease to represents its level of disease [27].…”
Section: CNN Architecture (mentioning)
confidence: 99%
“…CNN-based methods follow the great success of Convolutional Neural Network in other vision tasks, [23], [24], [25], [26], especially semantic segmentation [27], [28], [29]. They leverage the powerful discrimination ability of Convolutional Neural Network (CNN) to extract visual features as inputs of other techniques to produce proposals or directly regress the coordinates of all the object bounding boxes in an image.…”
Section: Related Work (mentioning)
confidence: 99%
“…Its efficiency and high detection rates make BING a good choice in a large number of successful applications that require category independent object proposals [23][24][25][26][27][28][29]. Recently, deep neural network based object proposal generation methods have become very popular due to their high recall and computational efficiency, e.g., RPN [30], YOLO900 [31], and SSD [32].…”
Section: Introduction (mentioning)
confidence: 99%
“…Its poor generalization ability has restricted its usage, so RPN is usually only used in object detection. In comparison, BING is based on low-level cues concerning enclosing boundaries and thus can produce category independent object proposals, which has demonstrated applications in multi-label image classification [23], semantic segmentation [25], video classification [24], co-salient object detection [29], deep multi-instance learning [26], and video summarisation [27]. However, several researchers [34][35][36][37] have noted that BING's proposal localization is weak.…”
Section: Introduction (mentioning)
confidence: 99%