ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
DOI: 10.1109/icassp.2019.8683208

DOD-CNN: Doubly-injecting Object Information for Event Recognition

Abstract: Recognizing an event in an image can be enhanced by detecting relevant objects in two ways: 1) indirectly utilizing object detection information within the unified architecture or 2) directly making use of the object detection output results. We introduce a novel approach, referred to as Doubly-injected Object Detection CNN (DOD-CNN), exploiting the object information in both ways for the task of event recognition. The structure of this network is inspired by the Integrated Object Detection CNN (IOD-CNN) where…

Citations: cited by 7 publications (4 citation statements)
References: 19 publications
“…Since the Salinas dataset (S) or the Pavia University (PU) has much more data, it requires more iterations than the others. To cope with this issue, we adopt a two-step optimization strategy introduced in [49], [50], [58]. Under this scheme, the model is initially trained on only the largest dataset (Step I (S or PU)) and then is updated using the whole dataset (Step II).…”
Section: Optimization (citation type: mentioning)
confidence: 99%
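
The two-step scheme quoted above can be sketched as a small planning helper. This is only an illustration under stated assumptions: the function name `two_step_plan` and the sample counts in the usage note are hypothetical, and "the whole dataset" is read here as all datasets taken together. The citing paper's actual training code is not shown in this report.

```python
from typing import Dict, List, Tuple


def two_step_plan(sample_counts: Dict[str, int]) -> List[Tuple[str, List[str]]]:
    """Return which datasets are used in each optimization step.

    Step I uses only the dataset with the most samples; Step II uses
    every dataset. Names and counts are supplied by the caller.
    """
    # Pick the dataset with the largest sample count for the first step.
    largest = max(sample_counts, key=sample_counts.get)
    return [
        ("Step I", [largest]),
        ("Step II", list(sample_counts)),
    ]
```

For example, with placeholder counts `{"A": 100, "S": 5000, "B": 300}`, Step I trains on `S` alone and Step II updates the model on `A`, `S`, and `B`.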
“…Two streams are integrated into both intermediate and last layers. Eum et al [21], Lee et al [22], Dai et al [23], Lee et al [24], and Lee et al [25] integrate different machine learning tasks such as object detection, event recognition, and semantic segmentation in a unified convolutional neural network architecture. However, in general, it is not feasible to model the mutual dependencies by fusing multiple approaches built on different principles of extracting or processing attributes or features that represent input data.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…Architecture DOD-CNN. DOD-CNN [14] consists of five shared convolutional layers ($C_1, \dots, C_5$), one RoI pooling layer, and three separate modules, each responsible for event recognition, rigid object detection, and non-rigid object detection, respectively. Each module consists of two convolutional layers ($C_6$, $C_7$), one average pooling layer (AVG), and one fully connected layer (FC), where the output dimension of the last layer is set to match the number of events or objects.…”
Section: S-dod-cnn (citation type: mentioning)
confidence: 99%
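
The layer layout described in the statement above can be written down as a plain data structure. This is a minimal sketch of the layout only: the class name `DodCnnLayout`, the task keys, and the class counts in the usage note are hypothetical, and no kernel sizes, channel widths, or object-information injection paths are modeled because the quoted text does not give them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def build_module(num_outputs: int) -> List[str]:
    """Layer names for one task module: C6, C7, AVG, then an FC layer
    whose output dimension matches the number of classes for that task."""
    return ["C6", "C7", "AVG", f"FC({num_outputs})"]


@dataclass
class DodCnnLayout:
    """Layer names only; carries no weights and runs no computation."""

    num_events: int
    num_rigid_objects: int
    num_nonrigid_objects: int
    trunk: List[str] = field(init=False)
    modules: Dict[str, List[str]] = field(init=False)

    def __post_init__(self) -> None:
        # Five shared convolutional layers followed by one RoI pooling layer.
        self.trunk = [f"C{i}" for i in range(1, 6)] + ["RoIPool"]
        # Three separate modules, one per task.
        self.modules = {
            "event_recognition": build_module(self.num_events),
            "rigid_object_detection": build_module(self.num_rigid_objects),
            "nonrigid_object_detection": build_module(self.num_nonrigid_objects),
        }

    def path(self, task: str) -> List[str]:
        """Layers an input passes through for the given task."""
        return self.trunk + self.modules[task]
```

With placeholder class counts, `DodCnnLayout(8, 10, 5).path("event_recognition")` lists the five shared layers, the RoI pooling layer, and then that module's `C6`, `C7`, `AVG`, and `FC(8)`.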