DOD-CNN: Doubly-injecting Object Information for Event Recognition

Lee, Hyungtae; Eum, Sungmin; Kwon, Heesung

doi:10.1109/icassp.2019.8683208

Cited by 7 publications

(4 citation statements)

References 19 publications


“…Since the Salinas dataset (S) or the Pavia University (PU) has much more data, it requires more iterations than the others. To cope with this issue, we adopt a two-step optimization strategy introduced in [49], [50], [58]. Under this scheme, the model is initially trained on only the largest dataset (Step I (S or PU)) and then is updated using the whole dataset (Step II).…”

Section: Optimization (mentioning)

confidence: 99%

Exploring Cross-Domain Pretrained Model for Hyperspectral Image Classification

Lee, Eum, Kwon

2022

Preprint

Self Cite


A pretrain-finetune strategy is widely used to reduce the overfitting that can occur when data is insufficient for CNN training. The first few layers of a CNN pretrained on a large-scale RGB dataset are capable of acquiring general image characteristics which are remarkably effective in tasks targeting different RGB datasets. However, in the hyperspectral domain, where each domain has its unique spectral properties, the pretrain-finetune strategy can no longer be deployed in the conventional way and presents three major issues: 1) inconsistent spectral characteristics among the domains (e.g., frequency range), 2) inconsistent number of data channels among the domains, and 3) absence of a large-scale hyperspectral dataset. We seek to train a universal cross-domain model which can later be deployed for various spectral domains. To achieve this, we physically furnish multiple inlets to the model while having a universal portion which is designed to handle the inconsistent spectral characteristics among different domains. Note that only the universal portion is used in the finetune process. This approach naturally enables the learning of our model on multiple domains simultaneously, which acts as an effective workaround for the absence of a large-scale dataset. We have carried out a study to extensively compare models that were trained using the cross-domain approach with ones trained from scratch. Our approach was found to be superior both in accuracy and in training efficiency. In addition, we have verified that our approach effectively reduces the overfitting issue, enabling us to deepen the model up to 13 layers (from 9) without compromising the accuracy.
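The multi-inlet idea in this abstract can be illustrated with a small sketch. It is not the authors' model: plain linear maps stand in for the CNN layers, and the domain names, band counts, and layer widths are all hypothetical. It only shows how one inlet per domain can absorb a different number of spectral channels so that a single universal portion sees a fixed-width input.

```python
import random

random.seed(0)

# Hypothetical sizes; the abstract does not specify any of these.
SHARED_WIDTH = 8      # width every inlet projects to
NUM_FEATURES = 4      # output width of the universal portion
BANDS = {"domain_a": 200, "domain_b": 100}  # channels per spectral domain


def random_matrix(rows, cols):
    """Small random weight matrix stored as a list of rows."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


# One inlet per domain, each sized for that domain's channel count.
inlets = {name: random_matrix(SHARED_WIDTH, n) for name, n in BANDS.items()}
# A single universal portion shared by all domains.
universal = random_matrix(NUM_FEATURES, SHARED_WIDTH)


def forward(domain, spectrum):
    """Route a pixel spectrum through its domain inlet, then the shared part."""
    hidden = relu(linear(inlets[domain], spectrum))
    return linear(universal, hidden)


out_a = forward("domain_a", [0.5] * BANDS["domain_a"])
out_b = forward("domain_b", [0.5] * BANDS["domain_b"])
```

Both calls return vectors of the same length even though the inputs differ in channel count, which is what would allow only the `universal` weights to be carried over to finetuning on a new domain.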



“…Two streams are integrated into both intermediate and last layers. Eum et al [21], Lee et al [22], Dai et al [23], Lee et al [24], and Lee et al [25] integrate different machine learning tasks such as object detection, event recognition, and semantic segmentation in a unified convolutional neural network architecture. However, in general, it is not feasible to model the mutual dependencies by fusing multiple approaches built on different principles of extracting or processing attributes or features that represent input data.…”

Section: Related Work (mentioning)

confidence: 99%

DBF: Dynamic Belief Fusion for Combining Multiple Object Detectors

Lee, Kwon

2022

Preprint

Self Cite


In this paper, we propose a novel and highly practical score-level fusion approach called dynamic belief fusion (DBF) that directly integrates inference scores of individual detections from multiple object detection methods. To effectively integrate the individual outputs of multiple detectors, the level of ambiguity in each detection score is estimated using a confidence model built on a precision-recall relationship of the corresponding detector. For each detector output, DBF then calculates the probabilities of three hypotheses (target, non-target, and intermediate state (target or non-target)) based on the confidence level of the detection score conditioned on the prior confidence model of individual detectors, which is referred to as basic probability assignment. The probability distributions over the three hypotheses of all the detectors are optimally fused via Dempster's combination rule. Experiments on the ARL, PASCAL VOC 07, and PASCAL VOC 12 datasets show that the detection accuracy of DBF is significantly higher than that of any of the baseline fusion approaches as well as the individual detectors used for the fusion.
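As a rough illustration of the fusion step named in this abstract, the sketch below applies Dempster's combination rule to basic probability assignments over the three hypotheses (target, non-target, and the intermediate "either" state). The mass values are made up for the example; the paper derives them from each detector's precision-recall behavior, which is not reproduced here. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Mass:
    """Basic probability assignment for one detection; fields should sum to 1."""
    target: float
    non_target: float
    either: float  # intermediate state: target or non-target


def combine(a: Mass, b: Mass) -> Mass:
    """Dempster's rule of combination on a two-element frame."""
    # Conflict: mass the two sources place on contradictory hypotheses.
    conflict = a.target * b.non_target + a.non_target * b.target
    if conflict >= 1.0:
        raise ValueError("total conflict: the combination is undefined")
    norm = 1.0 - conflict
    return Mass(
        target=(a.target * b.target
                + a.target * b.either
                + a.either * b.target) / norm,
        non_target=(a.non_target * b.non_target
                    + a.non_target * b.either
                    + a.either * b.non_target) / norm,
        either=(a.either * b.either) / norm,
    )


def fuse(assignments):
    """Fold the rule over any number of detectors."""
    return reduce(combine, assignments)


# Illustrative (invented) assignments for two detectors on the same window.
detector_a = Mass(target=0.6, non_target=0.1, either=0.3)
detector_b = Mass(target=0.5, non_target=0.2, either=0.3)
fused = fuse([detector_a, detector_b])
```

With these inputs both detectors lean toward "target", so the fused assignment places more mass on `target` than either input does, and the mass left on the ambiguous `either` state shrinks.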


“…Architecture DOD-CNN. DOD-CNN [14] consists of five shared convolutional layers (C1, ..., C5), one RoI pooling layer, and three separate modules, each responsible for event recognition, rigid object detection, and non-rigid object detection, respectively. Each module consists of two convolutional layers (C6, C7), one average pooling layer (AVG), and one fully connected layer (FC), where the output dimension of the last layer is set to match the number of events or objects.…”

Section: S-DOD-CNN (mentioning)

confidence: 99%

S-DOD-CNN: Doubly Injecting Spatially-Preserved Object Information for Event Recognition

Lee, Eum, Kwon

2020

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Self Cite


We present a novel event recognition approach called Spatially-preserved Doubly-injected Object Detection CNN (S-DOD-CNN), which incorporates the spatially preserved object detection information in both a direct and an indirect way. Indirect injection is carried out by simply sharing the weights between the object detection modules and the event recognition module. Meanwhile, our novelty lies in the fact that we have preserved the spatial information for the direct injection. Once multiple regions-of-interest (RoIs) are acquired, their feature maps are computed and then projected onto a spatially-preserving combined feature map using one of the four RoI Projection approaches we present. In our architecture, combined feature maps are generated for object detection which are directly injected to the event recognition module. Our method provides the state-of-the-art accuracy for malicious event recognition.
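The DOD-CNN layout quoted in the citation statement for this entry (shared C1 to C5, RoI pooling, then three modules of C6, C7, AVG, FC) can be written down as a small structural sketch. It only mirrors the layer ordering as listed in that excerpt; it does not model the direct or indirect injection, and the class counts per module are hypothetical placeholders rather than values from the paper.

```python
from dataclasses import dataclass, field

# Shared trunk, assuming the order given in the excerpt.
SHARED = ["C1", "C2", "C3", "C4", "C5", "RoIPool"]


@dataclass
class Module:
    """One task-specific module: two conv layers, average pooling, one FC."""
    name: str
    num_outputs: int  # set to the number of events or objects for the task
    layers: list = field(init=False)

    def __post_init__(self):
        self.layers = ["C6", "C7", "AVG", f"FC({self.num_outputs})"]


# Hypothetical output sizes; the excerpt does not state them.
modules = [
    Module("event_recognition", 6),
    Module("rigid_object_detection", 10),
    Module("non_rigid_object_detection", 4),
]


def describe(module: Module) -> str:
    """Full layer path from the input to this module's output."""
    return " -> ".join(SHARED + module.layers)


paths = {m.name: describe(m) for m in modules}
```

Every entry in `paths` starts with the same shared trunk and differs only in the width of the final FC layer, which is the property the excerpt highlights.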

