IOD-CNN: Integrating object detection networks for event recognition

Eum, Sungmin; Lee, Hyungtae; Kwon, Heesung; Doermann, David

doi:10.1109/icip.2017.8296406

Cited by 15 publications

(13 citation statements)

References 15 publications

“…The next three rows show the classification accuracy based on a different baseline classifier ('Event CNN+'). This baseline also does not exploit any keyword information and is reported [5] to have used additional treatments such as an ROI pooling and a different training scheme. IOD-CNN [5] which embeds the keyword-driven object information by early-fusion outperforms its baseline ('Event CNN+') by 3.4 AP.…”

Section: Methods (mentioning)

confidence: 99%

“…These "machine-driven" attention maps show that they clearly share high relevance with the "human-driven" semantic keywords although the classifier is not supported with any additional semantic information in the learning process. Lastly, we carry out a study to verify the practicality of explicitly incorporating these semantic keywords using various fusion approaches, which include a novel CNN-based architecture (IOD-CNN) developed by part of the authors [5]. We show that these keyword-driven information is effective in helping out the event classification task regardless of whether the information is used in an early-or late-fusion scheme.…”

Section: Introduction (mentioning)

confidence: 98%

Exploitation of Semantic Keywords for Malicious Event Classification

Lee, Eum, Levis, et al. 2018

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Self Cite

Learning an event classifier is challenging when the scenes are semantically different but visually similar. However, as humans, we typically handle such tasks painlessly by adding our background semantic knowledge. Motivated by this observation, we aim to provide an empirical study about how additional information such as semantic keywords can boost up the discrimination of such events. To demonstrate the validity of this study, we first construct a novel Malicious Crowd Dataset containing crowd images with two events, benign and malicious, which look visually similar. Note that the primary focus of this paper is not to provide the state-of-the-art performance on this dataset but to show the beneficial aspects of using semantically-driven keyword information. By leveraging crowd-sourcing platforms, such as Amazon Mechanical Turk, we collect semantic keywords associated with images and then subsequently identify a subset of keywords (e.g. police, fire, etc.) unique to specific events. We first show that by using recently introduced attention models, a naïve CNN-based event classifier actually learns to primarily focus on local attributes associated with the discriminant semantic keywords identified by the Turks. We further show that incorporating the keyword-driven information into early- and late-fusion approaches can significantly enhance malicious event classification.

“…Convolutional Neural Networks (CNN) have a high performance in computer vision and pattern recognition. Many approaches implemented them to tackle problems such as object detection [29,30], identifying actions in images [31], and text recognition [32]. Ji et al [33] proposed a three-dimensional convolution on a CNN (3DCNN) architecture to analyze video data.…”

Section: Related Work (mentioning)

confidence: 99%

Detecting Suspicious Behavior: How to Deal with Visual Similarity through Neural Networks

Martínez-Mascorro, Ortiz-Bayliss, Terashima-Marín 2020

Preprint

Suspicious behavior is likely to threaten security, assets, life, or freedom. This behavior has no particular pattern, which complicates the tasks to detect it and define it. Even for human observers, it is complex to spot suspicious behavior in surveillance videos. Some proposals to tackle abnormal and suspicious behavior-related problems are available in the literature. However, they usually suffer from high false-positive rates due to different classes with high visual similarity. The Pre-Crime Behavior method removes information related to a crime commission to focus on suspicious behavior before the crime happens. The resulting samples from different types of crime have a high-visual similarity with normal-behavior samples. To address this problem, we implemented 3D Convolutional Neural Networks and trained them under different approaches. Also, we tested different values in the number-of-filter parameter to optimize computational resources. Finally, the comparison between the performance using different training approaches shows the best option to improve the suspicious behavior detection on surveillance videos.

“…Since the Pavia Center dataset has much more data than the others, it requires more iterations than the others. To cope with this issue, we adopt a two-step optimization strategy introduced in [7,8]. Under this scheme, the network is initially trained on only the largest dataset (Step I (PC)) and then is updated using the whole dataset for multi-task learning (Step II).…”

Section: Settings (mentioning)

confidence: 99%

Is Pretraining Necessary for hyperspectral image classification?

Lee, Eum, Kwon 2019

IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium

Self Cite

We address two questions for training a convolutional neural network (CNN) for hyperspectral image classification: i) is it possible to build a pre-trained network? and ii) is the pretraining effective in furthering the performance? To answer the first question, we have devised an approach that pre-trains a network on multiple source datasets that differ in their hyperspectral characteristics and fine-tunes on a target dataset. This approach effectively resolves the architectural issue that arises when transferring meaningful information between the source and the target networks. To answer the second question, we carried out several ablation experiments. Based on the experimental results, a network trained from scratch performs as well as a network fine-tuned from a pre-trained network. However, we observed that pre-training the network has its own advantage in achieving better performances when deeper networks are required.
