2020 IEEE International Conference on Image Processing (ICIP)
DOI: 10.1109/icip40778.2020.9190864
DroneCaps: Recognition of Human Actions in Drone Videos Using Capsule Networks with Binary Volume Comparisons

Abstract: Please refer to the published version for the most recent bibliographic citation information. If a published version is known, the repository item page linked above will contain details on accessing it.

Cited by 7 publications (6 citation statements)
References 19 publications
“…Our approach outperforms the previously existing methods. It successfully achieves state-of-the-art results in all three datasets, namely 72.76% on the Okutama dataset, 92.56% on the MOD20 dataset, and 71.79% on the Drone Action dataset without pose-stream whereas 78.86% with pose-stream. For the Okutama dataset specifically, our Weighted Temporal Attention module with RoiAlign leads the network to focus on the key points where the action is currently taking place, and ignores the background noise, as opposed to the previously used 3D CNNs in [58]. It is also computationally less expensive, being a single-stream network as compared to the approaches used in the previous works for MOD20 and Okutama dataset evaluation.…”
Section: Discussion and Comparison
confidence: 91%
“…The state-of-the-art on the MOD20 dataset [3] uses a two-stream approach and depends on motion-CNN for its accuracy. The state-of-the-art on the Okutama dataset [58] uses features computed by 3D convolutional neural networks plus a new set of features computed by a binary volume comparison (BVC) layer. The BVC layer comprises three parts: a 3D-Conv layer with 12 non-trainable (i.e., fixed) filters, a non-linear function, and a set of learnable weights.…”
Section: Limitations
confidence: 99%
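As a rough illustration of the three-part BVC structure described in the statement above, here is a minimal NumPy sketch. The exact 12 fixed filters are not specified in this excerpt, so the choice below (voxel-difference filters along the 12 edge-neighbour directions of a 3×3×3 neighbourhood) and the ReLU non-linearity are illustrative assumptions, not the paper's actual design:

```python
import numpy as np

# 12 fixed comparison directions: the edge neighbours of a 3x3x3 cube
# (exactly two of the three offsets are non-zero). This is an assumed
# stand-in for the paper's 12 non-trainable filters.
OFFSETS = [(dt, dh, dw)
           for dt in (-1, 0, 1) for dh in (-1, 0, 1) for dw in (-1, 0, 1)
           if abs(dt) + abs(dh) + abs(dw) == 2]  # 12 offsets

def bvc_layer(volume, weights):
    """Sketch of a Binary Volume Comparison (BVC)-style layer.

    volume  : (T, H, W) float array, a grayscale video clip.
    weights : (12,) learnable per-channel scalars.
    Returns a (12, T, H, W) feature volume: one channel per fixed
    comparison filter, passed through ReLU and scaled by its weight.
    """
    T, H, W = volume.shape
    padded = np.pad(volume, 1, mode="edge")  # replicate borders
    feats = []
    for (dt, dh, dw), w in zip(OFFSETS, weights):
        shifted = padded[1 + dt:1 + dt + T,
                         1 + dh:1 + dh + H,
                         1 + dw:1 + dw + W]
        diff = volume - shifted                    # fixed comparison filter
        feats.append(w * np.maximum(diff, 0.0))    # non-linearity, then weight
    return np.stack(feats, axis=0)
```

In the described architecture these BVC features would then be concatenated with the 3D-CNN features before the capsule network; that fusion step is omitted here.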
“…It successfully achieves state-of-the-art results in all three datasets, namely 72.76% on the Okutama dataset, 92.56% on the MOD20 dataset, and 71.79% on the Drone Action dataset without pose-stream whereas 78.86% with pose-stream. For the Okutama dataset specifically, our Weighted Temporal Attention module with RoiAlign leads the network to focus on the keypoints where the action is currently taking place, and ignores the background noise, as opposed to the previously used 3D CNNs in [33]. It is also computationally less expensive, being a single stream network as compared to the approaches used in the previous works for MOD20 and Okutama dataset evaluation.…”
Section: Discussion and Comparison
confidence: 99%
“…The state-of-the-art on the MOD20 dataset [3] uses a two-stream approach and depends on motion-CNN for its accuracy. The state-of-the-art on the Okutama-Action dataset [33] uses features computed by 3D convolutional neural networks plus a new set of features computed by a Binary Volume Comparison (BVC) layer, which comprises three parts: a 3D-Conv layer with 12 non-trainable (i.e., fixed) filters, a non-linear function, and a set of learnable weights. Features from both streams, the 3D CNNs and the BVC layer, are concatenated and passed to a capsule network for final activity prediction.…”
Section: Related Work
confidence: 99%
“…Ever since, different routing algorithms and architectures for capsule networks have been proposed and have found applications in various tasks [2,4,11,18]. We refer to routing-based CapsNets as those models that employ a routing algorithm in the architecture of the network.…”
Section: Literature Review and Considerations on the Routing Algorithm
confidence: 99%
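The "routing algorithm" referred to in the statement above is, in the original CapsNet formulation of Sabour et al., dynamic routing-by-agreement. The NumPy sketch below shows that baseline procedure only, not the specific routing variants proposed in the cited works:

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def squash(s, axis=-1, eps=1e-8):
    """Shrink vectors so that their norm lies in [0, 1)."""
    n2 = (s ** 2).sum(axis=axis, keepdims=True)
    return (n2 / (1.0 + n2)) * s / np.sqrt(n2 + eps)

def dynamic_routing(u_hat, iters=3):
    """Routing-by-agreement between two capsule layers.

    u_hat : (num_in, num_out, dim) prediction vectors, i.e. each input
            capsule's vote for each output capsule's pose.
    Returns (num_out, dim) output capsule poses.
    """
    b = np.zeros(u_hat.shape[:2])                  # routing logits
    for _ in range(iters):
        c = softmax(b, axis=1)                     # coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)     # weighted vote sum
        v = squash(s)                              # output capsule poses
        b = b + (u_hat * v[None]).sum(axis=-1)     # reward agreeing votes
    return v
```

Capsules whose votes agree with the emerging output pose get larger coupling coefficients on the next iteration, which is the "agreement" mechanism the routing-based CapsNet literature builds on.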