2023
DOI: 10.1145/3519030

Double Attention Based on Graph Attention Network for Image Multi-Label Classification

Abstract: The task of image multi-label classification is to accurately recognize multiple objects in an input image. Most recent works need to leverage a label co-occurrence matrix counted from the training data to construct the graph structure, which is inflexible and may degrade model generalizability. In addition, these methods fail to capture the semantic correlation between channel feature maps, which could further improve model performance. To address these issues, we propose a Double…

Cited by 16 publications (7 citation statements)
References 54 publications
“…Compared to the remaining methods, our method still maintains the optimal performance on 10 categories. Notably, our method performs well in the categories of boats, bottles, sofas, and TVs, achieving a significant improvement of nearly 1% over COP [14], DER [15], CPCL [16], DA‐GAT [26], CANet [27], and IA‐GCN [28]. For the few categories where the prediction accuracy is not optimal, the difference between our method and the optimal value remains within a very small range.…”
Section: Results
confidence: 99%
“…Through extensive research [26-29], all of these methods try to strengthen the learning of category-related regional features. However, they neglect the possible impact of category scale diversity and always perform category label identification and supervision at feature layers of the same scale.…”
Section: Spatial Location Information Utilization
confidence: 99%
“…The GAT model has higher computational efficiency and has been widely applied in various fields, such as social network analysis, image processing, and natural language processing [13]. Recently, GAT has also been used in code defect detection, software security analysis, and code similarity detection [27]. Therefore, we apply the GAT model to explore effective features and predict the location of faults.…”
Section: GAT
confidence: 99%
“…GAT is based on the transformer model and introduces a masked self-attention mechanism that assigns different weights to the representation of each node according to the features of its neighboring nodes [12, 13]. To address the limitations of traditional SBFL and DLFL techniques, we propose a fault localization approach based on the WEGAT. It abstracts the coverage information of test cases and program elements into an execution graph and analyzes the information in the graph using a GAT.…”
Section: Introduction
confidence: 99%
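The masked self-attention that the excerpt above describes can be sketched compactly. The following is a minimal, illustrative single-head GAT-style layer in numpy — it is not the WEGAT or the paper's implementation, and all names, shapes, and the LeakyReLU slope are assumptions. The "mask" is the adjacency matrix: attention logits for non-neighbors are driven to negative infinity so the softmax assigns them (near-)zero weight.

```python
import numpy as np

def gat_layer(h, W, a, adj):
    """Illustrative single-head GAT-style layer (masked self-attention).
    h: (N, F) node features, W: (F, F2) projection,
    a: (2*F2,) attention vector, adj: (N, N) 0/1 adjacency with self-loops."""
    z = h @ W                                    # project node features
    N = z.shape[0]
    e = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            # e_ij = LeakyReLU(a^T [z_i || z_j])
            e[i, j] = np.concatenate([z[i], z[j]]) @ a
    e = np.where(e > 0, e, 0.2 * e)              # LeakyReLU, slope 0.2
    e = np.where(adj > 0, e, -1e9)               # mask: only neighbors attend
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)    # row-wise softmax weights
    return alpha @ z, alpha                      # aggregated features, weights
```

Each node's output is a weighted average of its neighbors' projected features, with the weights learned per edge — this is the mechanism that lets a GAT assign different importance to each neighbor.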
“…Channel attention has shown significant advantages in many image processing tasks. For example, in image classification tasks, channel attention can help the network better distinguish feature differences between different categories [13, 14, 15, 16, 17]. In target detection tasks, channel attention improves the network’s ability to accurately locate and recognize targets [18, 19, 20, 21, 22].…”
Section: Introduction
confidence: 99%
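To make the channel-attention idea in the excerpt above concrete, here is a minimal numpy sketch of a Squeeze-and-Excitation-style gate. It is purely illustrative — the weight shapes and the reduction ratio are assumptions, not any cited model: each channel is squeezed to a scalar by global average pooling, a small two-layer network turns those scalars into per-channel gates in (0, 1), and the feature map is reweighted channel-wise.

```python
import numpy as np

def se_channel_attention(x, w1, w2):
    """Squeeze-and-Excitation style channel attention (illustrative sketch).
    x: (C, H, W) feature map; w1: (C, C//r), w2: (C//r, C) FC weights,
    where r is an assumed channel-reduction ratio."""
    s = x.mean(axis=(1, 2))              # squeeze: global average pool -> (C,)
    z = np.maximum(s @ w1, 0.0)          # excitation: bottleneck FC + ReLU
    g = 1.0 / (1.0 + np.exp(-(z @ w2)))  # FC + sigmoid -> per-channel gates
    return x * g[:, None, None], g       # reweight each channel by its gate
```

The gate vector is what lets the network emphasize category-discriminative channels and suppress uninformative ones, which is the behavior the cited classification and detection works exploit.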